Slashdot Mirror


Ask Slashdot: Best Language To Learn For Scientific Computing?

New submitter longhunt writes "I just started my second year of grad school and I am working on a project that involves a computationally intensive data mining problem. I initially coded all of my routines in VBA because it 'was there'. They work, but run way too slow. I need to port to a faster language. I have acquired an older Xeon-based server and would like to be able to make use of all four CPU cores. I can load it with either Windows (XP) or Linux and am relatively comfortable with both. I did a fair amount of C and Octave programming as an undergrad. I also messed around with Fortran77 and several flavors of BASIC. Unfortunately, I haven't done ANY programming in about 12 years, so it would almost be like starting from scratch. I need a language I can pick up in a few weeks so I can get back to my research. I am not a CS major, so I care more about the answer than the code itself. What language suggestions or tips can you give me?"

465 comments

  1. Python by curunir · · Score: 4, Insightful

    I have a friend who works for a company that does gene sequencing and other genetic research and, from what he's told me, the whole industry uses mostly python. You probably don't have the hardware resources that they do, but I'd bet you also don't have data sets that are nearly as large as theirs are.

    You might also get better results from something less general purpose like Julia, which is designed for number crunching.

    --
    "Don't blame me, I voted for Kodos!"
    1. Re:Python by the+gnat · · Score: 4, Insightful

      the whole industry uses mostly python

      This is certainly the way of the future, not just for gene sequencing but many other quantitative sciences, although a complete answer would be Python and C++, because numpy/scipy can't do everything and Python is still very slow for number-crunching. It's best to start with just Python, but eventually some C++ knowledge will be helpful. (Or just plain C, but I can't see any good reason to inflict that on myself or anyone else.)

    2. Re:Python by Anonymous Coward · · Score: 4, Insightful

      Python is the new VB.

    3. Re:Python by Garridan · · Score: 5, Informative

      I use Sage. When Python isn't fast enough, I can essentially write in C with Cython. It's gloriously easy. Have some trivially parallelizable data mining? Just use the @parallel decorator. Sage comes with a slew of fast mathematical packages, so your toolbox is massive, and you can hook it all in to your Cython code with minimal overhead.

    4. Re:Python by Anonymous Coward · · Score: 0

      Absolutely Python.

      NumPy takes care of the array buffer and fast basic math, SciPy has many scientific extensions, Matplotlib provides data visualization, IPython is your notebook of choice, Scikit-* provide faster moving toolkits that further extend SciPy to the cutting edge, and SymPy has symbolic math.

      In aggregate these are known as the SciPy Stack.

      Utilizing multiple cores in Python can be accomplished with JIT compilers (Numba / NumbaPro are the best developed presently), Cython with `nogil` and use of OpenMP, or subprocess management. Many libraries which need this type of behavior already offer it (like Scikit-Learn for machine learning), and iPython has inbuilt cluster management tools.

    5. Re:Python by Anonymous Coward · · Score: 0

      Python is popular.
      R is popular.
      Perl is still in wide use.

      All our new stuff is being written in Go, Julia, Scala, and *gasp* Javascript.

    6. Re:Python by shutdown+-p+now · · Score: 5, Interesting

      a complete answer would be Python and C++, because numpy/scipy can't do everything and Python is still very slow for number-crunching.

      The problem with using the mix (when you actually write the C++ code yourself) is that debugging it is a major pain in the ass - you either attach two debuggers and simulate stepping across the boundary by manually setting breakpoints, or you give up and resort to printf debugging.

      OTOH, if Windows is an option, PTVS is a Python IDE that can debug Python and C++ code side by side, with cross-boundary stepping etc. It can also do Python/Fortran debugging with a Fortran implementation that integrates into VS (e.g. the Intel one).

      (full disclosure: I am a developer on the PTVS team who implemented this particular feature)

    7. Re:Python by shutdown+-p+now · · Score: 5, Insightful

      Python is VB done right.

    8. Re:Python by Anonymous Coward · · Score: 0

      Python is a nice language for "quick pick up".
      If you want to take advantage of multiple cores you can launch multiple instances of your program or you can use the multiprocessing module.
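
      A minimal sketch of the multiprocessing approach mentioned above, assuming the work splits into independent per-record computations. The names `score_record` and `score_all` are made up for illustration, and the worker count of 4 simply matches the four cores in the question.

```python
import math
from multiprocessing import Pool


def score_record(x):
    """Stand-in for an expensive, independent per-record computation."""
    return sum(math.sqrt(x + k) for k in range(1000))


def score_all(records, workers=4):
    """Score every record, spreading the work over `workers` processes."""
    with Pool(processes=workers) as pool:
        # chunksize batches records so less time is spent passing data around
        return pool.map(score_record, records, chunksize=64)


if __name__ == "__main__":
    # The guard matters: worker processes may re-import this file on start-up.
    data = list(range(2000))
    results = score_all(data, workers=4)
    print(len(results), results[0])
```

      Separate processes are used here rather than threads because, in standard CPython builds, threads running pure-Python number crunching tend not to run on several cores at once.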

    9. Re:Python by shutdown+-p+now · · Score: 1

      You can use Cython for heavy lifting without dropping all the way down to C.

    10. Re:Python by SJHillman · · Score: 5, Funny

      VB is feeding your scrotum to a python.

    11. Re:Python by rwa2 · · Score: 4, Informative

      Yes, I did my master's thesis using simpy / scipy, integrated with lp_solve for the number crunching, all of which was a breeze to learn and use. It was amazing banging out a new recursive algorithm crawling a new object structure and just having it work the first time without spending several precious cycles bugfixing syntax errors and chasing down obscure stack overflows.

      I used the psyco JIT compiler (unfortunately 32-bit only) to get ~100x boost in runtime performance (all from a single import statement, woo), which was fast enough for me... these days I think you can get similar boosts from running on PyPy. Of course, if you're doing more serious number crunching, python makes it easy to rewrite your performance-critical modules in C/C++.

      I also ended up making a LiveCD and/or VM of my thesis, which was a good way of wrapping up the software environment and dependencies, which could quickly grow outdated in a few short years.

    12. Re:Python by dmbasso · · Score: 3, Insightful

      The problem with using the mix (when you actually write the C++ code yourself) is that debugging it is a major pain in the ass

      Only if you don't use the C/C++ code as an independent module, as it should be. If you *must* debug it in parallel, you're designing it wrong.

      --
      `echo $[0x853204FA81]|tr 0-9 ionbsdeaml`@gmail.com
    13. Re:Python by Anonymous Coward · · Score: 3, Insightful

      VB is closed-source trash.

    14. Re:Python by polyphemus · · Score: 1

      +1

      I was using Mathematica in grad school (experimental physics). Great for simple number crunching, but awful for doing anything programmatically interesting, and annoyingly expensive.

      I'm now using Python and loving it.

    15. Re:Python by shutdown+-p+now · · Score: 2

      How do you write C++ code for use from Python such that it's not an independent module?

      Anyway, regardless of how you architecture it, in the end you'll have Python script feeding data to your C++ code. If something goes wrong, you might want to debug said C++ code specifically as it is called from Python (i.e. with that data). Even if you don't ever have to cross the boundary between languages during debugging, there are still benefits to be had from a debugger with more integrated support - for example, it can show Python representations of objects that were passed to your C++ code.

    16. Re:Python by ebno-10db · · Score: 5, Insightful

      Perl is still in wide use.

      Do not use Perl for this. I've been using Perl for 15-20 years, and I love it for "scripting", text processing, etc., but using it for scientific computing sounds like an exercise in masochism.

    17. Re:Python by alexgieg · · Score: 1

      I was using Mathematica in grad school (experimental physics).

      That wasn't the right tool for the task. Mathematica is for symbolic math, not number crunching.

      --
      Conservatism: (n.) love of the existing evils. Liberalism: (n.) desire to substitute new evils for the existing ones.
    18. Re:Python by Monkey-Man2000 · · Score: 1

      a complete answer would be Python and C++, because numpy/scipy can't do everything and Python is still very slow for number-crunching.

      That's why more people should check out Cython. It's basically its own language modeled closely after Python but uses C-like language constructions when optimizations are helpful. I quite liked it when I checked it out, but I haven't been following it much recently.

      --
      This post was generated by a Cadre of Uber Monkeys for Monkey-Man2000 (603495).
    19. Re:Python by mwvdlee · · Score: 1

      Python is suicide inducement done right?

      --
      Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
    20. Re:Python by khellendros1984 · · Score: 1

      I wrote an AES implementation in Perl once. It was horrifically sluggish, and I don't recommend it.

      --
      It is pitch black. You are likely to be eaten by a grue.
    21. Re:Python by Anonymous Coward · · Score: 0

      Don't know what field the parent poster is in, but there are plenty of fields of experimental physics where Mathematica is the right tool (although maybe not the only right tool). This is either because there are existing modules that do exactly what you want or because you are working on things that don't heavily rely on pure floating point operation throughput. Modeling that uses some symbolic optimization followed by some numeric works well, where you are not doing enough numerics to be slowed down much by Mathematica sucking at that, but symbolic and adaptive parts save you a lot of effort of writing numeric solvers and integrators for certain cases.

    22. Re:Python by dmbasso · · Score: 1

      How do you write C++ code for use from Python such that it's not an independent module?

      I mentioned C/C++ because of the context, but I was actually referring to the logical aspect of the system. You can (and I know people who do) make spaghetti-code even with, for instance, RPC calls.

      Anyway, regardless of how you architecture it, in the end you'll have Python script feeding data to your C++ code.

      That's the point: you should not depend on that. The C++ code must be thoroughly tested independently of the Python code. You know, the whole high cohesion / low coupling thing.

      If something goes wrong, you might want to debug said C++ code specifically as it is called from Python (i.e. with that data).

      Then you should use that recorded data in your C++ unittests.

      Even if you don't ever have to cross the boundary between languages during debugging, there are still benefits to be had from a debugger with more integrated support - for example, it can show Python representations of objects that were passed to your C++ code.

      Even if your debugger did not have that feature, it would still be useful. My point is that a good developer should need your debugger only in special cases (e.g. when you don't have access to all the code).

      --
      `echo $[0x853204FA81]|tr 0-9 ionbsdeaml`@gmail.com
    23. Re:Python by wanax · · Score: 3, Informative

      Sage is okay for small-midsize projects, as is R (both benefit from being free).. on the whole though, I'd really recommend Mathematica, which is purpose-built for that type of project, makes it trivial to parallelize code, is a functional language (once you learn, I doubt you'll want to go back) and scales well up to fairly large data sets (10s of gigs).

    24. Re:Python by RDW · · Score: 4, Informative

      I have a friend who works for a company that does gene sequencing and other genetic research and, from what he's told me, the whole industry uses mostly python.

      I think your friend is mistaken. Though it's essential to know a scripting language, most of the computationally expensive stuff in sequence analysis is done with code written in, as you might expect, C, C++, or Java. Perl and Python are used more for glue code, building analysis pipelines, and processing the output of the heavy duty tools for various downstream applications. R is used heavily for statistics, and especially for anything involving microarrays.

    25. Re:Python by TopherC · · Score: 1

      For numeric code, I find that stepwise debugging is rarely helpful and never necessary. Print statements are my primary tool for spot-checking numbers, data structures, and even for evaluating the general flow of the program. The next tool is to create histograms and other plots of the data you're getting at various stages or calculations. By varying the inputs and seeing the effects on plots (in vague terms), you have a very powerful and underrated diagnostic. The more work you put into analyzing your program's data, the better off you'll be.
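
      The histogram idea above can be sketched with nothing but the standard library. This is only an illustration: `text_histogram` is a made-up helper, it assumes a non-empty list of finite numbers, and a plotting library would normally do the job better.

```python
import random
from collections import Counter


def text_histogram(values, bins=10, width=40):
    """Return text lines showing how `values` are distributed."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero for constant data
    counts = Counter(
        min(int((v - lo) / span * bins), bins - 1)  # put the maximum in the last bin
        for v in values
    )
    peak = max(counts.values())
    lines = []
    for b in range(bins):
        left = lo + span * b / bins  # left edge of this bin
        n = counts.get(b, 0)
        bar = "#" * round(width * n / peak)
        lines.append(f"{left:10.3f} | {bar} {n}")
    return lines


if __name__ == "__main__":
    random.seed(0)
    sample = [random.gauss(0.0, 1.0) for _ in range(5000)]
    print("\n".join(text_histogram(sample, bins=15)))
```

      Printing one of these after each stage of a calculation makes it easy to spot a stage whose output suddenly changes shape when the inputs are varied.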

      I second the Python/C++ combination, and should add that you can do a lot with numpy and scipy so you may very well not need the C++ side of things.

      I've also written a lot in Fortran years ago, but only because I was working with a ton of legacy code. Compared to C and C++, Fortran is actually more elegant for pure numerical computing. I'm not actually recommending it, mind you, but I also found the main weaknesses of Fortran could be mitigated by writing wrappers in Perl to get data in and out. Java is not actually bad for numerical code either, and it's arguably easier to learn than C++. I guess I'm saying this to point out that it doesn't actually matter much what language you use. Visual Basic is a rare exception -- never use that for any reason. :-) And you have to be aware that high-level languages like Python are going to be slow if you have no consideration for what operations are time-consuming. Be particularly mindful of memory allocation / object creation.
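
      To illustrate the point about allocation, here is a small sketch that times two ways of building a running sum. Both function names are invented for the example; the first rebuilds the whole list on every step, the second appends in place.

```python
import timeit


def running_sum_realloc(xs):
    """Builds a brand-new list on every step, so copying grows as O(n^2)."""
    out = []
    total = 0.0
    for x in xs:
        total += x
        out = out + [total]  # allocates and copies the whole list each time
    return out


def running_sum_append(xs):
    """Appends in place, which is amortized O(n)."""
    out = []
    total = 0.0
    for x in xs:
        total += x
        out.append(total)
    return out


if __name__ == "__main__":
    data = [float(i) for i in range(5000)]
    for fn in (running_sum_realloc, running_sum_append):
        seconds = timeit.timeit(lambda: fn(data), number=5)
        print(f"{fn.__name__}: {seconds:.4f} s for 5 runs")
```

      The two functions return identical results, so the timing difference comes only from how memory is handled. Measuring like this before rewriting anything in a compiled language can show whether the rewrite is needed at all.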

    26. Re:Python by Bill_the_Engineer · · Score: 1

      Do not use Perl for this. I've been using Perl for 15-20 years, and I love it for "scripting", text processing, etc., but using it for scientific computing sounds like an exercise in masochism.

      I use Perl (admittedly not by choice) for scientific computing and it is used daily to process very large files of raw instrument data. You can use NYTProf and Inline::C to make necessary speed improvements and the CPAN library is extensive.

      --
      These comments are my own and do not necessarily reflect the views or opinions of my employer or colleagues...
    27. Re:Python by Joce640k · · Score: 0

      Python is VB done right.

      Sorry. Two wrongs don't make a right.

      --
      No sig today...
    28. Re:Python by shutdown+-p+now · · Score: 1

      I don't argue that unit tests and other forms of automated testing should be your first line of defense. I've yet to see a sufficiently large codebase with 100% coverage, though, and that's not even getting into the debate of what "coverage" really means. In practice, no matter how much coverage you have, you will run into issues eventually when running your code on real data in production, where it's all working together - and that's when you'll need to debug it to figure out what exactly is wrong and fix it (and then, of course, write a unit test to cover that case, for future regression testing).

      By the way, unit testing does not necessarily mean language isolation. For example, one of our users is a company that writes most of their code in Fortran, but uses Python to write unit tests for it - the tests themselves are shorter and clearer, Python has a good standard unit testing framework, and Python IDEs support them well, integrating them into the regular coding workflow.
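
      A minimal sketch of that workflow with the standard `unittest` module. The routine `rms` and the recorded input are hypothetical stand-ins; in practice the function under test could be a wrapper around compiled C++ or Fortran.

```python
import math
import unittest


def rms(values):
    """Routine under test; a pure-Python stand-in for a compiled routine."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Hypothetical input captured from a run that misbehaved.
RECORDED_INPUT = [3.0, -4.0, 0.0, 12.0]


class RmsRegressionTest(unittest.TestCase):
    def test_recorded_input(self):
        # 9 + 16 + 0 + 144 = 169; 169 / 4 = 42.25; sqrt(42.25) = 6.5
        self.assertAlmostEqual(rms(RECORDED_INPUT), 6.5)

    def test_empty_input_is_rejected(self):
        with self.assertRaises(ZeroDivisionError):
            rms([])


if __name__ == "__main__":
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(RmsRegressionTest)
    unittest.TextTestRunner(verbosity=2).run(suite)
```

      Once a production failure has been reduced to a recorded input like this, it stays in the suite as a regression test.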

    29. Re:Python by Jane+Q.+Public · · Score: 3, Interesting

      "This is certainly the way of the future, not just for gene sequencing but many other quantitative sciences, although a complete answer would be Python and C++, because numpy/scipy can't do everything and Python is still very slow for number-crunching."

      I mostly agree with your conclusion, but for somewhat different reasons. I don't believe Python is "the wave of the future", but rather I'd recommend it because it has been in use by the scientific community for far longer than other similar languages, like Ruby. Therefore, there will be more pre-built libraries for it that a programmer in the sciences can take advantage of.

      I also agree that some C should go along with it, for building those portions of the code that need to be high performance. I would choose C over C++ for performance reasons. If you need OO, that's what Python is for. If you need performance, that's what the C is for. C++ would sacrifice performance for features you already have in Python.

      If it were entirely up to me, however -- that is to say, if there weren't so much existing code for the taking out there already -- I'd choose Ruby over Python. But that's just a personal preference.

    30. Re:Python by Garridan · · Score: 2

      I've used Sage on a supercomputer, chugging through hundreds of gigs of data. Do you know what you're talking about, or are you just recommending the shiny thing that you paid lots of money for?

    31. Re:Python by Princeofcups · · Score: 1

      Perl is still in wide use.

      Do not use Perl for this. I've been using Perl for 15-20 years, and I love it for "scripting", text processing, etc., but using it for scientific computing sounds like an exercise in masochism.

      I have a friend who worked on his PhD in Physics with a combination of C and Perl. The C for the heavy compute intensive number crunching, and Perl for the final correlation and make-it-pretty part. That worked quite well for him, since he's a professor now.

      --
      The only thing worse than a Democrat is a Republican.
    32. Re:Python by Joce640k · · Score: 4, Insightful

      Compared to C and C++, Fortran is actually more elegant for pure numerical computing.

      Unsurprising - that's what Fortran was designed for...!

      --
      No sig today...
    33. Re:Python by shutdown+-p+now · · Score: 3, Insightful

      No, it's a simple language that is easy for beginners to learn. But, unlike VB, it is not horribly designed, and is useful even once you grow out of the beginner phase.

    34. Re:Python by Anonymous Coward · · Score: 0

      VB is feeding your scrotum to a python.

      Well played, you magnificent bastard. Well played.

    35. Re:Python by Anonymous Coward · · Score: 0

      "Just Plain C" is how you get gpgpu and sse/avx to work.

    36. Re:Python by Darinbob · · Score: 1

      It's impossible to have VB done right.

    37. Re:Python by Anonymous Coward · · Score: 0

      although a complete answer would be Python and C++, because numpy/scipy can't do everything and Python is still very slow for number-crunching.

      In experimental High Energy Physics (pretty much all of the CERN experiments), we write all the performance critical stuff in C++, and glue it with Python and shell scripts.

    38. Re:Python by columbus · · Score: 2

      There are a lot of good suggestions in this discussion so far.

      I have a few points to add.
      1) compiled language vs scripting language
      In general, any compiled language is going to run faster than any scripting language. But you will probably spend more time coding and debugging to get your analysis running with a compiled language. It is useful to think about how important performance is to you relative to the value of your own time. Are you going to be doing these data mining runs repeatedly? Is it worth spending ten times as many hours getting this thing up and running if by doing so, you can get it to run really fast? If so, then choose a compiled language. You're already familiar with C so that would be a natural choice. If, after consideration, you value your development time more than processing time, stick with a scripting language. You'll probably be able to stand up a working program much faster & you can look for other ways to squeeze out extra performance.

      2) Parallelism. Your initial question explicitly said you want to use all 4 cores on a Xeon, but I've only seen 1 response so far that addresses this issue. To get good performance out of multiple cores you may need to re-work your algorithms to split the problem into pieces and crunch them down in parallel. Is your problem one that is easily amenable to parallelization? If yes, then you probably want to start thinking about multi-thread or multi-process programming. If your program will never run on something bigger than 1 server, then you will probably be OK sticking with a single multi-threaded process. I don't have experience in this myself, but I've heard that writing your program in a functional language like Haskell will make it intrinsically easy to parallelize. If you ever think your program is going to run on something bigger than that Xeon server - let's say you're thinking of ramping up to a cluster, then I would suggest building it on top of MPI from the beginning. I've had good results getting something up and running on MPI quickly using a combination of python, NumPy, SciPy and mpi4py.
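
      As a rough sketch of what "re-working the algorithm into pieces" can look like, the example below computes a mean and variance from per-chunk partial results that are merged at the end. All function names are invented for illustration, the input is assumed non-empty, and `concurrent.futures` from the standard library stands in for whatever parallel layer (threads, processes, MPI) is eventually chosen.

```python
from concurrent.futures import ProcessPoolExecutor


def partial_stats(chunk):
    """Per-chunk work: count, sum, and sum of squares."""
    return len(chunk), sum(chunk), sum(x * x for x in chunk)


def combine(parts):
    """Merge per-chunk results into an overall mean and population variance."""
    n = sum(p[0] for p in parts)
    s = sum(p[1] for p in parts)
    ss = sum(p[2] for p in parts)
    mean = s / n
    return mean, ss / n - mean * mean


def split(data, pieces):
    """Cut `data` into at most `pieces` nearly equal contiguous chunks."""
    size = -(-len(data) // pieces)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_mean_var(data, workers=4):
    """Run partial_stats on each chunk in its own process, then merge."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return combine(list(pool.map(partial_stats, split(data, workers))))


if __name__ == "__main__":
    values = [float(i % 97) for i in range(1_000_000)]
    print(parallel_mean_var(values, workers=4))
```

      The key design step is that `partial_stats` needs no information from other chunks, so the pieces can run anywhere. The sum-of-squares formula is kept for brevity; it can lose precision when the mean is large relative to the spread.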

      Good Luck.

      --
      friends don't let friends teleport drunk
    39. Re:Python by Anonymous Coward · · Score: 0
    40. Re:Python by Anonymous Coward · · Score: 1

      Mathematica is useful for data processing if there is some already existing code that does exactly what you need, or you have a problem that could benefit from it handling arbitrary functions efficiently by working with them symbolically before doing number crunching. But after having been experienced at programming in Mathematica, I switched to Matlab for data analysis and had no desire to go back. More recently, I've been able to do nearly everything I did in Matlab with NumPy, minus frustrations with a license server.

    41. Re:Python by Anonymous Coward · · Score: 0

      I wrote an AES implementation in Perl once. It was horrifically sluggish, and I don't recommend it.

      You did it wrong. I don't recommend doing it wrong either. BTW, why did you write your own...did none of the existing Crypt::* libraries work?

    42. Re:Python by jdavidb · · Score: 1

      Debugger versus printf debugging is a false dichotomy. Better to have a nice logging system that is implemented in both languages. (But my coworkers think I'm irrational for favoring logging over debuggers, so probably no one will agree.)
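
      On the Python side, a logging setup of this kind might look like the sketch below, using the standard `logging` module. The logger name "mining" and the `normalize` function are placeholders; the C or C++ side would need its own logger writing in a compatible format.

```python
import logging

log = logging.getLogger("mining")


def normalize(values):
    """Scale values so they sum to 1, logging what a debugger would otherwise show."""
    total = sum(values)
    log.debug("normalize: n=%d total=%r", len(values), total)
    if total == 0:
        log.warning("normalize: total is zero, returning input unchanged")
        return list(values)
    return [v / total for v in values]


if __name__ == "__main__":
    # Configure once at program start; level and format are adjustable.
    logging.basicConfig(
        level=logging.DEBUG,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    print(normalize([1.0, 3.0]))
    print(normalize([0.0, 0.0]))
```

      Unlike scattered print statements, the debug lines can stay in the code and be silenced by raising the level to `logging.WARNING`.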

    43. Re:Python by shutdown+-p+now · · Score: 1

      It's not really a versus thing - printf debugging is still useful in conjunction with interactive debuggers, especially when there is some support for this combo (e.g. you get some kind of special "debug printf" that outputs to the dedicated debugger window, like Debug.Write in .NET).

    44. Re:Python by Anonymous Coward · · Score: 0

      So, the Python bigots got first post. For years this crowd has been saying that Python is the "way of the future" for X, like they said with Unix shell scripting and Web application frameworks. And they are still at it. Where computation is involved in scientific computing, the vast majority of university courses on science don't teach it with Python, they teach it in R, SAS, even SPSS and Minitab. And don't forget Matlab / Octave. These languages/applications are successful because they were built by people who needed something that made their own SCIENTIFIC work easier for them.

      So, if you want a language for scientific computing, learn one of these FIRST, and then if you are into Python go ahead and use that language and its libraries for your own purposes. But PLEASE, Pythoners, don't go spouting off in public forums about how it is inevitable that Python is going to just completely take over the world of scripting / programming languages (or other such domain) in the next 6 months, and in 2 years all university courses on Your Favorite Problem Domain That Needs Computing will be teaching it in Python.

      You've been saying stuff like this for years, and you don't ever learn. It makes you look stupid, and it makes your pet language look like the central dogma of a Jonestown cult. Please, grow up!

    45. Re:Python by occasional_dabbler · · Score: 1

      Erm, what's wrong with using good ol' FORTRAN with Python/Numpy/Scipy? Combined with the multiprocessing module I have coded up some very fast multi-core routines in very little time. f2py is almost stupidly easy to use, especially on Linux

      --
      "Our opponent is an alien starship packed with atomic bombs," I said. "we have a protractor"
    46. Re:Python by Anonymous Coward · · Score: 0

      Blerg.
      I know exactly what kind of code "the industry" produces.
      I remember a friend of mine worked on a project to speed up some dna matching software they had for cancer research.
      The goal was to speed up something that took 1 or 2 hours by implementing it in an FPGA.
      Simply by rewriting the code in the same language he already reduced it to minutes rather than hours.

    47. Re:Python by Just+Some+Guy · · Score: 5, Funny

      I wrote some Perl that looked like the output of AES once.

      --
      Dewey, what part of this looks like authorities should be involved?
    48. Re:Python by Mitchell314 · · Score: 1

      I've seen python, perl, and java bioinformatics software. Not much of C or C++. I've used matlab and it's pretty good for doing heavy number crunching and graphing. SAS OTOH is a PITA.

      --
      I read TFA and all I got was this lousy cookie
    49. Re:Python by shutdown+-p+now · · Score: 1

      There's absolutely nothing wrong with using Fortran, with Python or by itself. It's just that if you already know C or C++, it's probably "good enough" for the high-perf part that there isn't much point in learning yet another language.

    50. Re: Python by Anonymous Coward · · Score: 0

      Hive on amazon s2 timeshare rental utilizing almost ansi sql like mysql. Easy to learn and fast for big datasets with analytical needs.

    51. Re:Python by amacbride · · Score: 1

      True, though most aligners are written in C/C++; lots nowadays take advantage of CUDA.

    52. Re:Python by Anonymous Coward · · Score: 0

      ...without spending several precious cycles bugfixing syntax errors and chasing down obscure stack overflows.

      So you are saying that you suck as a programmer?

    53. Re:Python by rahvin112 · · Score: 1

      Some of us like masochism, now give me my whip back.

    54. Re:Python by joetainment · · Score: 1

      >> Python and C++, because numpy/scipy can't do everything

      Yes, definitely true, and it's actually pretty easy to use them together.

      If you don't want to write C++ however, there are a couple other options:

      Cython - basically lets you generate c/c++ by writing Python like code and is very easy to use interacting with Python. It keeps the Cython parts of your code super fast, like straight up C.
      http://cython.org/

      Pypy - a super fast version of Python. If you write Python code yourself, and don't use off the shelf Python stuff, Pypy is crazy fast. (About C speed in my own tests of doing C like things.) Pypy gets slower if you use a lot of other Python code that wasn't written with Pypy in mind, but even then it's still normally much faster than regular Python. Using Pypy, you might just be able to write all the code in it and not have to bother with anything else.
      http://pypy.org/

      Both of these are easy enough that you can be up and running, writing/using new code, same day as downloading.

      Finally, even if you are calling other code from C/C++, there are some new tools to make that easier. CFFI is a good example. It makes calling C/C++ pretty easy. I'm not sure how ready it is for a lot of real world use though.
      http://cffi.readthedocs.org/en/release-0.7/
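
      For a feel of what calling C from Python involves, here is a small sketch using `ctypes` from the standard library rather than CFFI, so that it runs without installing anything. It assumes a POSIX system where the C math library can be found, and `c_cos` is just an illustrative wrapper name.

```python
import ctypes
import ctypes.util
import math

# Locate the C math library; fall back to symbols already loaded in the process.
_path = ctypes.util.find_library("m")
libm = ctypes.CDLL(_path) if _path else ctypes.CDLL(None)

# Declare the C signature: double cos(double)
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double


def c_cos(x):
    """Call the C library's cos() directly."""
    return libm.cos(x)


if __name__ == "__main__":
    print(c_cos(0.0), math.cos(0.0))
```

      Declaring `argtypes` and `restype` matters: without them ctypes assumes integer arguments and return values, which silently gives wrong answers for functions that take or return doubles.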

    55. Re:Python by khellendros1984 · · Score: 1

      It was a school assignment. The professor told us to choose our own language to implement it in. I chose poorly.

      --
      It is pitch black. You are likely to be eaten by a grue.
    56. Re: Python by Anonymous Coward · · Score: 0

      My niece is a doctoral candidate in genetics. She used Python in her master's and doctoral research. Not a programmer by any stretch, but she learned what she needed and became the go-to person for her advisor.

    57. Re:Python by techno-vampire · · Score: 2

      I agree with you that doing the number crunching is best in a language designed for that but I don't think C is the answer because it was primarily designed for systems programming, not numeric. If you really need efficient number crunching, go with FORTRAN, especially as the OP says that he already has experience with it.

      --
      Good, inexpensive web hosting
    58. Re:Python by rwa2 · · Score: 2

      Yep. High level languages such as python are great for letting you focus on the domain-specific task you want to accomplish without spending years learning all the little poorly-documented compiler-specific idiosyncrasies of compilers and preprocessors and template languages. Once you're through the prototyping phase and have your interface definitions and unit tests set up, you can then toss things one module at a time over to one of those software weenies to turn into hand-optimized production code. And they'll probably be happier since they don't have to tax their communications skills talking to project managers while trying to figure out what's going on from a nebulous requirements definition document.

    59. Re:Python by Anonymous Coward · · Score: 0

      If you need speed automagically with a numpy based function, try numba.

    60. Re:Python by elashish14 · · Score: 1

      As long as his friend isn't writing the actual code himself, then a scripting language should be perfectly sufficient. I know C, C++, Java as well as anyone else I know, and do a considerable amount of development in them, but when I was running simulations in graduate school, I didn't touch C++ once - all I used was Scheme (the front-end to my simulation package), Python (glue) and bash (a little more glue). Worked great for me, and I'm a fine developer in any mainstream language.

      --
      I have left slashdot and am now on Soylent News. FUCK YOU DICE.
    61. Re:Python by Anonymous Coward · · Score: 0

      Except that usage of python is increasing quite fast in many fields, and it does get used in actual research, regardless of what courses teach. I would say the ~70 person physics experiment I work on now has 80+% of people writing in python, with most of the remainder using Matlab and IDL because they have too much older code around already in those languages. About maybe 5 or less people in that group use Fortran and C for writing numerics code for the python and Matlab projects, or for embedded code. And I'm one of them, because I grew up with C and that is what I am most comfortable with. Regardless of whether python is the future or should or shouldn't be, it is in heavy use in some fields of science now.

    62. Re:Python by lennier1 · · Score: 1

      True!

      Python seems to be especially popular among mathematicians.
      Thinking back to my university days and how Matlab tended to piss me off at every opportunity, I can't really fault them for that.

    63. Re:Python by sg_oneill · · Score: 1

      VB (We're referring to VB6 here right? I know very little about vb.net except I don't know any coders who use it) wasn't a bad language because it was accessible to beginners. It was a bad language for the same reason PHP is held in disrepute. It's because many of the underlying assumptions are poorly thought out.

      VB encouraged poor type hygiene, made it difficult to abstract business processes in any sort of rational way (You tended to just drop the business logic under the buttons and glue com objects in for functionality) and encouraged a style of programming where people would just draw up the screens and then put in just enough code to make it behave like the spec. This is like old world PHP where people would just mess in logic and presentation, and end up with an awful unmaintainable mess with SQL injections, magic globals and other sorts of horrors.

      Python is indeed accessible to the beginner, but its accessible because its concise and readable. It encourages proper abstraction, has a clean and readable OO system, and a great library of USUALLY well written libraries.

      It has shown a few wrinkles adapting to some more recent ideas such as closure oriented programming (its the wrong choice for that) but its combination of computer-sciency abstraction and accessibility is the reason for its popularity with new coders, NOT its encoding of bad coding work flow.

      --
      Excuse the Unicode crap in my posts. That's an apostrophe, and slashdot is busted.
    64. Re:Python by Anonymous Coward · · Score: 0

      I think your friend is mistaken.

      Why? When you use SciPy to do number crunching, the heavy lifting is being done by compiled FORTRAN and C code. That's not slow.

      The genetics guys might have their own special libraries that do things they need, and again Python can wrap those.

      We aren't talking about starting over from scratch and writing your own scientific library in pure Python. That would be glacially slow (unless, maybe, you used PyPy to run it... PyPy is amazing). Plus, why throw away good libraries that have already been debugged?

      Perl and Python are used more for glue code,

      Right: SciPy is a bunch of FORTRAN and C code glued together in Python.

      R is used heavily for statistics

      I'm not a stats guy or a science guy, but from what I have heard, Python with the Pandas library is a good replacement for R. I hear that R is very good at its particular problem domain, but not so good for general-purpose stuff; while Python with Pandas is pretty good at the same things R is good at, while still being a very usable all-around language.

      http://pandas.pydata.org/

    65. Re:Python by Anonymous Coward · · Score: 0

      Stay with FORTRAN77 dude. The new stuff is for kids.

    66. Re:Python by shutdown+-p+now · · Score: 1

      That was precisely my point. Python is a language that's designed (among other things) to be accessible to beginners, and it's designed right for that. VB is a language with a similar goal (which it arguably reached), but with bad, inconsistent design that promoted numerous bad practices.

    67. Re: Python by tolkienfan · · Score: 1

      Python is easy to learn, flexible and highly descriptive. It doesn't have undefined behavior like C, and there are libraries to assist with almost anything, especially scientific computing. There is even an MPI implementation for python. Just avoid tabs. Personally I use Octave and C depending. But for this purpose, C is too difficult. Not to be insulting - C has all kinds of quirks and is low level. The benefits aren't big enough for many things. Python is much quicker to become productive. Octave is even more descriptive for vectors and matrices, but less in other areas like IO and OO. Either of those would be my advice. Just make sure you vectorise the hell out of it. If you write a for loop you're incurring several orders of magnitude penalty in performance.
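
      As a rough illustration of the vectorisation advice above, here is a minimal sketch, assuming NumPy is installed. The function names and the sum-of-squares task are made up for the example. Both functions compute the same value, but the second hands the whole loop to compiled code instead of stepping through it in the interpreter.

```python
import numpy as np


def sum_of_squares_loop(values):
    """Pure-Python loop: every iteration goes through the interpreter."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_vectorised(values):
    """One NumPy call: the loop runs inside compiled code."""
    arr = np.asarray(values, dtype=np.float64)
    return float(np.dot(arr, arr))


data = np.arange(1000, dtype=np.float64)
loop_result = sum_of_squares_loop(data)
vector_result = sum_of_squares_vectorised(data)
```

      How large the gap is depends on the array size and the operation, so timing both versions on your own data (for instance with the standard `timeit` module) is the only reliable check.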

    68. Re:Python by Anonymous Coward · · Score: 0

      (Or just plain C, but I can't see any good reason to inflict that on myself or anyone else.)

      1. simplicity (with major caveats depending on what you are doing)
      2. more importantly, portability

      Consider the merits of speaking about stuff you know, instead of spreading your hunches in the company of experts. Let others talk about the stuff they know well.

    69. Re: Python by tolkienfan · · Score: 1

      I've written some very dense code quickly in Perl. I can't read it now, of course...

    70. Re:Python by Anonymous Coward · · Score: 0

      I've never had Sage "cap out" on me as a project grows larger, and have a Python/Cython/C++ project that easily scales to terabytes.

    71. Re:Python by hyperfl0w · · Score: 1

      I have a friend who works for a company that does gene sequencing and other genetic research and, from what he's told me, the whole industry uses mostly python.

      I work in a Gene Sequencing company and the current debate is: Python vs Clojure vs Scala:
      Python unquestionably has the best bioinformatics support. Period. No debate. But the lack of language features has irked many programmers, myself included.
      Clojure is gaining attention as a pure functional language with an R-like environment for large scale machine learning tasks. (Bioinformatics is machine learning applied to biomedicine). This makes porting Matlab/Octave/R code much easier, or so the thinking goes.
      Scala has its backers. Java is nearly invisible in the bioinformatics world, but the JVM is hard to ignore. Scala has excellent support for Machine Learning but terrible support for "biological and medical applications". Hat tip: you can hire Scala programmers or teach Scala to Java programmers in a short time.


      BIG DATA is a bigger problem for us today than previously:
      "genome wide arrays" used to mean all ~25,000 gene transcripts or 500,000 single DNA changes (SNP).
      These revolutionary technologies are already considered "old".
      The rise in performance and drop in cost of DNA sequencing is much faster than the commodification of CPUs during the internet dot-com race.

    72. Re:Python by hyperfl0w · · Score: 1

      a complete answer would be Python and C++, because numpy/scipy can't do everything and Python is still very slow for number-crunching.

      The problem with using the mix (when you actually write the C++ code yourself) is that debugging it is a major pain in the ass

      +1 "I just spent three days chasing down a build error that uses numpy/scipy"

    73. Re:Python by Jane+Q.+Public · · Score: 1

      "If you really need efficient number crunching, go with FORTRAN, especially as the OP says that he already has experience with it."

      That's true but there are a couple of problems with it. First, to this day, FORTRAN's input-output is still an abomination. And second, modern languages like Python and Ruby can import and work directly with C libraries, but as far as I know Fortran does not play nicely with them.

      I agree that efficient number crunching is Fortran's forte. But I suppose it depends on what you are trying to do.
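
      For what "work directly with C libraries" looks like on the Python side, here is a minimal sketch using the standard `ctypes` module to call `cos` from the system C math library. It assumes a Unix-like system; the fallback to symbols already loaded in the current process is an assumption that holds on typical Linux builds but is not guaranteed everywhere.

```python
import ctypes
import ctypes.util
import math

# Locate the C math library; fall back to symbols already loaded in this
# process (an assumption that works on typical Unix-like systems).
libm_path = ctypes.util.find_library("m")
libm = ctypes.CDLL(libm_path) if libm_path else ctypes.CDLL(None)

# Declare the C signature: double cos(double)
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

c_result = libm.cos(0.0)
```

      Declaring `argtypes` and `restype` matters here: without them `ctypes` would treat the return value as a C `int`.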

    74. Re:Python by Anonymous Coward · · Score: 0

      I guess you compile portions of your Mathematica code. I did some numerical computing with both it and Fortran-77. Mathematica ran about 10,000 times slower than the equivalent Fortran code (1 minute Fortran-77 = 1 week Mathematica).

    75. Re:Python by Anonymous Coward · · Score: 0

      VB wasn't horribly designed either, it just wasn't designed *for you*. VB was great at what it did, which was let non-CS people design & write simple software. The problem was when management decided it was a solution to *every* development need, which isn't an issue of VB itself, but an issue with the fact that "No Silver Bullet" isn't required reading for anybody who writes software or manages people who write software.

    76. Re:Python by ILongForDarkness · · Score: 1

      Image processing in the biology space at least seems to be a combination of Matlab and applications developed in Java in my experience. When I was going through (2004), physics was still mainly C or Fortran land. Go with a C-style language and you won't go wrong. I'd suggest C++ or C# to pull in the object oriented side of things and huge libraries. Computationally intensive data mining: lots of data or lots of post processing required? If it is just lots of data you'll be limited mainly by disk/network speed and languages will be relatively less important. If it is a lot of crunching you'll probably want a lower level language like C/C++ or a mature VM language like Java/.Net, as none of the popular scripting languages can touch them for raw performance.

    77. Re:Python by ILongForDarkness · · Score: 1

      Not saying it is right but in my experience in the scientific programming space unit tests essentially never get written. You stare into space and try to figure out what is wrong with your logic. You use printf statements. But you are hardly getting paid in the first place so spending a bunch of your time writing unit tests for code that you likely will be the only one that uses for ~1 year before you write your thesis just doesn't happen. There are just too many papers, conferences and beers to take part in. There are exceptions of course, like if you are working on a project that is generating tools for the field to use (say image processing tool, n-body simulation framework etc) but those jobs are relatively rare compared to the "get data from this piece of equipment and generate a time series of fourier transforms of the data" type things that are essentially one offs.

    78. Re:Python by shutdown+-p+now · · Score: 1

      Like I said, VB was horribly designed, which is orthogonal to it being an easy language to use. Yes, it let non-CS people design & write simple software. There are other, better ways to let non-CS people design & write simple software, which do not involve teaching them bad habits.

    79. Re:Python by Anonymous Coward · · Score: 0

      I'm a grad student who works with data mining too. We all use Python. You should also check out Scikit for machine learning with Python and NLTK if you're doing stuff with natural language. My lab uses Java a lot too, but from a data mining standpoint, that's mostly for Weka - a nice machine learning toolkit.

    80. Re:Python by Anonymous Coward · · Score: 0

      Oh, and of course R. R is terrible to use, but lots of people still do and you'll probably encounter it frequently. Might as well learn. I find using it to be frustrating, but I had to suck it up and do it anyways.

    81. Re:Python by Smerta · · Score: 1

      Is there any Perl that /doesn't/ look like the output of AES? I look back at some of my one-liners from a year ago, and it's like, "WTF?!?!"

    82. Re:Python by Anonymous Coward · · Score: 0

      Well, sure. You want fast, do it in C. Many of the perl Crypt::* libraries are written in C.

    83. Re:Python by khellendros1984 · · Score: 1

      Speed wasn't a requirement. Input for the assignment was something like a 16-byte key, with the message following (or an equivalently-simple arrangement). Output was the decoded message...which was a mixture of ASCII art and cryptography-related quotes.

      --
      It is pitch black. You are likely to be eaten by a grue.
    84. Re:Python by chrismcb · · Score: 1

      Noooo! Not a Formula Translator!

    85. Re:Python by wanax · · Score: 1

      I have little idea what works for supercomputers and highly parallelized data analysis (I've never used one).. I work on data sets that tend to have memory bottlenecks, which I think describes a lot of exploratory data analysis activity... and in that framework, I've found one major advantage of mathematica is that I can leave the data intact, while creating a lot of code that accesses it in multiple forms, due to mathematica's ability to process the symbolic instructions before querying the dataset.

      In terms of price of the shiny, I bought my initial license for mathematica for $500.. I've paid on average about $120/year for two licences (work and home) 8 and 6 core respectively.. It's hardly an expense.

    86. Re:Python by stoatwblr · · Score: 1

      Which is why a lot of space science is done using fortran (despite the vast shortcomings of Gfortran compilers)

      A couple of branches of space science use IDL and other monstrosities for number crunching... Which leads to such inanities as someone writing an IDL program to call WGET 200,000 times to fetch files from an archive when one wget command would do the job.

      People write in what they're used to writing. It's very hard to break habits.

    87. Re:Python by Garridan · · Score: 1

      I have little idea what works for supercomputers and highly parallelized data analysis (I've never used one).

      Oh ok. So when you said Sage is okay for "small-midsize projects" and recommended Mathematica for large projects? You've only used Mathematica for small projects, and have no idea what a large-scale project is. And your pricing info is woefully outdated if you're recommending this to a new user.

    88. Re: Python by cheesybagel · · Score: 1

      There are libraries for C as well, like ATLAS or the Intel MKL. MPI implementations are usually written in C. You are just invoking a C library from inside python.

      I agree that it is easier to write a prototype in python for such an application but I haven't had great experiences writing large python programs...

    89. Re:Python by Anonymous Coward · · Score: 0

      If that's what I had to do, I'd pay the $100 for a MATLAB license. Our math department had a site license for it and doing something like generating a time series of Fourier transforms on a dataset is something you can write up in about a minute. I'd always check my homework answers that way for my Fourier Analysis II class.

    90. Re:Python by ILongForDarkness · · Score: 1

      Depends on the scale of the problem. I worked on things that took ~200k CPU hours back in the day when CPUs were single core. You couldn't sacrifice 2-10X the speed for ease of development. Also, professors were of the opinion that unless you coded it yourself you don't really understand it.

    91. Re:Python by occasional_dabbler · · Score: 1

      The Fortran I'm writing is pretty much pseudo code. It's only the lowest level math stuff that looks exactly the same in any language, but is run thousands of times per loop. All you need of the language itself is subroutine(inputs, output) and you're done.

      --
      "Our opponent is an alien starship packed with atomic bombs," I said. "we have a protractor"
  2. Fortran by Anonymous Coward · · Score: 2, Insightful

    sorry to say, but that is a fact

    1. Re:Fortran by Anonymous Coward · · Score: 0

      This is the correct answer.

    2. Re:FORTRAN by Anonymous Coward · · Score: 2, Insightful

      Yeah, sure.

      So that no one can ever check your models or replicate your results even if you publish code and initial data.

    3. Re:Fortran by shutdown+-p+now · · Score: 2

      It depends on what exactly his computationally intensive part is. It may be something that can be trivially implemented in Python in terms of standard numpy operations, for example, with performance that's "good enough".

    4. Re:Fortran by the+gnat · · Score: 0

      Sure, if you don't care about having your code be maintained or extended by anyone under age 30, don't plan on doing any custom visualization beyond GNUplot, and don't care if you ever find employment outside of academia.

    5. Re:FORTRAN by jythie · · Score: 1

      Was that supposed to be a crack about popularity? Because auditing fortran is no worse than most other languages, and it can be argued that fortran is better than most in terms of being able to validate models.

    6. Re:FORTRAN by Frosty+Piss · · Score: 5, Interesting

      Clearly you are not involved in serious science.

      And if you think FORTRAN is some ancient esoteric language, you're ignorent as well. The most recent standard, ISO/IEC 1539-1:2010, informally known as Fortran 2008, was approved in September 2010.

      Fortran is, for better or worse, the only major language out there specifically designed for scientific numerical computing. Its array handling is nice, with succinct array operations on both whole arrays and on slices, comparable with matlab or numpy but super fast. The language is carefully designed to make it very difficult to accidentally write slow code -- pointers are restricted in such a way that it's immediately obvious if there might be aliasing, as the standard example -- and so the optimizer can go to town on your code. Current incarnations have things like coarray fortran, and do concurrent and forall built into the language, allowing distributed memory and shared memory parallelism, and vectorization.

      The downsides of Fortran are mainly the flip side of one of the upsides mentioned; Fortran has a huge long history. Upside: tonnes of great libraries. Downsides: tonnes of historical baggage.

      If you have to do a lot of number crunching, Fortran remains one of the top choices, which is why many of the most sophisticated simulation codes run at supercomputing centres around the world are written in it. But of course it would be a terrible, terrible, language to write a web browser in. To each task its tool.

      --
      If you want news from today, you have to come back tomorrow.
    7. Re:FORTRAN by Anonymous Coward · · Score: 0

      Having to read old FORTRAN is not a pleasant experience for someone who figures they can generally read languages without hitting a book of some sort.

    8. Re:FORTRAN by Anonymous Coward · · Score: 0

      See: http://www.ieeeghn.org/wiki/index.php/Oral-History:David_Kuck

    9. Re:FORTRAN by jonesy16 · · Score: 1

      Agreed. There are also OpenMP implementations for doing your parallel processing. If you're running on a Xeon processor then I would SERIOUSLY consider Intel's linux fortran compiler as it will provide the best performance by far.

    10. Re:FORTRAN by Anonymous Coward · · Score: 0

      See also:
      http://techpubs.sgi.com/library/dynaweb_docs/0640/SGI_Developer/books/OrOn2_PfTune/sgi_html/ch07.html#id20090
      https://fs.hlrs.de/projects/craydoc/docs/books/S-3901-72/html-S-3901-72/xdwkledm.html#z948490578memily
      http://books.google.de/books?id=qJmGnpKwPKAC&pg=PA522&lpg=PA522&dq=fortran+loop+reordering+example&source=bl&ots=KgtcZlpxVa&sig=lhPRA_tXjhr7NLoZSMlbKb2hTLM&hl=de&sa=X&ei=PiZgUs6XKpPy7AbOt4DQBQ&ved=0CF0Q6AEwBTgK#v=onepage&q=fortran%20loop%20reordering%20example&f=false

    11. Re:FORTRAN by K.+S.+Kyosuke · · Score: 1

      Before you C++ kids want to tell me something, read up on that Mr Kuck and his optimizers. Fortran optimizers did things about 20 years ago which C++ optimizers still cannot do.

      Such as? (Please understand that I'd opt for Fortran instead of C++ for numerics any day of the week myself. But I think this is mostly a fallacy nowadays - I'm pretty sure the Intel stuff shares a major part between the two compilers.)

      --
      Ezekiel 23:20
    12. Re:FORTRAN by Anonymous Coward · · Score: 0

      Fortran and C++

      Learn Fortran so you can read and understand the old codes. Then learn C++ so you can rewrite those old codes in a better language.

      If you're doing experiments, learn LabView.

    13. Re:FORTRAN by Frosty+Piss · · Score: 1

      Having to read old FORTRAN is not a pleasant experience for someone who figures they can generally read languages without hitting a book of some sort.

      Having some experience actually writing Fortran code helps...

      --
      If you want news from today, you have to come back tomorrow.
    14. Re:FORTRAN by Anonymous Coward · · Score: 0

      Grandpa, there is so much more on the Web for Python. Libraries like NumPy, and SciPy make scientific computing so much easier! Python is what most folks are using now.

      And the thing is, with Fortran, it was used long ago so the only science libraries it has is for Newtonian physics, Astrology, and some geographical things for positioning - on a flat Earth.

      Anyway, there's a Matlock marathon on and they are serving Banana Pudding in the TV room!

    15. Re: Fortran by Anonymous Coward · · Score: 0

      Indeed.

    16. Re:FORTRAN by shutdown+-p+now · · Score: 1

      It's totally possible to use Python and Fortran side by side. Fortran for heavy computational tasks, Python (with numpy) for glue wrapper code that loads the data and massages it into the desired shape before handing it over to that super-fast Fortran routine, and then visualizes the result.

    17. Re:FORTRAN by Obfuscant · · Score: 4, Informative

      Upside: tonnes of great libraries.

      Those great libraries are spread across several different "FORTRAN"s. gfortran. gfortran44. Intel's fortran. f77. f90. PGI pgif90. etc. etc etc.

      Gfortran is woooonderful. It allows complete programming idiots to write functional code, since the libraries all do wonderful input error checking. Want to extract a substring from the 1 to -1 character location? gfortran will let you do it. Quite happily. Not a whimper.

      PGI pgif90 will not. PGI writes compilers that are intended to do things fast. Input error checking takes time. If you want the 1 to -1 substring, your program crashes. PGI assumes you know not to do something that stupid, and it forces you to write code that doesn't take shortcuts.

      So, if you get a program from someone else that runs perfectly for them, and you want to use it for serious work and get it done in a reasonable amount of time so you compile it with pgif90, you may find it crashes for no obvious reason. And then you have to debug seriously stupidly written code wondering how it could ever have worked correctly, until you find that it really shouldn't have worked at all. They want to extract every character in an input line up to the '=', and they never check to see if there wasn't an '=' to start with. 'index' returns zero, and they happily try to extract from 1 to index-1. Memcpy loves that.
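
      The same class of bug is easy to reproduce outside Fortran. A sketch in Python (not the original code, which isn't shown here): `str.find` returns -1 when the separator is missing, and slicing up to -1 quietly drops the last character instead of failing. The function names are hypothetical.

```python
def key_unchecked(line):
    """Take everything before '=' without checking that '=' exists."""
    return line[:line.find("=")]


def key_checked(line):
    """Same extraction, but fail loudly when there is no '='."""
    pos = line.find("=")
    if pos == -1:
        raise ValueError(f"no '=' in input line: {line!r}")
    return line[:pos]


good = key_unchecked("nsteps=100")   # 'nsteps'
bad = key_unchecked("nsteps 100")    # 'nsteps 10', silently wrong
```

      The unchecked version never crashes, which is exactly the problem: the bad input only shows up later as a wrong answer.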

      The other issue is what is an intrinsic function and what isn't. I've been bitten by THAT one, too.

      And someone I work with was wondering why code that used to run fine after being compiled with a certain compiler was now segment faulting when compiled with the same compiler, same data. Switching to the Intel compiler fixed it.

      Sigh. But yes, FORTRAN is a de-facto standard language for modeling earth sciences, even if nobody can write it properly.

    18. Re:FORTRAN by shutdown+-p+now · · Score: 2

      It's mainly due to more constraints that Fortran places on data structures, e.g. lack of aliasing, that let the optimizer do a better job.
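
      To see why aliasing matters to an optimizer, here is a small Python sketch (an illustration only, not Fortran and not compiler output; `add_one` is a hypothetical name). The same loop gives different results depending on whether its two arguments overlap in memory, so a compiler that cannot rule out overlap has to run the iterations strictly in order.

```python
from array import array


def add_one(dst, src, n):
    """dst[i] = src[i] + 1 for i in 0..n-1, one element at a time."""
    for i in range(n):
        dst[i] = src[i] + 1.0


# Case 1: separate buffers. Iterations are independent of each other.
src = array("d", [0.0] * 4)
dst = array("d", [0.0] * 4)
add_one(dst, src, 4)
separate = list(dst)

# Case 2: dst and src are overlapping views of one buffer, shifted by one.
# Each iteration now reads the value the previous iteration wrote.
buf = array("d", [0.0] * 5)
view = memoryview(buf)
add_one(view[1:5], view[0:4], 4)
overlapping = list(buf)
```

      Fortran lets the compiler assume the first case for dummy arguments; C's `restrict` is a way of making the same promise by hand.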

    19. Re:FORTRAN by Anonymous Coward · · Score: 0

      Most people answering seem to be very young and just picked up whatever's the latest fad. No one's even mentioned OpenMP, which would give him C's efficiency with ridiculously easy parallelization, or OpenMP if he needs somewhat more complex concurrency modeling. These two, along with FORTRAN, are all really great options that are also extremely well-known among anyone who does significant scientific computing.

    20. Re:FORTRAN by Anubis+IV · · Score: 2

      Not really. My first job while still green and fresh out of high school was an internship with Lockheed Martin, working on hundreds of thousands of lines of meteorological software code that was used by NASA and was written in FORTRAN. I went in without ever having seen it before in my life, and was able to pick it up easily enough so that I was productive within a couple of weeks. I recall that having the first few columns of each line reserved for special uses threw me off the first time I saw it, as did parsing data, but I got used to it easily enough. I later did that same internship the next summer, took a class on FORTRAN during my time at university, and later, while in grad school, ended up as the Teaching Assistant on that class the very last semester it was ever offered at my university.

      So, I think it's fair to say that I've been exposed to it more than most people in the under-30 crowd, though I've never been at a point where I'd consider it my primary language or my go-to language when I want to get something done. Even so, for what it's designed to do, it's hard to compete with it. I haven't had a reason to use it in at least five years, but were I involved in scientific computing that relied on number-crunching, I'd certainly consider it seriously. To not do so would be foolish, I think.

    21. Re:FORTRAN by Anonymous Coward · · Score: 0

      Oops, I meant MPI instead of the second "OpenMP."

    22. Re:FORTRAN by K.+S.+Kyosuke · · Score: 1

      Haven't the aliasing issues been ameliorated in the newer C/C++ revisions? I'm pretty sure that C99 has at least restricted pointers. Of course, it blows up into your face if you, in fact, do alias when calling such a routine with improper data, but at least it shouldn't block the optimizations.

      --
      Ezekiel 23:20
    23. Re:Fortran by ebno-10db · · Score: 2

      if you don't care about having your code be maintained or extended by anyone under age 30

      1. There are plenty of programmers over age 30.
      2. Someone who is 30 today, likely finished his BSc in 2005. Do you think Fortran was much more popular then?
      3. People under age 30 learn Fortran if they're involved in HPC. It's still widely used, and has advantages over C/C++ (easy, built-in parallelization, etc.).

      don't plan on doing any custom visualization beyond GNUplot

      There are lots of other programs you can use besides GNUplot. In serious HPC graphics are often considered a back end that runs separately from the main program, and sometimes on a different machine.

      don't care if you ever find employment outside of academia

      1. You don't know what his major is - he may care less about putting the programming language du jour on his resume. In fact he specifically said "I haven't done ANY programming in about 12 years ... I am not a CS major".
      2. If you do HPC, Fortran could be a very useful thing to put on your resume. All the more because it's obscure these days.

      Don't think that whatever kind of code you write is the be all and end all of programming.

    24. Re:FORTRAN by ebno-10db · · Score: 1

      When's the last time you tried to read Fortran, and what version was it? I originally learned Fortran IV (aka '66 -- '77 had come out but we didn't have a compiler for it). I cursed and moaned about it all the way. Some years later I had to work on a Fortran program and was dreading it. To my pleasant surprise it wasn't bad, because it was Fortran '90.

    25. Re:Fortran by Anonymous Coward · · Score: 0

      don't care if you ever find employment outside of academia.

      You should learn what is the best language for your current job, not for the next. Like most people, after the first language or two, you should realize you can pick up more languages as needed if the problem you work on changes. And I've seen plenty of Fortran code outside of academia, since if you are going to specialize in writing number crunching code, you probably won't end up going into random business logic programming.

    26. Re:FORTRAN by mbkennel · · Score: 1


      Fortran 95 or 2003 is better than C++ for numerical computation, if you don't need to do complex I/O or interact with data base libraries with C interfaces (and without Fortran interfaces).

    27. Re:FORTRAN by shutdown+-p+now · · Score: 1

      C++ still doesn't have anything like "restrict", though. And in C land, I believe that optimizers still don't make good use of it, probably because it came relatively late in the game (though I wonder about the newer compilers like Clang which should, in theory, be designed for the new standard from the ground up).

    28. Re:Fortran by The_Wilschon · · Score: 2

      This is exactly the right answer. Never write code that someone else has already written. If you can compose standard operations to do your calculations, then do so in a high-level language. Spend more time thinking and less time coding. OTOH, if you need to code up something custom and you're REALLY sure that you can't use standard operations to do it, then think again about whether or not you can do it with standard operations. You probably can. But, if you can't, then go with FORTRAN. Or maybe C or even C++. But probably FORTRAN. But even then, code as little as you can in FORTRAN. Don't write the whole thing in FORTRAN. Create small operations, and compose them in a high-level language as if they were the standard operations.

      --
      SIGSEGV caught, terminating

      wait... not that kind of sig.
    29. Re:FORTRAN by Bill_the_Engineer · · Score: 1

      Excellent synopsis. I would like to add that as a whole it is hard to beat the speed of Fortran in scientific computing.

      However if you are like most grad students (or even established scientists) and need to do quick "back of the napkin" calculations for your thesis, I'd recommend IPython with numpy and matplotlib. It is nowhere near the speed of fortran, but you more than make up for it in how quickly you can throw a routine together, and for one-time or very occasional use speed isn't really a factor. Also nothing prevents you from fleshing out the algorithm and writing it in fortran, or even better, writing the computationally expensive operations in fortran and binding them to python (I've seen two binding libraries for that purpose).

      --
      These comments are my own and do not necessarily reflect the views or opinions of my employer or colleagues...
    30. Re:FORTRAN by Anonymous Coward · · Score: 0

      There are FORTRAN compilers for just about any computer you care to name, some of them are even FREE (as in beer)..

      If you can't read a FORTRAN program and understand the logic, then you don't deserve to be called a software developer. There's no pointers, no dynamic memory allocation, no heaps, no linked lists, etc.

      All you got is arrays, variables, and simple constructs.. (I should correct myself.. having written linked lists, heaps, and pointer code *in FORTRAN*, I guess it's possible to write something incomprehensible)

      It's not like A = B * C + D is something difficult to understand..
      And DO I=1,15 is no more complex than for(i=1;i<16;i++) is it?

    31. Re:FORTRAN by Anonymous Coward · · Score: 0

      Fortran 95 or 2003 is better than C++ for numerical computation

      WRONG!

      There is nothing that can be done in Fortran that can't be done better in C++. Especially numerical computation.

    32. Re:FORTRAN by steveha · · Score: 1

      Seriously consider FORTRAN

      On the other hand, the basic underpinnings of SciPy are FORTRAN. The BLAS and LAPACK libraries, and other fast and well-understood FORTRAN libraries, are "wrapped" by SciPy.

      http://www.scipy.org/scipylib/faq.html#id12

      Using the IPython notebook, you can work with data sets in an interactive way that FORTRAN won't do. But the number crunching is being done for you at FORTRAN speed because it is compiled FORTRAN code that is doing the work.

      http://ipython.org/notebook.html

      --
      lf(1): it's like ls(1) but sorts filenames by extension, tersely
    33. Re:FORTRAN by Anonymous Coward · · Score: 0

      Upside: tonnes of great libraries.

      That's actually why C++ is going to be the language of the future... it sucks, but it has great libraries.

      Fortran, though? It's great. The responses here are pretty solid. It's a tool. Good for some stuff. Bad for others.

      Seriously though, if you're reading this, go learn Python or Matlab or something like that. They aren't the best, aren't the worst, but you're a beginner and the big tools aren't for you yet. You got a bunch of responses saying otherwise, but you'll waste half your time in grad school if you try to learn how to use everyone's recommendations.

    34. Re: FORTRAN by ZiggyM · · Score: 1

      I agree. Look up the compiler differences between C and Fortran. Fortran lets the compiler know more about things like whether a variable can be put in a register or not (because maybe some other thread has its address and might change it), or whether it can auto-parallelize some array processing, etc. I love C/C++ and I've done large systems in both, but for specific routines Fortran rules.

    35. Re:FORTRAN by Obfuscant · · Score: 1

      I recall that having the first few columns of each line reserved for special uses threw me off the first time I saw it,

      That reminds me of more differences between standard FORTRAN compilers. There are compilers that default to free-form input, so your first few columns don't mean what they used to. There are compilers that default to strict column definitions, so if your line runs past column 72 and it compiles fine with one compiler, don't expect it to compile on any other. And if your free-form input runs past column (I forget which) and it compiles fine, another free-form input compiler will barf because it defaults to accepting only free-form input lines shorter than that.
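
      A quick way to see which lines are at risk before switching compilers is to scan the source for overlong lines. What follows is a minimal sketch in Python, assuming the two limits that usually matter: column 72 for fixed-form source and 132 characters for free-form source (the limit in Fortran 90 through Fortran 2018). The helper name and the sample program are made up for illustration, and individual compilers have flags that change these limits.

```python
# Hypothetical helper: flag source lines that run past a given column limit.
# 72 is the end of the fixed-form statement field; 132 is the free-form
# line limit in Fortran 90 through Fortran 2018.


def overlong_lines(source, limit):
    """Return (line_number, length) for every line longer than `limit` columns."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        width = len(line.rstrip())
        if width > limit:
            hits.append((number, width))
    return hits


# Made-up fixed-form fragment whose second line runs past column 72.
sample = "\n".join([
    "      PROGRAM DEMO",
    "      X = " + "1.0 + " * 12 + "1.0",
    "      END",
])

print(overlong_lines(sample, 72))   # [(2, 85)]
print(overlong_lines(sample, 132))  # []
```

      Tabs are counted as one column here, which is one more place where compilers disagree.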

      This is a day when standard packages for things come with a configure script and it is usually as simple as saying "./configure; make; make install" and you've got a working whatever. I have yet to see ANY modeling code written in FORTRAN come anywhere close. Most require hand editing of the Makefile to pick a compiler, and even when the Makefile contains references to which compiler was used by the original author, that compiler on YOUR system will gag on various parts. I've had to hand-edit the Makefile not to change the definition of FC, but to add author-provided functions to the lists of "compile me with the free-format flag" source, or "use 132 character free form lines".

      This truly is job security for anyone who can debug and decipher crash dumps and wants to work in modeling.

    36. Re:FORTRAN by Anonymous Coward · · Score: 0

      I really liked your post, but it still amused me that you spelt ignorant with an 'e'.

    37. Re:FORTRAN by ebno-10db · · Score: 1

      my go-to language

      Bad pun.

    38. Re:Fortran by Darinbob · · Score: 1

      Depends on environment. If you're going to be on a supercomputer, then Fortran compilers tend to do the most optimization out there. Plus Fortran really is designed for number crunching (even modern versions). On the other hand, most people are getting along just fine with PCs, and if you want to do parallelism, Fortran doesn't necessarily help as much.

    39. Re:FORTRAN by Darinbob · · Score: 1

      Fortran is the most popular scientific computing language. There are many people who will check the models and replicate it that way. Plus there are tons of old Fortran programs and libraries waiting to be used (ie, any decent scientist wanting to check the models or results of older programs will need to know Fortran).

      In some sense, Fortran is like Windows. It's not pretty, everyone wants to do something better, but everyone sticks with it anyway because it's the most common and compatible option.

    40. Re:FORTRAN by Anonymous Coward · · Score: 0

      The point is that aliasing is not even an issue with normal Fortran code. Try explaining the semantics of 'restrict' to a scientist who is not a computer science major. Actually, as an exercise, try explaining it to the Slashdot crowd.
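
      For anyone who wants the short version: aliasing means two names refer to the same storage. Broadly, Fortran forbids passing overlapping arguments to a routine that modifies one of them, so the compiler may assume they are distinct, while C needs 'restrict' as an explicit promise. The sketch below is Python rather than C or Fortran, and the function name is made up; it only shows why the assumption matters, since the same loop gives different answers when the output overlaps the input.

```python
def shift_add(dst, src, n):
    """Set dst[i + 1] = src[i] + 1 for i in range(n), one element at a time."""
    for i in range(n):
        dst[i + 1] = src[i] + 1


# Distinct buffers: every src[i] is read before anything could overwrite it,
# so the iterations are independent and could run in any order.
a = [0, 0, 0, 0]
b = [0, 0, 0, 0]
shift_add(b, a, 3)
print(b)  # [0, 1, 1, 1]

# Aliased buffers: dst and src are the same list, so each write feeds the
# next read and the iterations must run strictly in sequence.
c = [0, 0, 0, 0]
shift_add(c, c, 3)
print(c)  # [0, 1, 2, 3]
```

      A compiler that cannot rule out the second case has to keep the loop sequential; one that can rule it out is free to vectorize it.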

    41. Re:FORTRAN by Anonymous Coward · · Score: 0

      Idiotic "troll" / "flamebait".

      Go back to masturbating to tentacle porn in your mom's basement.

    42. Re:Fortran by fermion · · Score: 1
      I started off in Fortran, and of course it was the language developed for science and maths. I know that it is still used in some places, but the people I know who have been in the technical industry for a very long time seem to be porting away from it.

      Here is what I see. Clock cycles are very cheap, and programmers, and scientists, are not. The advantages of using relatively low level constructs are just not present with computers that basically waste cycles and memory on eye candy.

      So I would have to say that C or Fortran might be justified if one is going to write really intensive simulations or the like, but otherwise everyone seems to be using Python for casual computing, and from what I do with it, it seems to work fine. There are even very good books that teach how to write good scientific code in Python.

      I have written extensively over the years in C, Fortran, and Assembly, among other languages, and right now pretty much just use Python.

      --
      "She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
    43. Re:FORTRAN by Anonymous Coward · · Score: 0

      ...but nothing actually supports the latest Fortran properly, in a bug-free and usable state.
      The closest is probably ifort, but writing object-oriented code in it is not pleasant.
      There are also no development environments with code completion, no standard library of useful data structures, an outdated model for memory management, and a legacy of terrible development practices from 35 years ago.
      Nobody should ever write any new code in any derivative of Fortran. Yes, it has built-in support for easy array handling, but the same operations can be done much more nicely, and at very similar speed, with a clever C++ template library, with the added advantage that this infrastructure can be easily and nicely extended.
      Fortran is a relic

    44. Re:FORTRAN by DanielOom · · Score: 1

      If you like your programs nice and slow (and single-threaded), use Matlab, or Python if you can't afford the fee. For a bit of speed, use C or Fortran. If you're in a hurry, get a couple of heavy GPUs and program these in assembler.

    45. Re:FORTRAN by Anonymous Coward · · Score: 0

      Again agreed, FORTRAN in any flavor for numbers is as close to the bare metal as you can get.

    46. Re:FORTRAN by loufoque · · Score: 1

      FORTRAN is for scientists that aren't software engineers.
      You cannot do advanced optimizations (you do not have fine control of memory nor codegen) or make parallelism scale well with just FORTRAN. It's just a dumb number crunching language. While FORTRAN is still used for supercomputing, most of the time it's just for black box routines with the rest of the application in C or C++.

    47. Re:FORTRAN by Anubis+IV · · Score: 1

      An undesired one, I assure you. I was trying to figure out a different way to phrase it, just because I knew it would be read that way, but nothing came to mind.

    48. Re:FORTRAN by Anonymous Coward · · Score: 0

      In the aerospace industry, almost all analysis software is written in Fortran precisely because it is easier to verify that it is doing what you wanted it to do. It's also of interest that OOP is specifically BANNED from aircraft software!

    49. Re:FORTRAN by mbkennel · · Score: 1


      There are lots of things which C++ can do better than Fortran. Graphs, GUIs, operating systems, drivers, template metaprogramming, compiler message hieroglyphics, segmentation faults, memory leaks, & heisenbugs.

      Numerical computation isn't part of that.

      Part of my paycheck is doing numerical computing with C++. I find the Eigen library and C++ 11 makes things better. F2003 would still be better yet.

    50. Re:FORTRAN by ebno-10db · · Score: 1

      An undesired one

      That's what they all say.

    51. Re:FORTRAN by ebno-10db · · Score: 1

      It's also of interest that OOP is specifically BANNED from aircraft software!

      Everywhere, or just extreme cases like DO-178B Level A? I know most of the F-35's software is written in C++ (though considering how that project is going, that's hardly an endorsement).

    52. Re:Fortran by mjwalshe · · Score: 1

      Yes, as an older Fortran programmer I'd jump at a chance to get into HPC and write Fortran again. Any jobs going in the UK?

    53. Re:FORTRAN by Anonymous Coward · · Score: 0

      You are an ignorant man. Clearly you do not actually do science with computers.

    54. Re:FORTRAN by gajop · · Score: 1

      Fortran is, for better or worse, the only major language out there specifically designed for scientific numerical computing.

      What about Matlab/Octave?

    55. Re:FORTRAN by Anonymous Coward · · Score: 0

      Clearly you are not involved in serious science.

      No I'm not. So when I try to make sense of a quite simplistic plate tectonics model with a view to make a nice android wallpaper or whatever, I stop just short of tearing all my hair off, because the code is just incomprehensible. Having actually written some Fortran doesn't even begin to help.

      Then I have to resort to "serious science", read up a ton of papers on existing models and write the code from scratch.

      So when someone asks which language should he learn - never Fortran. It is unnecessarily hard to get right, readable and reusable. See, performance isn't the only goal here.

    56. Re:FORTRAN by Anonymous Coward · · Score: 0

      You are a MORON who should stay away from computers.

    57. Re:FORTRAN by Anonymous Coward · · Score: 0

      You know the all-caps spelling has been obsolete for over 20 years, right?

    58. Re:FORTRAN by Anonymous Coward · · Score: 0

      Numerical computation isn't part of that.
      Part of my paycheck is doing numerical computing with C++.

      All of my pay check is doing scientific computing in Fortran and there is nothing Fortran can do that C++ can't do better. Especially numerical computing.

  3. English by Anonymous Coward · · Score: 4, Funny

    Obviously.

    1. Re:English by Anonymous Coward · · Score: 0

      Why do you want him to write it in Cobol?

  4. Try the CS department? by Anonymous Coward · · Score: 0

    Why not try tracking down a CS professor and getting paired up with an undergrad student who needs to create a capstone project?

    1. Re:Try the CS department? by Anonymous Coward · · Score: 0

      This is the very worst advice I've read all week. First and foremost, it is common wisdom in scientific computing that it's easier to teach a specialized scientist to program than to teach a programmer enough about the scientific field and specialization to make themselves useful. Second, there's a hugely important distinction between Computer Scientists and programmers. To put it simply: programmers program - Computer Scientists think about programming.

      Think I'm exaggerating? I'm just rounding off a project where we did just that (collaborating with CS and leaving the hard programming part to them), and it was the worst disaster of my career. The Computer Scientists insisted on using hot new technology X and alpha library Y, which were poorly suited for our project, but interesting for them to publish. The result was a 200% time overrun and a final product that was horribly complex and therefore incredibly brittle and probably unmaintainable. We constantly got hit by bugs and instabilities in the "hot new tech" as well as the project itself - getting these borderline inappropriate technologies to work together and perform the task at hand made it look like a Rube Goldberg machine. To make matters worse, the people who did the actual programming just weren't very good at it - very slow, with weak algorithmic problem-solving skills. The PI explained it as 'yes, I know they're the bottom of the barrel, but it's the best we can get in the face of the "brain drain" from industry; we're already paying them almost twice what someone would get for a similar position in a different field of science.' On one occasion, we wanted to have a usability bug fixed. The person who had to do the work just kept on whining for 2 weeks about how difficult it would be, then finally the PI did it himself... in 1 day. All in all, I'm convinced that if I, a scientist, were able to reclaim the time I spent explaining our requirements, and spend that time on doing the programming myself, I could have done a better job at it all on my own.

    2. Re:Try the CS department? by Anonymous Coward · · Score: 0

      First and foremost, it is common wisdom in scientific computing that it's easier to teach a specialized scientist to program than to teach a programmer enough about the scientific field and specialization to make themselves useful.

      I thought it was common wisdom that it is easier to hand a PDE or other mathematical system to a CS type than it is to train a specialized scientist to do proper numerics code and develop efficient solvers. The last collaboration we did with the CS department was because a solver a coworker wrote was painfully slow, and rather unstable over part of the needed parameter range. We got back a simple-to-use Fortran code that took a basic text input and output, was over an order of magnitude faster, and actually worked across the full parameter range we needed. I guess your mileage varied.

    3. Re:Try the CS department? by Anonymous Coward · · Score: 0

      That's just a solver - a clean problem that's easy to describe in mathematical terms. That hardly qualifies as Scientific Computing these days - there are tons of very sophisticated solver libraries readily available to suit nearly everyone's needs. Try simulations of reality (whether it's weather systems on earth, wave functions of molecules, nuclear processes inside a star, or proteins in the cells of your gut bacteria) with models that are sufficiently approximate to finish the big task at hand in an acceptable amount of time on the available supercomputers, yet sufficiently accurate to yield results that are of use. Implementing that kind of stuff requires deep insight into the field of science in question. I know of no scientific software in my field that is written predominantly by computer scientists or professional programmers (though they did have some input in some cases).

  5. Universally, and unambiguously by gwstuff · · Score: 0

    Math

  6. FORTRAN by Frosty+Piss · · Score: 2, Insightful

    Seriously consider FORTRAN

    --
    If you want news from today, you have to come back tomorrow.
  7. MATLAB? by Anonymous Coward · · Score: 1

    Have you looked at Matlab? It's commercial and requires a license, but many universities have a site license you can use. It's pretty powerful and faster than VB, though not as fast as native C/C++. Unless you're running calculations in real time, that probably is not an issue for you.

    1. Re:MATLAB? by golden+age+villain · · Score: 1

      All the labs I know in my field (neuroscience) do most of the data analysis and simulations with MATLAB. It is also used to control hardware for data acquisition.

    2. Re:MATLAB? by Fallen+Kell · · Score: 1

      He already stated he uses Octave which is the open source equivalent to Matlab, it can even read in about 95-98% of all Matlab scripts (the exceptions being a couple of lesser used toolboxes).

      --
      We were all warned a long time ago that MS products sucked, remember the Magic 8 Ball said, "Outlook not so good"
    3. Re:MATLAB? by alexgieg · · Score: 1

      many universities have a site license available for you to use it

      And if not, anyone can download and use its open source clone, GNU Octave.

      --
      Conservatism: (n.) love of the existing evils. Liberalism: (n.) desire to substitute new evils for the existing ones.
    4. Re:MATLAB? by MightyYar · · Score: 1

      MATLAB is an obvious choice unless the license fee becomes an issue. I would caution against doing a GUI in MATLAB, though. (That said, I'm doing one right now...)

      Python is nice, but in my experience fewer people are familiar with scientific computing in Python than in MATLAB. I think if I didn't have that roadblock, I'd use Python (NumPy, SciPy).

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
    5. Re:MATLAB? by joe_frisch · · Score: 1

      We use matlab extensively at SLAC. If the license fee isn't a problem, it is a very powerful language for numerical work. The language includes a variety of high performance toolkits with multi-processor, distributed computing and GPU support.

      OCTAVE and the other free versions really do not substitute for matlab - I'd recommend python instead if matlab is too expensive.

      I've used python as well, and it is OK, but IMHO less optimized in structure for numerical work (even with pylab etc) than matlab, but better for other types of programming.

    6. Re:MATLAB? by spike+hay · · Score: 1

      The problem with Octave is that it just generally sucks balls at everything. It's commonly an order of magnitude slower or more than Matlab. It's compatible with simple scripts that only use basic library functions, but if you start talking about plotting and such, it isn't.

      Of course, Matlab is actually a horrible language itself. Everything is a double. One function per m-file (I know, local functions, but you can't call them from the outside). It's really sad that everyone is so locked into it.

      Julia is the best choice, IMO. R and Numpy/Scipy are also loads better.

      --
      If you don't understand any of my sayings, come to me in private and I shall take you in my German mouth.
  8. Try Julia.. by Anonymous Coward · · Score: 0

    ..seems pretty self-explaining to me.
    http://julialang.org/

  9. Step one: export to a database? by xxxJonBoyxxx · · Score: 1

    >> I initially coded all of my routines in VBA because it 'was there'.

    Are you in Access? Or Excel?

    If your routines work but are just slow, I'd first look at moving the data to SQL Server and porting your VBA routines to VB.NET.

    If you have more time, you may want to learn what the "Hadoop" world is all about.

    1. Re:Step one: export to a database? by K.+S.+Kyosuke · · Score: 1

      If he wrote it in VBA, I'm pretty sure he can rewrite it into a native extension of some kind and use it from the same environment. Some industries love those since you expose the functionality to many users who want or need to work in that user environment.

      --
      Ezekiel 23:20
    2. Re:Step one: export to a database? by Anonymous Coward · · Score: 0

      If you have more time, you may want to learn what the "Hadoop" world is all about.

      Not to be confused with Hardocp

    3. Re:Step one: export to a database? by Anonymous Coward · · Score: 1

      If you have more time, you may want to learn what the "Hadoop" world is all about.

      That's the absolute wrong answer to the question as posed. Hadoop is all about massive parallelization. It's the answer to "How do I throw hardware at a difficult problem?" instead of "How do I solve a difficult problem efficiently."

      I'm not saying that Hadoop isn't extremely useful, but for a student who managed to scrounge up a single Xeon machine, it's entirely ill suited.

    4. Re:Step one: export to a database? by xxxJonBoyxxx · · Score: 1

      >> Hadoop isn't extremely useful, but for a student who managed to scrounge up a single Xeon machine, it's entirely ill suited

      Go back and read the problem again: "would like to be able to make use of all four CPU cores"

      Here's a guy seeking parallelization...and may not know that you don't have to throw big (potentially expensive) multicore processors against the problem - he could throw multiple (cheaper?) computers against it.

    5. Re:Step one: export to a database? by shutdown+-p+now · · Score: 1

      "All four CPU cores" is a scale that's way too small to even begin considering Hadoop.

      Don't use Hadoop - your data isn't that big.
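
      At four cores, a process pool on one machine is usually enough. Here is a minimal sketch using Python's standard multiprocessing module, assuming the job can be cut into independent chunks. The prime-counting function is only a stand-in for the real computation, and all three function names are made up for illustration.

```python
from multiprocessing import Pool


def count_primes(bounds):
    """Count primes in [lo, hi); a stand-in for one CPU-heavy work unit."""
    lo, hi = bounds
    total = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            total += 1
    return total


def split_range(lo, hi, parts):
    """Cut [lo, hi) into contiguous chunks of near-equal size (needs hi > lo)."""
    step = (hi - lo + parts - 1) // parts
    return [(s, min(s + step, hi)) for s in range(lo, hi, step)]


def parallel_count(lo, hi, workers=4):
    """Run one chunk per worker process and add up the partial counts."""
    with Pool(processes=workers) as pool:
        return sum(pool.map(count_primes, split_range(lo, hi, workers)))


if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing this file.
    print(parallel_count(0, 100_000))
```

      Equal-sized chunks are not equal amounts of work here (larger numbers cost more to test), so cutting the range into more chunks than workers would balance the load better.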

  10. More details? by schneidafunk · · Score: 3, Informative

    Depending on your needs, R may be your best bet if it is statistical processing you are interested in.

    --
    Some people die at 25 and aren't buried until 75. -Benjamin Franklin
    1. Re:More details? by Bovius · · Score: 4, Informative

      Second this. There are numerous languages out there that are tailor-made for specific kinds of problems. You didn't quite share enough to narrow down what kinds of problems you need to solve, but the R project is geared toward number crunching, albeit with a significant bent toward statistics and graphic display.

      http://www.r-project.org/

      If that's not pointed in the right direction, some other language might be. Alternatively, there are a lot of libraries out there for the more popular languages that could help with what you're doing. Heck, 12 years ago we didn't even have the boost libraries for C++. It's difficult for me to imagine using that language without them now.

    2. Re:More details? by Vesvvi · · Score: 1

      R is terrific but it's also horribly overused (much like Excel). In my experience R is best when called for very specific calculations within the context of a larger package written in something more general-purpose.

    3. Re:More details? by Anonymous Coward · · Score: 0

      Thirded. There are just soooo many really good libraries available on CRAN and the community is awesome!

      The sqldf library in combination with csv-files imported from Excel saved me from learning that rotten piece of sh**** that VBA for Excel is. My colleagues insist on having raw data in spreadsheets for "compatibility" and look at me suspiciously when I've solved yet another data structure problem of theirs in a few minutes of programming in R - which would take them hours of mousing around and hacking spreadsheets.

      R rocks.
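
      For readers leaning toward the Python suggestions elsewhere in the thread, the same SQL-over-CSV idea can be sketched with nothing but the standard library (csv plus an in-memory SQLite database). This is an analogue of the sqldf workflow, not sqldf itself; the data, the table name and the load_csv helper are invented for illustration.

```python
import csv
import io
import sqlite3

# Made-up measurements standing in for a sheet exported from Excel as CSV.
CSV_TEXT = """sample,group,value
s1,control,1.0
s2,control,3.0
s3,treated,4.0
s4,treated,8.0
"""


def load_csv(conn, text, table):
    """Create `table` from CSV text; the header row supplies the column names.

    The table and column names are pasted into the SQL, so only use this
    with input you trust.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    columns = ", ".join(f'"{name}"' for name in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE {table} ({columns})")
    conn.executemany(f"INSERT INTO {table} VALUES ({marks})", body)


conn = sqlite3.connect(":memory:")
load_csv(conn, CSV_TEXT, "readings")

# Values arrive as text, so cast before averaging; "group" is an SQL keyword
# and therefore has to be quoted.
query = """
    SELECT "group", COUNT(*), AVG(CAST(value AS REAL))
    FROM readings
    GROUP BY "group"
    ORDER BY "group"
"""
result = conn.execute(query).fetchall()
print(result)  # [('control', 2, 2.0), ('treated', 2, 6.0)]
```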

    4. Re:More details? by plopez · · Score: 1

      +1 also. To elaborate. I used R all through my thesis. Nice tool; both gratis and libre, powerful graphing capability, and a host of packages from a huge number of scientific and technical fields. Lots of code examples and a good community.

      --
      putting the 'B' in LGBTQ+
    5. Re:More details? by Anonymous Coward · · Score: 1

      That's funny. It makes sense, but for statisticians it is exactly the opposite: R is just a really crappy general-purpose environment for which several special-purpose libraries have been written, covering everything under the sun. Without the libraries, you'd be crazy to use it these days.

    6. Re: More details? by jonnyj · · Score: 2

      R is by far the best solution that I've found for statistical analysis and data mining. It's ugly, inconsistent, quirky and old fashioned but it's absolutely brilliant.

      The whole syntax of R is based around processing data sets without ever needing to worry about loops. Read up on data tables - not data frames - in R and you'll learn how to filter data, aggregate it, add columns, perform a regression and beautifully plot the results all in one line of code. The zoo package will sort out your time series analysis and longitudinal analysis. With R, you can calculate the statistical significance of your hypotheses and apply the model you've developed to your hold-out sample using built-in functions. And the concept of workspaces means that you don't need to think of funky ways to store your interim results.

      Using knitr, R will produce publication quality documents and presentations. ggplot will give you the best data visualisation tools in the business.

      R is the tool that has been purpose-built for the task in front of you. Anything else might be easier to learn or more widely supported - but it won't be as effective.

    7. Re:More details? by Anonymous Coward · · Score: 0

      Yes, especially if you have lots of data and are doing some sort of non-standard thing that doesn't have a c++ library.

    8. Re:More details? by Anonymous Coward · · Score: 0

      R is a great language. it is ingenious, though rather programmer-unfriendly. it wants to be too interactive, and so has weird inconsistent syntax and semantics at times. worse, its error messages are indecipherable and misleading. and R does not throw an error if you use an undefined variable in a data frame or list. the core team has stopped on major changes. over time, it has become somewhat dysfunctional. this sounds negative, because it's so frustratingly close and yet so far...

  11. And my answer is... by Anonymous Coward · · Score: 0

    Java (for quick prototyping), C++ (port from Java code/structure to fine-tune performance).

    Check with your potential employers what language(s) was(were) used to build their current applications, and what languages (if any) they will port to.

  12. What are you doing? by RichMan · · Score: 3, Informative

    What do you mean by scientific computing?

    Modelling: Hard core finite element simulations or the like. Then C or Fortran and you will be linking with the math libraries.
    Log Processing: A lot of other stuff you will be parsing data logs and doing statistics. So perl or python then Octave.
    Data Mining: Python or other SQL front end.
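
    As a rough idea of what the log-processing case looks like in Python, here is a minimal sketch using only the standard library (re and statistics). The log format, the sensor names and the summarize function are all invented for illustration; a real log would need its own pattern.

```python
import re
import statistics

# Invented log format: "<timestamp> <sensor> temp=<value>C"
LOG = """\
2013-09-01T10:00:00 probe-a temp=20.5C
2013-09-01T10:00:05 probe-b temp=22.0C
2013-09-01T10:00:10 probe-a temp=21.5C
corrupt line that should be skipped
2013-09-01T10:00:15 probe-b temp=24.0C
"""

LINE = re.compile(r"^(\S+) (\S+) temp=(-?\d+(?:\.\d+)?)C$")


def summarize(text):
    """Return {sensor: (count, mean, population std dev)} for well-formed lines."""
    readings = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match is None:
            continue  # ignore anything that does not fit the expected format
        _, sensor, value = match.groups()
        readings.setdefault(sensor, []).append(float(value))
    return {
        sensor: (len(vals), statistics.mean(vals), statistics.pstdev(vals))
        for sensor, vals in readings.items()
    }


for sensor, stats in sorted(summarize(LOG).items()):
    print(sensor, stats)
```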

    1. Re:What are you doing? by Anonymous Coward · · Score: 0

      Second that. We teach C for introductory scientific computing, and modern variants of Fortran are very slick. If it's performance you're after, those two running on a unix/linux architecture is the only game in town.

      Microsoft and high performance computing should never be used in the same sentence.

    2. Re:What are you doing? by UnknowingFool · · Score: 3, Informative

      Well if your problems require statistical computing, R is the language to use. For general scientific computing, the last I checked Octave was still valid. As for multi-core processing, only a few languages and compilers support platforms like OpenMP: Fortran, C, and C++.

      --
      Well, there's spam egg sausage and spam, that's not got much spam in it.
    3. Re:What are you doing? by TheCarp · · Score: 2

      It sounds like you are saying a more specific version of what I was going to post.

      A little research goes a long way and libraries may be more important than language. I don't care how nice the language is.... the less underlying mechanisms I need to implement, and the faster I can get into the meat of what I am working on, the better.

      If you want to do RSA encryption in your code (for example) your best bet is NOT to pick a language where you can't find an RSA implementation (Applesoft basic? lol not sure what that would be these days) and implement your own.

      Sure it's not too bad, but any mistakes could sink you, and it means debugging and supporting yet more code....when you could be using a standard library that lots of other people use and has already had most of the kinks worked out, and gets updated on its own.

      Base languages are all exceedingly similar when you strip away the syntactical sugar. It's the varying quality of the different sections of their libraries that really sets them apart in different areas.

      --
      "I opened my eyes, and everything went dark again"
    4. Re:What are you doing? by shutdown+-p+now · · Score: 2

      Well if your problems require statistical computing, R is the language to use.

      A lot of people seem to be pretty happy with Python+pandas lately.

      (and the advantage of going the Python way is that it's also a general purpose language that's useful elsewhere)

    5. Re:What are you doing? by s.petry · · Score: 1

      I second this selection for the same reasons. Many people dislike having to code for simple Unix commands, so favor a "scripting" language like Python or Perl because it's easy. Performance is never easy, and massive number crunching requires real compiled code not pseudo code. R, C, C++ are all exceptional for number crunching. Hell, I would put Pascal up against Python any day of the week for pure number crunching, and Pascal's syntax for math is very easy to learn.

      --

      -The wise argue that there are few absolutes, the fool argues that there are no probabilities.

    6. Re:What are you doing? by nabsltd · · Score: 1

      That mix pretty much matches what we use where I work.

      We have a 1200-core HPC cluster, used mostly for biomedical applications (gene sequencing, simulations, etc.), so it's not huge by any means, but it isn't tiny either. Our latest toys are 32-core boxes with 1.5TB of RAM.

    7. Re:What are you doing? by Anonymous Coward · · Score: 0

      a few languages and compilers support platforms like OpenMP: Fortran, C, and C++.

      Those aren't platforms. You mention one library and three languages.

    8. Re:What are you doing? by UnknowingFool · · Score: 1

      The question is how to take advantage of multi-core. The problem is at what scale you program this functionality. Certainly you can code for multiple threads; however, working out how best to use the cores is a somewhat manual job. Right now he has only one server. What if he gets a second? He could use a platform like OpenMP and let the platform assist. There will always be some tweaking; the question is whether you want to tackle it all yourself or get assistance. I'm not an expert on the perils of multi-core processing. I would rather get some help. If you want a platform, then you have to bear in mind which languages and compilers you can use.

      --
      Well, there's spam egg sausage and spam, that's not got much spam in it.
  13. IPython Notebook + Python Data Analysis Library by rla3rd · · Score: 3, Informative

    Install these 2 and you'll be good to go
    http://ipython.org/notebook.html
    http://pandas.pydata.org/

  14. Python by Anonymous Coward · · Score: 1

    Try Python. Make sure to use scipy (numpy really), because you don't want to do the heavy lifting in native Python.
    http://www.scipy.org/

  15. what the rest of your team uses by peter303 · · Score: 4, Insightful

    You should all be sharing your codes to avoid rewriting and to perfect it.
    And if you are not a member of a team then I seriously question the quality of your graduate program.

    1. Re:what the rest of your team uses by Anonymous Coward · · Score: 0

      I've been on smaller teams though where what each person is doing is pretty independent from each other. For example, some plasma experiment where each person is in charge of a specific diagnostic or two. Each person will use the results from other diagnostics, but they just ask that person to give them a file with the final data in it, since they want it vetted by the operator of the diagnostic anyway. Without any need to actually run each other's code (or to do so only through passing simple text files), and if common libraries cover your basic needs, there isn't much to gain from using the same language as each other unless one person is going to mentor you with programming. YMMV depending on the project and team structure.

    2. Re:what the rest of your team uses by Anonymous Coward · · Score: 0

      And if you are not a member of a team then I seriously question the quality of your graduate program.

      I do not understand... Please explain.

      What has participation in a 'team' to do with an Individual earning a degree? I am not aware of 'teams' defending a 'group dissertation' ...

      I am sure that You will find this question absurd, but it is the result of honest puzzlement.

  16. BAD TIM! BAD! by girlintraining · · Score: 5, Funny

    What language suggestions or tips can you give me?"

    Timothy, shame on you. You should know better than to start a holy war.

    --
    #fuckbeta #iamslashdot #dicemustdie
    1. Re:BAD TIM! BAD! by Anonymous Coward · · Score: 0

      Can not mod this up enough.
      Your turn!

    2. Re:BAD TIM! BAD! by Anonymous Coward · · Score: 0

      Seriously. Why even ask? I mean everybody knows that *Insert programming language you know best here* is clearly the most superior programming language ever developed. Question answered.

    3. Re:BAD TIM! BAD! by ImdatS · · Score: 1

      Shakespeare http://en.wikipedia.org/wiki/Shakespeare_(programming_language) - my favorite language...

      (and yes, this is supposed to be funny)

    4. Re:BAD TIM! BAD! by ebno-10db · · Score: 1

      Why is the PP marked "funny". It sounds like a serious concern to me.

    5. Re:BAD TIM! BAD! by sensei+moreh · · Score: 1

      Actually, using the programming language you know best may indeed be the most efficient way to go; it depends on the problem.

      --
      Geology - it's not rocket science; it's rock science
  17. Julia lang seems to match your needs. by Anonymous Coward · · Score: 0

    But from what I heard, it's still in development. Does someone know how usable it is atm?

    1. Re:Julia lang seems to match your needs. by spike+hay · · Score: 1

      I'm using Julia for most of my work. It's very usable. There are good libraries for most common tasks now, and for anything that isn't covered, Python functions can be called (with automatic conversion to NumPy arrays, etc.) with no wrapper, using a package called PyCall. This is a feature that Julia developers should really make more noise about.

      For example, suppose I want to do a simple plot of two vectors x and y:
      ________________
      using PyCall # load the PyCall package
      @pyimport matplotlib.pyplot as plt # import the Python module itself
      plt.plot(x,y)
      plt.show()
      -------------
      That's it. A window with the plot shows up, exactly as it would in python.

      Apart from that, Julia has many innovative features. A good type system and multiple dispatch make for a very elegant and fast language. As well, like Lisp, code is data like any other, allowing for useful macros (the @pyimport is actually a macro call). I used to use Matlab, and then Python/NumPy, but I really haven't looked back after moving to Julia.

      --
      If you don't understand any of my sayings, come to me in private and I shall take you in my German mouth.
  18. Are you asking for permission to use fortran? by Anonymous Coward · · Score: 0

    'Cause that's probably the answer if you're having "computation performance problems", maybe even C++/OpenCL if you're feeling really brave...

    On the other hand, why not just throw more hardware at the problem (or wait a little longer)? By the time you have recoded your VBA in something else, I'm betting the VBA code could have solved the problem running on some decent hardware.

    Unless the question is "I wrote my code in VBA and it doesn't scale to a 5k node cluster, what did I do wrong". In that case you aren't really asking the right question.

  19. Fortran (plus MPI and some CUDA) by Anubis350 · · Score: 1

    Fortran, and learn how to implement MPI and CUDA code if your work is parallelizable.

    --
    "goodbye and hello, as always" ~Prince Corwin, from Zelazny's Amber series
    1. Re:Fortran (plus MPI and some CUDA) by GiganticLyingMouth · · Score: 1

      For completeness, it should also be noted that both C and C++ work with MPI and CUDA. Fortran can theoretically be faster than C or C++ as its compiler can optimize more aggressively (due to the lack of pointer aliasing in Fortran), but I don't have any hard data for how much of a difference it would make in actual runtime speeds.

    2. Re:Fortran (plus MPI and some CUDA) by Anonymous Coward · · Score: 2, Insightful

      Fortran, and learn how to implement MPI and CUDA code if your work is parallelizable.

      DO NOT USE CUDA

      Use OpenCL

    3. Re:Fortran (plus MPI and some CUDA) by afidel · · Score: 1

      The biggest deciding factor is usually which language the standard libraries for your field are available in; in many disciplines your base libraries might only be available in Fortran.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    4. Re:Fortran (plus MPI and some CUDA) by nabsltd · · Score: 1

      Better still, learn about the Intel Phi and let the Intel compiler do the parallelization, all without having to learn a different architecture and API.

    5. Re:Fortran (plus MPI and some CUDA) by loufoque · · Score: 1

      A FORTRAN compiler can generate better code than a C compiler for dumb code. The difference is that with C you have the option to optimize your code yourself and actually do a better job than even the FORTRAN compiler.

    6. Re:Fortran (plus MPI and some CUDA) by GiganticLyingMouth · · Score: 2

      Last I checked (a few years ago) CUDA had better tools and more features than OpenCL. Has this changed much since then? OpenCL didn't even support templates back then...

  20. perl ftw by Anonymous Coward · · Score: 0

    Perl should handle literally anything you can throw at it.

  21. 2 paths by johnjaydk · · Score: 3, Informative

    If you can find anything that resembles a math library with the correct tools, then go with Python. NumPy is everyone's friend here.

    If you have to do the whole thing from scratch, then Fortran is the fastest platform. I can't say I've met anyone who enjoyed Fortran, but it's wicked fast.

    --
    TCAP-Abort
    1. Re:2 paths by the+gnat · · Score: 1

      If you have to do the whole thing from scratch, then Fortran is the fastest platform. I can't say I've met anyone who enjoyed Fortran, but it's wicked fast.

      True, but the only place where this *really* matters is programming for repetitive calculations on massively parallel supercomputers. For anything else, there is a tradeoff between program speed and developer speed, and ultimately it's cheaper to buy more computers than hire more programmers.

    2. Re:2 paths by Anonymous Coward · · Score: 0

      First you're assuming the problem can be broken out in parallel, second you're assuming the programmer is getting paid.

    3. Re:2 paths by ebno-10db · · Score: 1

      ultimately it's cheaper to buy more computers than hire more programmers

      We're talking about grad students - seriously different budgetary concerns.

    4. Re:2 paths by mbkennel · · Score: 1


      And assuming the acquisition, configuration, installation and parallelization of the servers has zero cost in time.

      The supervisor of two researchers finds that one gets the job done with very fast code in Fortran 95 in a reasonable time, the other gets a demo up real quick but it is far too slow to be used for practical data reduction, and wants to buy 50 servers to spawn it out with some new package (which he hasn't used yet and is bleeding edge freeware).

      "Well, if sequestration is repealed you can write a justification in the grant renewal in 9 months, and if it gets approved (funding rates are under 7% these days at NSF) we might have money for 2 in 18 months. You could rent some on Amazon on your own money, I guess but we have to check with legal about data security requirements."

      Personally, I want to get my research done now.

    5. Re:2 paths by Anonymous Coward · · Score: 0

      If you can find anything that resembles a math library with the correct tools, then go with Python. NumPy is everyone's friend here.

      If you have to do the whole thing from scratch, then Fortran is the fastest platform. I can't say I've met anyone who enjoyed Fortran, but it's wicked fast.

      Um, yes, it is stupid fast. If you want more speed, just write in *best language*, then optimize the 10% that does the work in assembler. Best of both worlds. I would use (what I am best at) for all the other code, then FORTRAN if I was afraid of assembler.

    6. Re:2 paths by Anonymous Coward · · Score: 1

      Yes. Rephrase the cost of computers in packets of ramen.

    7. Re:2 paths by Anonymous Coward · · Score: 0

      Except that a grad student is supposed to have a limited amount of time to wrap things up. Within the first year or two you might get used as labor toward a bigger goal. But after that, with several advisors including the one I had, if you told them you were going to optimize something or tweak something without expectations of massive gains, they respond with "We can just get you a better computer, you should be getting started/finishing your thesis, don't waste your time on polish if it won't change the end result."

  22. Fortran, R, Matlab, C, Python or Perl, and ??? by Anonymous Coward · · Score: 0

    There's nothing wrong with Fortran or C. There are newer and in some cases more focused languages you may want to check into, like R, Matlab, C++, Python, Perl, or Go. I'm not a fan of the language, and it's not known for raw performance compared to Fortran or C, but there are probably great libraries for what you need in Java.

  23. Java Java! by Latent+Heat · · Score: 3, Interesting
    For research engineering, I use Java to run the numerical examples of the algorithms I develop, although most of the authors in the journals I publish in are using Matlab for this purpose (ewwwwww!). A long time ago I was a Turbo Pascal person, as were engineering colleagues who crossed over to Matlab seeking the same kind of ease of use. Me, I transitioned to Delphi, but now I am with Java and Eclipse -- the Turbo Pascal of the 21st century.

    For numeric-intensive work, I can get within 20% of the speed of C++ using the usual techniques -- minimize garbage collection by allocating variables once, use the "server" VM, perform "warmup" iterations in benchmark code to stabilize the JIT. I use the Eclipse IDE, copy and paste numeric results from the Console View into a spreadsheet program, and voila, instant journal article tables.
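
    The "warmup iterations" idea can be sketched as a small timing harness. This is a minimal illustration in Python rather than Java, and the names benchmark and sum_of_squares are made up for the example; the point is only the pattern of running the kernel a few times untimed before taking measurements, which is what you would do on a JIT runtime such as the JVM.

```python
import time


def benchmark(func, *, warmup=3, repeats=5):
    """Time func(), discarding `warmup` untimed runs first.

    Returns the wall-clock time in seconds of each timed run.
    """
    for _ in range(warmup):
        func()  # untimed: lets caches (and, on a JIT runtime, the compiler) settle
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return timings


def sum_of_squares(n=100_000):
    # Hypothetical stand-in for a numeric kernel.
    total = 0.0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    times = benchmark(sum_of_squares)
    print(f"best of {len(times)}: {min(times):.6f} s")
```

    Reporting the minimum (or median) of the timed runs, rather than the first run, keeps one-off startup costs out of the comparison.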

    1. Re:Java Java! by ebno-10db · · Score: 1

      For numeric-intensive work, I can get within 20% of the speed of C++

      Did you measure it? No snark - I'm curious.

      I would like to see some benchmarks that support the claim that Java is almost as fast as C/C++ for number crunching. I'm in the C/C++ camp, but am willing to openly entertain what you get from good benchmarks. These benchmarks show C++ consistently beating Java.

      What those benchmarks don't cover, but I'd love to see, is a comparison of pure C++, pure Java, Java w/ some wrapped C/C++ (I understand there are some nice packages like that), and NumPy/SciPy/Cython (we know straight Python is glacial compared to Java).

      P.S. This is not necessarily an answer to the author's question, as he isn't clear (and may not be clear himself) about how speed critical his stuff is.

    2. Re:Java Java! by Atzanteol · · Score: 4, Funny

      I tried out those benchmarks myself.

      Java:
      $ time java nbody 50000000
      -0.169075164
      -0.169059907

      real 0m8.863s
      user 0m8.820s
      sys 0m0.016s

      Not too shabby. But check out the C++ times!
      $ time ./nbody.gpp-7.gpp_run
      Segmentation fault (core dumped)

      real 0m0.097s
      user 0m0.000s
      sys 0m0.000s

      OMG that's a ton faster!

      --
      "Ignorance more frequently begets confidence than does knowledge"

      - Charles Darwin
    3. Re:Java Java! by Atzanteol · · Score: 1

      Doh! I'd forgotten the parameter... PEBKAC. I now return you to your regularly scheduled doldrums.

      --
      "Ignorance more frequently begets confidence than does knowledge"

      - Charles Darwin
    4. Re:Java Java! by ebno-10db · · Score: 1

      How many iterations of those tests did you run, and were you using the server version of the JVM?

      Like I said earlier, I'm in the C/C++ camp, but I'm just asking what a Java advocate would probably ask. They're forever talking about "warm up" time, which seems like a big technological step backwards to me. I thought computers used transistors now instead of tubes.

    5. Re:Java Java! by Atzanteol · · Score: 1

      The "warm-up time" is allowing the JIT to gather statistics and optimize the compiled code on-the-fly. The first pass will be slower than the nth pass over the same code. At a certain threshold (number of executions) the JIT will perform more optimizations and the more often the same bit of code is executed the more the JIT learns about how to optimize it. The idea is to not pay the compilation/optimization penalty unless it's worth it.

      Keep in mind that bytecode is platform independent. Each JIT will make different optimizations depending on the local system. But if you stop/restart you pay that penalty again (interesting note: Dalvik for Android saves the compiled versions for later runs).

      Out of curiosity I've run the 'nbody' test on my system with C++ and Java for varying numbers of iterations.

      For 1,000,000,000 iterations C++ took 105.39 seconds and Java 174.15 seconds. That puts Java at about 60% of the performance of C++ in this case. I'm running a 1.8 JVM (still experimental) but I believe it should be representative.
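
      The "about 60%" figure follows directly from the two timings. As a quick check, using only the numbers quoted above (the variable names are just for illustration):

```python
# Timings quoted above for 1,000,000,000 nbody iterations.
cpp_seconds = 105.39
java_seconds = 174.15

# Relative speed: how much of the C++ throughput Java achieved.
relative_speed = cpp_seconds / java_seconds
# Slowdown: how many times longer the Java run took.
slowdown = java_seconds / cpp_seconds

print(f"Java ran at {relative_speed:.1%} of the C++ speed")
print(f"Java took {slowdown:.2f}x as long as C++")
```

      Both numbers describe the same gap; "60% of the speed" and "about 1.65 times the runtime" are two ways of saying it.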

      I'm in the Java camp myself and while I'd like it to perform better the reality is that it simply doesn't. But the reality for me and my line of work is that it easily performs "well enough" and provides enough other benefits to be worth the trade-off.

      --
      "Ignorance more frequently begets confidence than does knowledge"

      - Charles Darwin
    6. Re:Java Java! by ebno-10db · · Score: 1

      I'm in the Java camp myself and while I'd like it to perform better the reality is that it simply doesn't. But the reality for me and my line of work is that it easily performs "well enough"

      I appreciate that "well enough" often applies in the real world, and if you have a lot of Java, know it well, etc., it would be silly to switch (though Scala sounds more interesting, is interoperable, and about the same speed). I also believe you can speed up Java number crunching a lot by using some of the packages where C/C++ has Java wrappers.

      I understand the whole warmup thing w/ the JVM, but it's simply a drawback (even if it doesn't matter much on servers).

      The only thing that irks me is when Java advocates say it "can be faster, is just as fast, or within a hair's breadth of C/C++". I'd love to see some benchmarks that show that, but no one ever seems to produce them.

      'D' might be a good number crunching language, though I haven't tried it. I'll be the first to admit that C++ is a Rube Goldberg, C is primitive, and both give you all the rope you need to hang yourself with (though I rarely find that a problem w/ number crunching). D is kind of C++ done right, and it's a lot safer. It's come a long way, though it still has some rough edges. For instance, it still uses a "stop the world" garbage collector, though there's at least an alpha of a decent generational GC. "The Computer Language Benchmarks Game" that I often cite dropped D for an interesting reason. To reduce the work of maintaining the site they had to drop some languages, and amongst others, chose D because it was so similar to C++.

    7. Re:Java Java! by mjwalshe · · Score: 1

      And how close to a full-on FORTRAN program making use of MPI and CUDA? For a scientific computing problem, C++ has all that OO overhead.

    8. Re:Java Java! by ebno-10db · · Score: 1

      for a scientific computing problem C++ has all that OO overhead

      Only if you choose to use the OO part of it. C++ is very flexible in that regard. Looking for speed? Templates can be very helpful.

    9. Re:Java Java! by BlazingATrail · · Score: 1

      lol did anybody notice the Segmentation fault.. that pretty much sums up the cost of C++. Real programs that execute many iterations warm up quickly in Java, and it's all compiled to assembly by JIT/HotSpot. We've done many benchmarks, and in production numerical code Java is just as fast as C/C++.

    10. Re:Java Java! by Atzanteol · · Score: 1

      As I understand it there are *very* specific use-cases where Java has a speed advantage - and that's in memory allocation on the heap. But no - in general for number-crunching it will not be as fast as C/C++. And I agree that Java advocates don't do themselves any favors by pretending that it is.

      Garbage collection these days is pretty smooth - unless you're short on memory. But there is that overhead of memory usage that will always be there... I was interested in D at one point but it seems to have just gone nowhere.

      In my opinion the thing that C++ needs isn't just a GC but a standard library that is as complete as what Java and C# have. C++11 seems to have done some work there (though I haven't checked it out too much as my world is Java/C# these days) but it should have been done *ages* ago.

      --
      "Ignorance more frequently begets confidence than does knowledge"

      - Charles Darwin
    11. Re:Java Java! by Anonymous Coward · · Score: 0

      In the last three benchmarks on the page you linked, Java runs at 29%, 27%, and 6% lower than C++, and in only two of the eleven benchmarks did Java take more than twice the time of C++.

      This benchmark has Java faster than C++, although to be honest I don't understand the algorithm they're using. http://code.google.com/p/scalalab/wiki/JavaFFTvsNative

      On the other hand, if you look more carefully at the benchmark page you linked, the quick overview chart doesn't provide accurate information on comparative amounts of memory used. If you look at the actual peak amount of memory used, the C++ versions of the benchmarks used less than 10% of the memory used by the Java versions in more than half the benchmarks. So the Java Virtual Machine, when "warmed up" (running a program long enough that the just-in-time optimizations kick in) is getting awfully close to C++ for performance, but the nature of a virtual machine and automatic garbage collection make it much harder to compete on memory use. Depending upon what kind of work you're doing, that difference could be enough to force you to pick C++ over Java.
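
      If you want to look at peak memory for your own code rather than rely on a summary chart, most runtimes have a way to measure it. A minimal sketch in Python using the standard tracemalloc module; the helper peak_bytes and the two toy workloads are made up for the example, and the numbers it gives are for Python allocations only, so they say nothing about the Java or C++ figures above.

```python
import tracemalloc


def peak_bytes(func):
    """Run func() and return the peak traced allocation size in bytes."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def build_list(n=100_000):
    # Materializes every value at once, so peak memory grows with n.
    return [float(i) for i in range(n)]


def running_sum(n=100_000):
    # Same values, but only one is alive at a time.
    total = 0.0
    for i in range(n):
        total += float(i)
    return total


if __name__ == "__main__":
    print("build_list :", peak_bytes(build_list), "bytes at peak")
    print("running_sum:", peak_bytes(running_sum), "bytes at peak")
```

      Two versions with similar run times can differ widely in peak memory, which is the kind of gap a speed-only chart hides.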

    12. Re:Java Java! by CraterGlass · · Score: 1
      Two more good reasons for Java: strong typing and array bounds checking.

      For a scientist who spends more time doing science than being a programmer, boundary conditions and bad type casts are a permanent trap waiting to snap on you. A few years back I had a contract to convert a large mathematical application from Fortran to Java. You'd be surprised how many simple errors of this nature had been lurking in the Fortran code for decades. Incorrect mathematical results are not always obvious.

      Another good reason: arbitrary precision. Java has built-in classes for numbers with very high precision, e.g. BigDecimal.
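
      To show what arbitrary precision buys you, here is a minimal sketch. It uses Python's standard decimal and fractions modules as an analogue, not Java's BigDecimal API, and the precision of 50 digits is an arbitrary choice for the example.

```python
from decimal import Decimal, getcontext
from fractions import Fraction

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so the sum is not exactly 0.3.
float_sum = 0.1 + 0.2

# Decimal works in base 10 at a precision you choose.
getcontext().prec = 50
decimal_sum = Decimal("0.1") + Decimal("0.2")
third = Decimal(1) / Decimal(3)  # rounded to 50 significant digits

# Fraction keeps rational numbers exact, with no rounding at all.
exact = Fraction(1, 3) * 3

print(float_sum == 0.3)
print(decimal_sum == Decimal("0.3"))
print(third)
print(exact == 1)
```

      The trade-off is speed: software decimal arithmetic is far slower than hardware floating point, so it is best kept for the places where the extra digits actually matter.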

    13. Re: Java Java! by tolkienfan · · Score: 1

      You get a performance benefit from preallocating in both C++ and Java, so the whole GC being smooth is irrelevant. If you are constantly allocating and deallocating, you haven't written optimal code. Current compilers (this applies to JITs too) do a terrible job of vectorising for SIMD instruction sets. Take Intel AVX2. That's 256-bit-wide operations, e.g. 8 simultaneous single-precision floating-point multiplies (AVX-512 doubles that to 16). Java can't get close to hand-optimized vector code in C plus assembly. Similarly for GPGPU... Of course, OP wanted to make an improvement without learning something entirely new... but if we're comparing C and Java...

    14. Re: Java Java! by Anonymous Coward · · Score: 0

      Only if the C implementation you're comparing to isn't optimal...

    15. Re:Java Java! by hyperfl0w · · Score: 1

      Honest question: what do you do when you need Math / ML support? Java is astonishingly vacant -- only Weka and Mahout -- but no packages provide logistic regression, variation measures, distance functions, distribution functions, etc.

      How do you do MATH with Java?

    16. Re:Java Java! by mjwalshe · · Score: 1

      But you still don't have a lot of the built-in primitives useful for scientific computing, which is C and C++'s big limitation for the uber-HPC type of applications, plus you don't have the decades of existing code and libraries that Fortran will have.

    17. Re:Java Java! by ebno-10db · · Score: 1

      I agree that Fortran is still the champ for fast number crunching, but I was specifically refuting your point that C++ necessarily carried the overhead of OO.

    18. Re: Java Java! by Atzanteol · · Score: 1

      True - Java basically implements a transparent memory pool for the developer. It's nothing that *couldn't* be done with C/C++ (indeed, it *is* with the JVM!). But with Java you get it "for free." And that also ignores stack allocation in C/C++ which is much faster anyways and impossible in Java. So there aren't really any performance benefits with Java that can't be matched by C/C++ (albeit with a bit more effort). But if performance were all that mattered why do we have any other languages other than assembly?

      Nothing is going to beat hand-tuned perfectly optimized assembly - but you hit diminishing returns on effort/performance very quickly. If you're trying to brute-force a crypto algorithm by all means spend a ton of extra time hand-optimizing that inner-loop in assembly that is specific for your processor. If you're calculating a sum of somebody's paychecks for the last year from a database you're probably fine with spending a few minutes writing it in Python.
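
      The paycheck case really is a few minutes of work. A minimal sketch using Python's standard sqlite3 module; the paychecks table, its columns, and the sample rows are all invented for the example, and "the last year" is taken here to mean a calendar year.

```python
import sqlite3

# In-memory database with a hypothetical schema and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE paychecks (employee TEXT, paid_on TEXT, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO paychecks VALUES (?, ?, ?)",
    [
        ("alice", "2013-01-31", 250000),
        ("alice", "2013-02-28", 250000),
        ("alice", "2012-12-31", 240000),
        ("bob", "2013-01-31", 300000),
    ],
)


def yearly_total_cents(conn, employee, year):
    """Sum one employee's paychecks for a calendar year, in cents."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount_cents), 0) FROM paychecks "
        "WHERE employee = ? AND paid_on >= ? AND paid_on < ?",
        (employee, f"{year}-01-01", f"{year + 1}-01-01"),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    print(yearly_total_cents(conn, "alice", 2013))
```

      The database does the heavy lifting, so the speed of the glue language hardly matters here.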

      Oh - and I agree that it would be great if compilers could take better advantage of SIMD operations. Their use is "situational" though and seems a tough job to predict when they should be used. It seems to me that a JIT may be better positioned to make use of them though given the run-time analysis that can be performed. I'm not a compiler guy though so it's very likely I'm wrong. :-)

      --
      "Ignorance more frequently begets confidence than does knowledge"

      - Charles Darwin
    19. Re: Java Java! by tolkienfan · · Score: 1

      I completely agree. In scientific computing, though, vector math is ubiquitous. So I'd almost discount current Java implementations from scientific computing. At least for the "hot spots".

  24. Rather than learning a language... by Anonymous Coward · · Score: 1

    I would recommend learning what a programming language is. Especially if you have the time. Personally I spent a lot of time learning languages and not really seeing the abstraction that every programming language adheres to, making learning a new language difficult and time consuming. I can only really describe it as trying to learn a language rather than learning linguistics. All computer languages share common patterns all based on formalism, just like all spoken languages share common patterns. Learning formalism makes picking up new programming languages much easier, since you'll not only be able to identify patterns shared between them faster, but also pick up the lexicon to communicate well-formed questions to other programmers. I'd recommend reading Structure and Interpretation of Computer Programs. There are other books that attempt to replicate what this does, but it really is great and I haven't seen other books get to the point of computer programming faster. It is based on LISP, which most people will never use, but it's deceptively easy to read and understand, so getting through the book for someone that hasn't used LISP before shouldn't be a problem. Good luck!

  25. Python, or ... by Kiliani · · Score: 2

    First suggestion: Python. Lots of nice stuff for science (NumPy, SciPy), lots of other goodies, easy to learn, many people to ask or places to get help from. Plus you can explore data interactively ("Yes Wednesday, play with your data!").

    Beyond that: CERN uses a lot of Java (sorry folks, true), and they have good (and fast) tools. I am doing a project right now where I am using Jython, since it is supported by the main (Java) software I have to use. I like jhepwork/SCaVis quite a bit, if you are into plotting stuff on Java.

    If you have extra free time and want to learn how to program well? I'd learn something like Smalltalk (for OOP concepts) and/or Haskell (functional programming). Scientists are often lousy programmers because they often do not learn programming properly, and/or the language allows them to get away with bad programming (I know, every language allows bad programmers to write bad code, but some make it easier than others).

    So, stick with Python, it works really well, is modern, and has good support. Plus you can read your code in 5 years time ...

    What do I program in? Python (and Jython), Perl, C, IDL (yikes!), Smalltalk, Matlab, Mathematica. I know some Lisp, but that's just for fun. And whatever allows me to load sketches on an Arduino. I like Python (gets stuff done) and Smalltalk (works actually like I think - passing messages between objects).

    Use whatever works and you don't hate :-)

    --
    Do your own thing. And overdo it!
    1. Re:Python, or ... by SerenelyHotPest · · Score: 1

      If you have extra free time and want to learn how to program well? I'd learn something like Smalltalk (for OOP concepts) and/or Haskell (functional programming). Scientists are often lousy programmers because they often do not learn programming properly, and/or the language allows them to get away with bad programming (I know, every language allows bad programmers to write bad code, but some make it easier than others).

      This is extremely well-intentioned advice and very correct in big-picture terms but not at all appropriate for cold-starting the ability to read and write simple programs for numerical scientific work. If the OP intends to become a programmer specializing in scientific computing and must manage a substantial code-base, this will prove good advice to take as some tertiary measure; if his/her goal is to become proficient enough to write a few hundred lines of code for a cross-component patch in some specific scientific context, it's frankly a waste of time--and I say this as an apologist for learning both Smalltalk and Haskell (along with possibly Scheme) if you intend to call yourself a computer programmer and/or especially a computer scientist. I don't deny understanding several paradigms of programming is invaluable for any programmer; I'm just saying that if you treat your time as precious and programming as one small part of what you do, this isn't the best use of your time.

  26. R-language by biodata · · Score: 4, Informative

    Most of the cutting-edge data mining I've seen is done using R (which acts as a scripting wrapper for the C or Fortran code that the fast analysis libraries are coded in), or alternatively in Python. Some people swear by MatLab if they have trained in it (so your Octave would come in handy there). Have a look at some discussions at places like kaggle.com to see what the competitive machine learning community uses (if that is what you mean by data mining).

    --
    Korma: Good
    1. Re:R-language by green+is+the+enemy · · Score: 2, Insightful

      This is the correct advice: Use whatever language is most common in your research area, so you can benefit from the most existing source code. This will almost certainly be a high-level scripting language like R, MATLAB or Python, with the ability to drop down to C, FORTRAN and CUDA for the small parts of the code that need optimization. (In my case: electrical engineering = MATLAB + C and CUDA mex files)

    2. Re:R-language by T.E.D. · · Score: 1

      Dang, I had mod points this morning, but not now.

      This is in fact the Right Answer. Scientific Computing folks have started to cast about for Fortran alternatives in the last few years, and most of the talk I have heard has centered on R. So I'd go with a newer Fortran compiler (for God's sake, not 77 or earlier) and/or R.

      This is most definitely a field where you want to follow the herd, unless you happen to have an interest in esoteric CS things like code optimization, cross language binding generation, and mathematics library debugging, to go along with your interest in Scientific Computing. If you strike out on your own, that's the kind of stuff that will be consuming your time.

    3. Re:R-language by hyperfl0w · · Score: 1

      R is great until you look under the hood at the package implementations.
      "A good implementation beats a good design "

  27. Go (aka Golang) if you come from a C background by genghisjahn · · Score: 1

    http://golang.org/ You won't regret it.

    --
    Sorry about the mess.
    1. Re:Go (aka Golang) if you come from a C background by K.+S.+Kyosuke · · Score: 1

      Could use some vectorizing FP, but yeah, it's not a bad choice, especially if the complexity of mixed environments is undesirable. (Could also use some native port of netlib/GSL as well, though.)

      It might also make him a better practical software engineer, which, as I understand, is an area in which many numerics people...experience certain difficulties.

      --
      Ezekiel 23:20
  28. Profile by Arker · · Score: 5, Insightful

    A lot of people will propose a language because it is their favorite. Others because they believe it is very easy to learn. I will give you a third line of thought.

    I would not look for a language in this case, I would look for a library, then teach myself whatever language is easiest/quickest to access it. I would try to profile what you are building, figure out where the bottlenecks are likely to be (profiling your existing mockup can help here but don't trust it entirely) and try to find the best stable, well-designed, high-performance library for that particular type of code.
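
    As an illustration of the profiling step, here is a minimal sketch in Python using the standard cProfile and pstats modules. The functions load_points, slow_distance_sum, and pipeline are made-up stand-ins for a real workload; the profiler report is what tells you which of them deserves a fast library.

```python
import cProfile
import io
import pstats


def load_points(n):
    # Hypothetical cheap setup step.
    return [(float(i), float(i % 7)) for i in range(n)]


def slow_distance_sum(points):
    # Deliberately naive O(n^2) loop standing in for the real bottleneck.
    total = 0.0
    for x1, y1 in points:
        for x2, y2 in points:
            total += ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5
    return total


def pipeline(n=200):
    return slow_distance_sum(load_points(n))


def profile_report(func, top=5):
    """Run func() under cProfile and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_report(pipeline))
```

    In the report, nearly all of the cumulative time lands in slow_distance_sum, so that is the routine to replace with an optimized library call; load_points is not worth touching.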

    --
    =-=-=-=-=-=-=-=-=-=-=-=-=-=-
    Friends don't let friends enable ecmascript.
    1. Re:Profile by jythie · · Score: 1

      I am not sure how much that helps since unless the person is doing something very specific, chances are it will just shift the problem into 'which library is best' debate, which will again mostly involve people suggesting libraries they like or because they believe they are easy to learn.

    2. Re:Profile by Anonymous Coward · · Score: 1

      Python is recommended because it has two very fast and powerful computational libraries written by scientists for number crunching. People analysing vast amounts of data do not want to be fucking around with bottlenecks, they want to get on with their jobs.

    3. Re:Profile by CQDX · · Score: 1

      Unless you are doing something completely novel, don't reinvent the wheel. Look for existing libraries that are commonly used in your area of research. Then your language choice will be narrowed down to possibly only one. If by chance this library doesn't have the performance you need, it will be easier to tune it for speed as compared to starting from scratch. Unless you are in CS, grad school is not where you want to embark on a huge software project if the fundamental coding has already been done by others. You could easily "waste" 1-2 years writing and debugging code without producing anything publishable. BTDT.

    4. Re:Profile by Arker · · Score: 1

      "Unless you are doing something completely novel, don't reinvent the wheel."

      Exactly why I said I would think about looking for the library first instead of the language.

      "If by chance this library doesn't have the performance you need, it will be easier to tune it for speed as compared to starting from scratch."

      False dichotomy. Performance tuning is not something you just pick up overnight like a new language can be, it's serious stuff. If you have to do that yourself you might as well do the whole job and forget about the crappy library. Find one that is already optimized for the tasks that your app will spend most of its time and resources on, then use the most convenient glue to make it work.

      --
      =-=-=-=-=-=-=-=-=-=-=-=-=-=-
      Friends don't let friends enable ecmascript.
  29. Hadoop? by bjzadwor · · Score: 1

    If you are doing a computationally intensive data mining problem, have you considered porting to a Hadoop solution? You may need to rewrite your code, or you may be able to use Hadoop to call your current functions. You could use an AWS Hadoop cluster; Amazon often gives free credits to students, it may cost you nothing out of pocket, and help you learn a hot new technology.

  30. Fortran by Anonymous Coward · · Score: 0

    Recent versions of Fortran are very advanced, a lot easier to use than Fortran 77 and still extremely fast.

    Some new features since 77: structured programming, array programming, modular programming and generic programming (Fortran 90), high performance Fortran (Fortran 95), object-oriented programming (Fortran 2003) and concurrent programming (Fortran 2008).

    Free compilers: GFortran and G95

  31. Speed incarnate by Impy+the+Impiuos+Imp · · Score: 2

    If you're using VBA in Excel, you can speed it up a ton by putting this at the beginning of your function:

    Application.Calculation = xlCalculationManual

    And restore it with ...Automatic at the end.

    Do this at the top level with a wrapper function whose only purpose is to disable and enable that, calling the real function in between.

    If you want a real speedup, I am available for part time work in C or C++.

    --
    (-1: Post disagrees with my already-settled worldview) is not a valid mod option.
    1. Re:Speed incarnate by Anonymous Coward · · Score: 0

      Excel has tiny limits on rows, though, doesn't it? Scientific data may be 100s of millions of rows.

  32. My favorite is CnH2n+1OH by nanospook · · Score: 3, Funny

    It takes all the work out of the computations..

    --
    Have you fscked your local propeller head today?
    1. Re:My favorite is CnH2n+1OH by Anonymous Coward · · Score: 0

      ... where n = 2

    2. Re:My favorite is CnH2n+1OH by Anonymous Coward · · Score: 0

      Insufficient OH

  33. Fortran 90+ with OpenMP or Python by dlenmn · · Score: 1

    If you really want to do heavy lifting, you can't beat Fortran. Just stay away from Fortran 77; it's a hot mess. Fortran 90 and later are much easier to use, and they're supported by the main compilers: gfortran and ifort.

    ifort is Intel's Fortran compiler. It's the fastest out there, and it runs on Windows and Linux. Furthermore, you can get it as a free download for some types of academic use. (Search around Intel's website -- it's hard to find.) That said, I usually use gfortran -- which is free and open source -- on Linux. See http://www.polyhedron.com/compare0html for a compiler comparison.

    If you use Fortran, it's very easy to use OpenMP to do multiprocessing and make use of all those cores. OpenMP is supported by the main compilers.

    If you're doing lighter work, SciPy/NumPy works fine; I use it a fair amount if maximum performance isn't essential. However, I can't speak to its multiprocessing ability.
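
    On the Python side, one common way to use several cores is the standard library's multiprocessing module. A minimal sketch, assuming the work splits into independent chunks (the sum-of-squares workload and all function names are placeholders, not anything from the original poster's project):

```python
from multiprocessing import Pool


def split_into_chunks(data, n_chunks):
    """Split a list into n_chunks contiguous pieces of near-equal size."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """CPU-bound work done independently on each chunk (placeholder workload)."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    """Farm the chunks out to worker processes and combine the partial results."""
    chunks = split_into_chunks(data, workers)
    with Pool(processes=workers) as pool:
        partials = pool.map(sum_of_squares, chunks)
    return sum(partials)
```

    On Windows, call parallel_sum_of_squares from under an `if __name__ == "__main__":` guard, because worker processes re-import the script there. Whether this beats a single process depends on how much work each chunk holds compared with the cost of shipping the data to the workers.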

    1. Re:Fortran 90+ with OpenMP or Python by Eunuchswear · · Score: 2

      Fortran 77 is for weenies. Real men program in FORTRAN 66.

      --
      Watch this Heartland Institute video
    2. Re:Fortran 90+ with OpenMP or Python by Anonymous Coward · · Score: 0

      Fortran 77 is for weenies. Real men program in FORTRAN 66.

      FORTRAN IV FTW

    3. Re:Fortran 90+ with OpenMP or Python by Eunuchswear · · Score: 1

      Geez, why did I get "funny" for that? I was hoping for informative or insightful.

      --
      Watch this Heartland Institute video
  34. FORTRAN by Anonymous Coward · · Score: 0

    For scientific stuff, FORTRAN is still the best. Simple, old things are very often the best things around. C++ is in many ways a regression, especially all the C-style stuff you can find in the average C++ program.

    As soon as you need to process massive data sets or run massive simulations, all the Script languages won't cut it any longer, so you either go Fortran or C++. So, again, Fortran.

    Before you C++ kids want to tell me something, read up on that Mr Kuck and his optimizers. Fortran optimizers did things about 20 years ago which C++ optimizers still cannot do.

    Finally, there are tons of Fortran libraries already available for all kinds of science and engineering problems.

  35. Why code, when you can use a workflow tool? by Grantbridge · · Score: 1

    Use KNIME and you can probably do 90% of what you want by dragging and dropping new nodes and joining them up. KNIME does all the complicated memory caching for large filesets for you, and you can write your own Java functions to plug into it if you need something special.

    1. Re:Why code, when you can use a workflow tool? by Anonymous Coward · · Score: 0

      +1

      I advise our grad students to take the output of our pipelines, and play with it in KNIME.

      It also interfaces well with R, Java, and Perl.

  36. Depends by Enry · · Score: 1

    R, MATLAB, SAS, Python, there's a bunch of languages you can use, and a bunch of ways to store the data (RDBMS, NOSQL, Hadoop, etc.). It really comes down to what kind of access to the data you have, how it's presented, what other resources you have available to you, and what you want to do with it.

  37. Fortran + Python = F2PY by n1ywb · · Score: 4, Informative

    Better yet, Fortran + Python.

    http://docs.scipy.org/doc/numpy/user/c-info.python-as-glue.html#f2py

    I used it to wrap some crazy magnetometer processing code written in Fortran into a nice Python program. I ripped out all the I/O from the Fortran code and moved it into the Python layer. It worked great. Fortran is AWESOME at number crunching but SUCKS ASS at IO or well pretty much anything else, hence Python.

    --
    -73, de n1ywb
    www.n1ywb.com
    1. Re:Fortran + Python = F2PY by HiThere · · Score: 1

      How long is that expected to last? Last I checked F2PY didn't work with Python3. I can't remember whether it worked with versions of Fortran past Fortran95. Python2 is still being supported, but it's probably only got a couple of years before it's mothballed. (I think the original promise was 5 years of support after Python3 was released.)

      OTOH, perhaps Python2 will be more durable than expected. Or perhaps F2PY supports...when I checked just now I got a bunch of people reporting errors in using F2PY with Python3, but many of them were over a year old, so maybe it's working now.

      OTOH, a modern Fortran is actually pretty good just by itself if you don't need bunches of external libraries. And as for I/O...well, nothing is good at that, it's too general a requirement. Fortran77 is quite good at binary I/O, I haven't tried a recent Fortran. OTOH, I will agree that Fortran77 is rather poor at character I/O.

      One consideration that you might have is handling Unicode. The last time I checked (quite a while ago) Fortran (all of them) was as poor as C at dealing with Unicode, whereas Python handles it quite well. I don't know whether this has changed (or which version of Fortran you would be using, or whether it matters for your application).

      OTOH, support in dealing with Fortran is rather limited. If you don't know someone, you may have trouble finding support. Few people use it. As you said you haven't used it for a rather long time, this may be significant.

      The real answer is that it depends on details of your application. It depends on what you mean by a large amount of data. It depends on your computer. (Also, my recent experience is with Linux. You appear to be using MSWind, judging by your comment about VBA. I have no knowledge of how any Fortran works on an MSWind system, but I suspect that it's barely supported. Which would make Python a better choice. (You said you could load the system with either MSWind or Linux, but there's much to be said for familiarity.))

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.
    2. Re:Fortran + Python = F2PY by Anonymous Coward · · Score: 0

      Better yet, Fortran + Python.

      http://docs.scipy.org/doc/numpy/user/c-info.python-as-glue.html#f2py

      I used it to wrap some crazy magnetometer processing code written in Fortran into a nice Python program. I ripped out all the I/O from the Fortran code and moved it into the Python layer. It worked great. Fortran is AWESOME at number crunching but SUCKS ASS at IO or well pretty much anything else, hence Python.

      Agreed: Fortran for the numbers and Python for the wrapper. If you're not used to Python, you could use C, which you already know. Don't fix what isn't broken.

    3. Re:Fortran + Python = F2PY by Anonymous Coward · · Score: 0

      2nd Fortran+Python. I have used fortran all my career (30 years) for hard core, fast applications (from PC to supercomputer) but in recent years have been mostly using Python for day-to-day calcs and as a front end for my fortran code. And using cython we have got near fortran performance for lightly modded python code.

      But I'd also 2nd another earlier comment ... if you are rusty on programming you will need to talk with colleagues about problems you have so there's lots of advantage of going with the flow and using whatever other people in your group are using.

    4. Re:Fortran + Python = F2PY by occasional_dabbler · · Score: 1

      F2Py is so insanely useful, if it falls out of support I'll take it up myself.

      --
      "Our opponent is an alien starship packed with atomic bombs," I said. "we have a protractor"
    5. Re:Fortran + Python = F2PY by mvdw · · Score: 2

      OTOH, exactly how many hands do you have??

    6. Re:Fortran + Python = F2PY by mjwalshe · · Score: 1

      Oh come on, FORTRAN I/O isn't that hard; you just have to suck it up and learn to use the FORMAT statement. Hell, I worked on a billing system that was 70% FORTRAN 77.

    7. Re:Fortran + Python = F2PY by plopez · · Score: 1

      Nah. Fortran 2003 and 2008 pretty much obviate a need for a language like Python

      --
      putting the 'B' in LGBTQ+
    8. Re:Fortran + Python = F2PY by n1ywb · · Score: 1

      Fortran is widely used in science (speaking as a scientific developer here) and perfectly supported on Windows.

      http://software.intel.com/en-us/fortran-compilers

      I love the "automatically parallelize" checkbox in the Intel Fortran settings. Take that, C et al.

      --
      -73, de n1ywb
      www.n1ywb.com
    9. Re:Fortran + Python = F2PY by n1ywb · · Score: 1

      Maybe it's just bad developers, but any software that reads config data based on fixed line numbers is crap IMNSHO, and just about every Fortran program I've worked with worked that way. Its std lib is jonesing for an .INI parser or something.

      --
      -73, de n1ywb
      www.n1ywb.com
    10. Re:Fortran + Python = F2PY by mjwalshe · · Score: 1
  38. Depends... On the Data... by tiberus · · Score: 1

    Well, it depends. You say "computationally intensive data mining problem" but what kind of computations (arithmetic, mathematical, text-based, etc.)?

    In general, for flat-out speed, toss interpreted languages (Perl, Python, Java, etc.) out the door. You'll want something that compiles to machine code, esp. if you are running on older hardware. Crunching numbers, complex math, matrices: then Fortran is the beast. If your data is arranged in lists, consider Lisp, then pick something else as it will likely give you a migraine. The format of your data and what you need to do with it will drive your language choice.

    Is finding a partner an option? Seems you should be able to work with someone from CS who needs a coding project...

  39. Python or R by Anonymous Coward · · Score: 1

    I work in the industry (all our customers are scientists), and the two languages that seem to be predominant are R and Python. R has lots of cool stuff specifically for advanced number crunching, while Python is more the swiss army knife that can be used to tackle anything. I don't think you can go wrong with either, but Python will probably be more friendly (eg. it has way more books on it than R) and will serve you better in non-scientific enterprises.

  40. Python, numpy, Pyvot by shutdown+-p+now · · Score: 4, Informative

    Since you mention VBA, I suspect that your data is in Excel spreadsheets? If you want to try to speed this up with minimum effort, then consider using Python with Pyvot to access the data, and then numpy/scipy/pandas to do whatever processing you need. This should give you a significant perf boost without the need to significantly rearchitect everything or change your workflow much.

    In addition, using Python this way gives you the ability to use IPython to work with your data in interactive mode - it's kinda like a scientific Python REPL, with graphing etc.

    If you want an IDE that can connect all these together, try Python Tools for Visual Studio. This will give you a good general IDE experience (editing with code completion, debugging, profiling etc), and also comes with an integrated IPython console. This way you can write your code in the full-fledged code editor, and then quickly send select pieces of it to the REPL for evaluation, to test it as you write it.

    (Full disclosure: I am a developer on the PTVS team)

  41. If you care about the answer... by mpmansell · · Score: 0

    then you should care about the code, as well. Choice of language can have a lot of consequences for accuracy, and floating-point rounding errors need to be accounted for; these may differ per language and per implementation version of each language.
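
    A minimal sketch of the kind of rounding issue meant here, using only Python's standard library (naive_sum is a made-up helper for illustration; math.fsum and math.isclose are real stdlib functions):

```python
import math


def naive_sum(values):
    """Accumulate left to right, rounding after every single addition."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [0.1] * 10
naive_total = naive_sum(values)      # rounding error builds up step by step
careful_total = math.fsum(values)    # stdlib summation that tracks the lost bits

print(naive_total == 1.0)            # False
print(careful_total == 1.0)          # True
print(0.1 + 0.2 == 0.3)              # False: neither side is exactly representable
print(math.isclose(0.1 + 0.2, 0.3))  # True: compare with a tolerance instead
```

    The same arithmetic gives the same answers in any language that uses IEEE 754 doubles; what differs between languages and libraries is which summation and comparison tools they hand you by default.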

  42. matlab by smadasam · · Score: 3, Informative

    FORTRAN used to be it back in the day, but nowadays Matlab is the stuff that many engineers use for scientific computing. Many of the math libraries are very good in Matlab and don't require you to be a computer scientist to make them run fast. I used to work with scientists in my old lab to port their Matlab code to run on HPC clusters, porting them to FORTRAN or C. Often the Matlab libraries smoked the BLAS/Atlas packages that you find on Linux/UNIX machines, for instance. The same would hold true for Octave since they just build on the standard GNU math packages like BLAS.

    1. Re:Matlab by Endloser · · Score: 1

      You not having written code in 12 years makes me second this suggestion.
      Matlab is in wide use in universities and research facilities around the world.

      Remember you care about the result not the code.
      Matlab will take advantage of your system in ways that you only could had you been programming the past 12 years and not needed to ask this question.

    2. Re:Matlab by coolsnowmen · · Score: 1

      matlab has an open source + free version called octave if you just want to learn the language. You might need to purchase matlab if you want one of their specialized "tool boxes".

      I used matlab for about 7 years.

    3. Re:Matlab by Anonymous Coward · · Score: 0

      No Matlab. Not portable, not open, and it perpetuates a vendor lock-in for quantitative scientists/engineers every bit as bad and destructive as the stranglehold Windows has enjoyed on the desktop for decades.

      Python is more readable, more enjoyable to code, has equivalent IDEs available (Spyder), far more user-friendly features, you can use your code literally anywhere you go without worrying about a Matlab license, and the SciPy Stack has reached functional feature parity with Matlab (and is evolving well beyond in certain areas).

      Furthermore, indented blocks force good code practices! iPython lets you have a Python workbook! No more semicolons! Multiple function declarations in a true package, instead of dumping loads of files in a working directory!

      iPython is leagues ahead of Matlab in terms of scripting, and Python programming is miles ahead of Matlab.

      Never recommend Matlab to anyone. The future is open. The future is Python.

    4. Re:MATLAB by burdickjp · · Score: 2

      It's also not free, under any definition, and proprietary, meaning you're making your development dependent on the availability of MATLAB as a resource. Learning a free language, such as Python, would free you from the cost and availability restraint, and mean you are learning a more general, and thus more useful, language.

    5. Re:Matlab by Anonymous Coward · · Score: 0

      +1

    6. Re:Matlab by Anonymous Coward · · Score: 0

      As someone who must deal with a lot of matlab code, don't. It looks nice and the math syntax is great. Everything else in the language is a pile of shit. From pass by value (and only pass by value) to magic "handle" classes and tacked on OO support, it's not a language worth doing anything but calling some flavor of `plot` in.

    7. Re:Matlab by theEnguneer · · Score: 2

      Matlab is great for testing out ideas, but it is slow compared to Python or C. Also, for doing Data Mining, Matlab is a poor choice because the whole point of Matlab is to make it so the user doesn't have to worry about variable/data storage, which is the thing that data miners need to optimize.

    8. Re:Matlab by Anonymous Coward · · Score: 0

      Don't use a programming language. Use a tool like Matlab or Mathematica instead. These tools are well designed for scientific computing and have sufficient scripting built in to support the programming-language-like functionality you're probably looking for.

      You won't be able to call yourself a programmer. But you're not a programmer, you're a scientist.

      Ditto. Matlab is a dream. You get to focus on your work instead of on programming. And a standard technique is to prototype in Matlab, profile the code, and substitute compiled FORTRAN for the slow parts. (Ditto above comments about modern FORTRAN.)

    9. Re:matlab by J.+J.+Ramsey · · Score: 1

      The catch with Matlab is that outside of academia, it is expensive. If one's workplace has the licenses for it, great. If not, it may be better to make do with Octave, or Numpy & Scipy.

    10. Re:matlab by hyperfl0w · · Score: 1

      Matlab user here: it is EXPENSIVE.
      Leaving your company means leaving your language.

      One should not have to leave one's language behind just because one leaves one's people.

    11. Re:Matlab by hyperfl0w · · Score: 1

      I did this! New company doesn't want to pay for matlab. :(

  43. Same language as your piers by willy_me · · Score: 1

    If you want to be able to ask someone for help then it would be best to use the same tools they use. The point is that any programming language will work. Some languages are easier than others but the difference is negligible compared to the advantage of being able to ask your piers for assistance.

    1. Re:Same language as your piers by the+eric+conspiracy · · Score: 1

      Wooden posts stuck into the ground don't use programming languages.

    2. Re:Same language as your piers by WillAffleckUW · · Score: 1

      Maybe they're using metallic or plastic piers?

      We used to run S on our GRID computers to figure out proper pier placement, after a discussion with our peers.

      --
      -- Tigger warning: This post may contain tiggers! --
    3. Re:Same language as your piers by rk · · Score: 1

      I have worked at places that disprove your statement.

  44. Try J.. by DavidHumus · · Score: 1

    ...at jsoftware.com .

    It's more powerful, concise, and consistent than most languages. However, R and Matlab have larger user communities and this is an important consideration.

    There was a note on the J-forum a few months ago from an astronomer who uses J to "...compute photoionization models of planetary nebulae." His code to do this is about 500 lines in about 30 modules and uses some multi-dimensional datasets, including a four-dimensional one of "...2D grids of the collisional cooling by each of 16 ions".

    However, the point of his note was that he ported this code to his iPhone - and it works! Consider, too, that porting consists mainly of copying some text and data files - there would be few or no code changes.

  45. do you have a budget? by Anonymous Coward · · Score: 0

    you haven't given a lot of info on specifics of what you're trying to do, but i'm assuming something like crunching through tables of data with possible aggregations, filtering and sorting with possibly a few custom calculations based on the raw data. kind of stuff you can do in excel on a small scale.

    so my first question is have you looked at excel 2010 or 2013? if not they're much better at bigger data than previous versions. but excel does have its limits....

    if you have a budget for commercial software, then something like matlab might work. it is uber fast, can handle multiple cores/64-bit and is extremely well documented on their website with copious examples and documentation. the pace of updates at 2x per year is also good with steady incremental improvements.

    if you have no budget then python+matplotlib+ipython+pandas is an excellent combo. it's what i use. free and productive once you learn the ropes. and you spend minimal time on learning a programming environment, etc. if you can do VBA, you can definitely do python. and with pandas it can be quite fast.

    as a final thought, if you're really just doing data mining and have the data somewhere like in a database, you might want to consider some of the newer tools like tableau. no/minimal programming required to do some pretty nice analysis and it's dead simple to play around with new ways of looking at the data.

  46. the language is probably not the issue by kbdd · · Score: 0

    Nowadays, most languages can be pretty efficient. Your algorithms may be where the problem lies. Most any language can run most algorithms efficiently, but achieving that may not be easy.

    No language will give you a magic speed boost if you do not understand how it processes the numbers and data structures.

    My recommendation is probably not what you want to hear: pick a language that you are comfortable with and study it so that you know how to write efficient code with it.
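
    A small illustration of the point that the algorithm usually matters more than the language, assuming a made-up task (checking a list for duplicates); both function names are hypothetical:

```python
def has_duplicates_quadratic(items):
    """Compare every pair of items: about n*n/2 comparisons for n items."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items):
    """Remember what has been seen in a set: one pass, O(n) expected time."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

    Both give the same answer, but on a million rows the first does on the order of 10^11 comparisons and the second about 10^6 set lookups. Porting the first version to a faster language would not close that gap.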

  47. C/C++ by ericcc65 · · Score: 3, Interesting

    I'm a MSEE and I've been working in the digital signal processing realm for the last 10 years since graduating. I should mention that I haven't done a lot of low level hardware work, I haven't programmed actual DSP cards or played with CUDA. I have written software that did real-time signal processing just on a GPU. Everyone in my industry at this point uses C or C++. There is some legacy FORTRAN, and I shudder when I have to read it. Some old types swear by it, but it's fallen out of favor mostly just because it's antiquated and most people know C/C++ and libraries are available for it.

    For non-real-time prototypes I'd recommend learning python (scipy, numpy, matplotlib). Perhaps octave and/or Matlab would be useful as well.

    At some point you have to decide what your strength will be. I love learning about CS and try to improve my coding skills, but it's just not my strength. I'm hired because of my DSP knowledge, and I need to be able to program well enough to translate algorithms to programs. If you really want to squeeze out performance then you'll probably want to learn CUDA, assembly, AVX/SSE, and DSP specific C programming. But I haven't delved to that level because, honestly, we have a somewhat different set of people at the company that are really good in those realms.

    Of course, it would be great if I could know everything. But at the moment it's been good enough to know C/C++ for most of our real time signal processing. If something is taking a really long time, we might look at implementing a vectorized version. I would like to learn CUDA for when I get a platform that has GPUs but part of me wonders if it's worth it. The reason C/C++ has been enough so far is that compilers are getting so good that you really have to know what you're doing in assembly to beat them. Casual assembly knowledge probably won't help. I might be wrong, but I envision that being the case in the not too distant future with GPUs and parallel programming.

    1. Re:C/C++ by ericcc65 · · Score: 1

      Edit: GPU in the first paragraph should be GPP, general purpose processor.

    2. Re:C/C++ by Anonymous Coward · · Score: 0

      There is some legacy FORTRAN, and I shudder when I have to read it. Some old types swear by it, but it's fallen out of favor mostly just because it's antiquated and most people know C/C++ and libraries are available for it.

      Have you heard of FORTRAN 2008? 2003? 95? 90? I can only think that Fortran is so advanced that most cannot get that it didn't stop at F77.

  48. openCL, MATLAB/octave, Python by Anonymous Coward · · Score: 0

    If you really need fast number crunching and have a highly-parallelizable problem, consider openCL (or CUDA/directCompute, but those are less generic). There is a bit of a learning curve, but the results are worthwhile for these types of problems.

    For powerful expressiveness with large datasets, I like MATLAB/Octave. For universal-ness that's easy to learn and easy for others to understand, go with Python - it's very common for certain types of simulations and models and I can understand why.

    Do not even consider using Java - you will regret it.

  49. Quick suggestion... by MiniMike · · Score: 2

    Do you have access to MATLAB or a similar analysis tool? Many universities have licenses, and overall it seems like it might be a good choice for you. These programs usually have a lot of built-in functionality that will be difficult to reproduce if you are not an experienced scientific programmer.

    I haven't done ANY programming in about 12 years, so it would almost be like starting from scratch.

    This is probably a bigger problem than choosing which language to use. If you don't know how to program properly and efficiently, it doesn't matter which language you choose. If you go this route I'd suggest taking a course to refresh or upgrade your skills. Since you're familiar with C that might be a good language to focus on in the course. Another factor is if you have to work with any existing libraries it might limit your choices. I program in C, FORTRAN, and VB and find that for computationally intensive programs C is usually the best fit, sometimes FORTRAN, and never VB.

    1. Re:Quick suggestion... by Anonymous Coward · · Score: 1

      NO.

      No Matlab. Not portable, not open, and it perpetuates a vendor lock-in for quantitative scientists/engineers every bit as bad and destructive as the stranglehold Windows has enjoyed on the desktop for decades.

      Python is more readable, more enjoyable to code, has equivalent IDEs available (Spyder), far more user-friendly features, you can use your code literally anywhere you go without worrying about a Matlab license, and the SciPy Stack has reached functional feature parity with Matlab (and is evolving well beyond in certain areas).

      Furthermore, indented blocks force good code practices! iPython lets you have a Python workbook! No more semicolons! Multiple function declarations in a true package, instead of dumping loads of files in a working directory!

      iPython is leagues ahead of Matlab in terms of scripting, and Python programming is miles ahead of Matlab.

      Never recommend Matlab to anyone. The future is open. The future is Python.

    2. Re:Quick suggestion... by umafuckit · · Score: 2

      NO.

      No Matlab. Not portable, not open, and it perpetuates a vendor lock-in for quantitative scientists/engineers every bit as bad and destructive as the stranglehold Windows has enjoyed on the desktop for decades.

      I think you're over-stating things a touch. Some of the core stuff is closed source but most of the functions are open, meaning that they are readable .m scripts. e.g. if you're worried about how MATLAB implements ANOVA then you read the file and check. You can modify if needed. So MATLAB is open enough in most normal usage scenarios. You're not really locked in given that we have Octave.

      Python is more readable, more enjoyable to code, has equivalent IDEs available (Spyder), far more user-friendly features, you can use your code literally anywhere you go without worrying about a Matlab license, and the SciPy Stack has reached functional feature parity with Matlab (and is evolving well beyond in certain areas).

      I like Python and I've spent some time learning it recently and ported some of my MATLAB code. Python is not a panacea, however. For starters, there is no equivalent of the excellent MATLAB docs. For a newcomer with no programming experience, the entry barrier is definitely higher. Much more Googling needed to get stuff to work. The plots produced by Matplotlib are good but don't do everything. e.g. I found animating data was too slow in Matplotlib and I spent ages messing about with pyqtgraph. So to get the most out of it you have to screw about with different plotting packages and that can be very time consuming. In general, the syntax for matrix operations is a lot more elegant and economical in MATLAB than in Python/numpy. I also ran into issues where seemingly equivalent code would run substantially slower in Python than MATLAB. In many cases I was able to resolve the issue and surely to a degree it was due to me being a numpy beginner, but I do feel it's easier to get the most out of MATLAB than the most out of numpy. MATLAB has now become quite smart about helping the user to optimise code. Admittedly this might make the user a less careful programmer: I definitely learned things whilst trying to get Python code to run at the same speed as my original MATLAB code. So the process was useful. I'd like to use Python more in the future, but rabidly hating on MATLAB isn't fair. Finally, they're pretty different languages: Python is a general-purpose language which has been adapted to number crunching, whereas MATLAB was designed for number-crunching from the ground up. When you use these languages, their heritage shows.

    3. Re:Quick suggestion... by Anonymous Coward · · Score: 0

      Actually, MATLAB's openness is poison. They retain copyright over their code, but dangle it in front of you saying "hey check this out!"

      If you ever - EVER - look at a single one of their scripts for any reason, even by accident, you cannot legally contribute any like function to any open source project. For the rest of your life. If you do, perhaps unconsciously using some similar motifs, they will sue you personally and the project out of existence.

      This is how MATLAB perpetuates their stronghold, and it's terrible. It's actually worse than if they were entirely closed-source.

      I'll grant that by default, Python doesn't come with easy docs like `man `. However, using iPython solves this trivially.

  50. Use an interpreted language that calls C libraries by pigiron · · Score: 1

    newLISP is small and can easily call most c/c++ libraries, plus Java for graphics. HTML/XML are really just LISP S-expressions for all practical purposes. Throw in a little Unix/bash and you are there.

  51. VB by confused+one · · Score: 1

    Personally, I would do it in C unless you have Fortran libraries you want to use, then I'd use Fortran. However, if you have existing VBA code you want to leverage, I'd just use VB.Net, import the core parts of the code and run with it. There's a moderately steep learning curve going from VB6 or VBA to VB.Net; but, it'll be much less effort than learning a new language.

  52. R actually by Anonymous Coward · · Score: 0

    Perl is the second one, but if you actually mean real science, you need to learn R, or at least S, in addition to Perl.

    C is a good choice too, as is C++.

    We use those in real science.

  53. What is worth your time? by BruiserBlanton · · Score: 1

    It depends on what you are willing to deal with.

    Python is good if you don't need very heavy array code. I know you can use Python libraries that give you access to good arrays but I think of Python as a scripting language. It's good for a quick prototype as well, but for heavy computation, I would move on to a compiled language.

    Fortran 90 or Fortran 2003/08 is what will be most like the mathematical syntax you'll use. Despite what people may tell you, it is possible to write code that is understandable and reusable in Fortran, it just takes a great deal of understanding when you design the code. Most people have only seen Fortran code that was either hacked together or is so heavily optimized that it has been obfuscated.

    C++ is good as well but you'll spend more time figuring out how to express your mathematics and to use the arrays than you might find productive. In my group, we do the computer science parts of our codes in C++, but numeric calculations and heavy-duty array manipulation are done in Fortran.

    The thing about taking advantage of the multiple core machine is much deeper than simply choosing a language. There are MPI and OpenMP libraries that are very good for Fortran and C++. However, producing efficient code that is parallelizable requires changing and complicating the algorithm for a well understood and functioning serial code. Writing effective parallel code will take you much more time than picking up a programming language.

  54. details would be fantastic. by Anonymous Coward · · Score: 0

    How easily does your problem parallelize? How slow is too slow? Why, exactly, do you "want" to use all four cores?

    Has someone solved a variant of your problem before? Since you're doing data mining, the answer is most likely 'yes', in which case unless you're a masochist or have something to prove to yourself, you want to adapt what they've done. Hell, it's quite likely that there's a nice-enough implementation in a standard software package already (R comes to mind). A few hours spent mathematically/conceptually massaging your problem into a canonical form can save you days of programming, and will train you to make useful analogies too. It won't be optimized, but that shouldn't be your concern now. Days or weeks of coding to save a few hours or days is not a smart investment.

    It's really easy to get into a trap of focusing on coding, and frankly asking on Slashdot will probably lead you further that way. Sometimes you do need to focus on coding, but it should always be in the mindset of automating something you could (at least in principle) be doing by hand. For a quantitative researcher, programming is, itself, a subroutine, not an end in itself.

  55. Matlab by necro81 · · Score: 2

    If you are working in academia, then you probably have access to Matlab. Matlab, as a language, has both scripting abilities and programming abilities. The scripting was born from Matlab's roots in Unix, which makes it handy for batch processing lots of files. Its programming functions started off as C, but have since incorporated features from C++, Python, and Java. The programming side of it has, in my opinion, more structure and formalism than Python, but makes certain things like file IO and data visualization (i.e., graphing) easier than straight up C/C++. The basics of using it can be picked up in an afternoon, and the sky's the limit from there. There are lots of well-written and documented functions built in; specialized toolboxes can be had for additional fees. There's a fair bit of user-generated code out there. Plus, I expect you can find a lot of people around you who know plenty about it.

  56. Multi-threading by Anonymous Coward · · Score: 0

    For a free windows compiler, go with MinGW.
    On Linux, GCC is the standard compiler. You can also go with LLVM/Clang.
    All of these support the C++11 standard which in turn supports multi-threading out of the box.
    http://solarianprogrammer.com/2011/12/16/cpp-11-thread-tutorial/
    http://en.cppreference.com/w/cpp/thread
    http://cpprocks.com/wp-content/uploads/C++-concurrency-cheatsheet.pdf

    As for Octave, see here: http://stackoverflow.com/questions/11889118/get-gnu-octave-to-work-with-a-multicore-processor-multithreading

    If you want true horse power and are willing to work for it, invest in a compatible AMD or Nvidia card that works with OpenCL and spend some time learning that.
    http://opencl.codeplex.com/wikipage?title=OpenCL%20Tutorials%20-%201
    http://www.drdobbs.com/parallel/a-gentle-introduction-to-opencl/231002854
    http://enja.org/2010/07/13/adventures-in-opencl-part-1-getting-started/

  57. Sage + by jbolden · · Score: 0

    Let me second this one. Mathematica, Maple, Sage, Matlab / Octave... Mathematical languages are so nice for scientific computing because the languages have wonderful built in functions.

  58. Not enough information by already_read · · Score: 1

    The answer would really depend on the nature of the problem. If you are doing more statistics type processing then R is commonly used in academia. Python might be good in the short and medium term, but you will probably want to get acquainted with C++ if you are serious.

  59. Don't start over by Anonymous Coward · · Score: 0

    You're falling for one of the most common traps in programming. It doesn't do what I want, so I'm going to start over from scratch. You'll waste a lot of time doing things you've already done and debugged.

    So what you should do is keep the VB program. Identify the slow parts, most likely the innermost section of the innermost loop, and convert that to a C or Fortran module. Ideally use a "message passing" interface so the C or Fortran code can be multi-threaded while the VB portion stays more or less as is.

    Do just a little bit at a time, so you can see the actual progress. Choose between C or Fortran based on the availability of libraries that make your computation easier.
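
    A hedged sketch of the "identify the slow parts" step, shown in Python rather than VB purely for illustration. The standard-library `cProfile` and `pstats` modules report where the time goes, so only the hot inner loop needs porting. The functions `load_points`, `pairwise_distance_sum` and `run_analysis` are hypothetical stand-ins, not anything from the original poster's code.

```python
import cProfile
import io
import pstats


def load_points(n):
    """Hypothetical cheap setup step: build n one-dimensional points."""
    return [float(i % 97) for i in range(n)]


def pairwise_distance_sum(points):
    """Hypothetical hot spot: an O(n^2) double loop over all pairs."""
    total = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            total += abs(points[i] - points[j])
    return total


def run_analysis(n=300):
    """Whole job: cheap setup followed by the expensive loop."""
    points = load_points(n)
    return pairwise_distance_sum(points)


def profile_report(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)  # show the top 5 entries
    return result, stream.getvalue()


if __name__ == "__main__":
    result, report = profile_report(run_analysis, 300)
    print(report)
```

    In a report like this the double loop should dominate the cumulative time, which is the cue that it is the one piece worth rewriting in C or Fortran.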

  60. Fortran by quietwalker · · Score: 1

    I worked as a sysadmin for a high energy physics group at the Beckman Center. Day and night, it was Fortran, on big whopping clusters, doing monte carlo simulations.

    Though it ~was~ many years ago.

    Elsewhere, I worked for a company doing datamining on massive datasets, over a terabyte of data back in 2000, per customer, with multiple customers and daily runs on 1-5 gig subsets. We used C + big math/vector/matrix libs for the processing because nothing else could come close, and Perl or Java for the data management; preprocessing, set creation and munging (like attempting to correct spelling mistakes, parsing date strings into a standard format, normalizing data against a standard metric, applying expert system filters, even actual machine analysis like clustering or shape detection, which to us was still just preprocessing).

  61. Matlab by Spazmania · · Score: 1

    Don't use a programming language. Use a tool like Matlab or Mathematica instead. These tools are well designed for scientific computing and have sufficient scripting built in to support the programming-language-like functionality you're probably looking for.

    You won't be able to call yourself a programmer. But you're not a programmer, you're a scientist.

    --
    Moderating "-1, Disagree" is simple censorship. Have the guts to post your opinion.
  62. You want to use all your cores? Stick to FP! by Anonymous Coward · · Score: 1

    Stick to 'functional' flavored languages, especially ones that are geared toward making it particularly easy to compose concurrent code and that feature immutability of variables as a default, as that's particularly important when you've got a lot of working parts running in parallel. We've been pulling away from the higher-level languages reimplementing most of Lisp poorly toward languages that more and more resemble the ML family with each year that goes past. Haskell is the current darling of that group, though I'd not suggest a lazily evaluated language to a beginner as it's particularly difficult to reason about compared to the more typical eager evaluation. Still, you can try taking a crack at Learn You a Haskell if you want to see a brief glimpse at one of the most interesting languages floating around at the moment, especially since there's ample resources available for study compared to say, Standard ML, my personal favorite.

    I would keep my eyes out on Rust in particular, which has taken a few of the better ideas from ML (pattern matching, type inference) and has tried to graft it to a more pragmatic set of trade-offs, such as pushing garbage collection from a required performance hindrance to a per-task elective option and has focused on strong interoperability with C and its calling conventions. It also was a day-one design decision to focus on parallelism and concurrency, which is part of why I cannot directly suggest OCaml at the moment, as that is its biggest weakness in otherwise a very robust and well-established member of the ML family.

    I would avoid Go for myriad reasons, unless you want to cover a third of your source-code with explicit error handling or just silently discard them, suffer through the lack of generics (not everything has to be shit with generics like Java and C++, guys, c'mon) or just really enjoy null errors and weird versioning issues that crop up during project development because the Go build system is a bit too simplistic and their solution is to just pull all your dependencies into your own tree to avoid the issue.

  63. R, Perl, some C by digitalhermit · · Score: 2

    I run lots of statistical analyses. Most of the code is in R with some wrappers in Perl and some specific libraries in C. The R and Perl code is pretty much all my own. The C is almost entirely open source software with very minor changes to specify different libraries (I'm experimenting with some GPU computing code from NVidia). Most of the people who are doing similar things are using Python with R (or more specifically, the people I know who are doing the same thing are using Python/R).

    An average run with a given data set takes approximately 20 minutes to complete on an 8-core AMD 8160. About 80% of the run is multi-threaded and all cores are pegged. The last bit is constrained mainly by network and disk speed.

    You may consider using something like Java/Hadoop depending on your data and compute requirements. Though my Java code is just a step above the level of a grunting walrus, I've found that the performance is actually not that bad and can be pretty good in some cases.

     

  64. Language is the start, learn the frameworks! by Anonymous Coward · · Score: 0

    A programming language is just the start, which can be mastered in relatively small time (weeks). When you've completed this, start looking for third party packages.
    Most of the building blocks have been created and are available on the web. To you the task to use them to build your application.

    For example: people say Python, but only because that has numpy and scipy frameworks that enable you to do a lot of numerical stuff.

  65. Let's cut through the BS and get serious by satan666 · · Score: 1

    As others have pointed out, if you are going to do serious research,
    you don't want to mess around with the latest and greatest.
    Stick with something that has been around and works.
    COBOL is that language!

    Ha! Made you smile :P

    Seriously tho, use C or FORTRAN. Nothing beats them for optimization and speed.
    Cheers!

  66. python by memaartje · · Score: 1

    it's easy to learn, it's fast, it's suitable for almost any task because there are many ready to use libraries out there. thus python. btw I learned python in three hours! see github.com/masikh for my work.

  67. I recommend C#.NET by dryriver · · Score: 1, Informative

    Not only is C# easy to learn, and easy to both read and write, it also runs at a fairly high speed when it is compiled. To make use of multiple CPU cores, C# has a neat feature named Parallel.For. If your algorithm scans across a 2D data array using a for loop at all, Parallel.For will automatically split the loop's iterations into ranges and have each range run on a different CPU core, resulting in a much faster overall computation speed as long as the iterations are independent of each other. I develop algorithms in C# and highly recommend it if you want a) a nice, readable code syntax and b) fast execution speed. I hope this helps...

    --
    Why did the chicken cross the road? Because Elon Musk put an AI chip in its head.
    1. Re:I recommend C#.NET by carnaby_fudge · · Score: 0

      I second this recommendation. With Visual Studio Express 2012 you'll find it's very straightforward to get up and running.

  68. Re:Depends... On the Data... by K.+S.+Kyosuke · · Score: 1

    If you're data is arranged in lists, consider lisp,

    Oh please! It's not like Lisp doesn't have any other data structure, is it? You can have your multidimensional numerical arrays in CL quite easily. (I'm saying neither "use CL" nor "don't use CL", merely that your argument is pretty weak. It's easier to learn to work with lists in the language you already know (unless it's COBOL!) than to learn an entirely different one just because of lists.)

    --
    Ezekiel 23:20
  69. There is no right answer... by Anonymous Coward · · Score: 0

    The correct answer is that it depends on your algorithm, and what bits of it you need to modify. If it's something that can mostly be coded in terms of existing algorithms, then find a library that implements those, learn the appropriate language enough to modify it, and carry on. The best thing to do may be to ask other people in your group/research field to see what tools they use.

    If you're looking for existing libraries for data-mining, I suspect the answer is unlikely to be Fortran, even though that's probably the best language for scientific computing in general, followed closely by C++ (which is *very* hard to learn). The answer may be Python or even MatLab, if it has the appropriate data-mining tools available.

    To use all four cores, if you're very lucky, your data mining might be easy to parallelise, in that you can give separate pieces of the data to each core and use them that way. Otherwise, unless you're using an existing library, you will not be able to write a parallel code in a few weeks.
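
    A minimal sketch of the "easy to parallelise" case described above, assuming each record can be processed independently. It uses Python's standard `multiprocessing.Pool`; the helper names (`split_into_chunks`, `count_matches`, `parallel_count`) and the divisible-by-7 test are made up for illustration.

```python
from multiprocessing import Pool


def split_into_chunks(data, n_chunks):
    """Split a list into n_chunks contiguous pieces of near-equal size."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def count_matches(chunk):
    """Hypothetical per-record work: count values divisible by 7."""
    return sum(1 for value in chunk if value % 7 == 0)


def parallel_count(data, n_workers=4):
    """Give each worker process one chunk, then combine the partial counts."""
    chunks = split_into_chunks(data, n_workers)
    with Pool(processes=n_workers) as pool:
        partial_counts = pool.map(count_matches, chunks)
    return sum(partial_counts)


if __name__ == "__main__":
    records = list(range(100_000))
    print(parallel_count(records))
```

    This only works this simply because no chunk needs results from another chunk. Once the pieces have to talk to each other, the warning above about not writing parallel code in a few weeks applies.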

    However, you said "I need a language I can pick up in a few weeks so I can get back to my research. I am not a CS major, so I care more about the answer than the code itself."

    You don't say what your course/subject is. If it's anything that requires computing a lot, I'm sorry to have to say this, but programming is part of your research. If you're experimenting with algorithms and modifying them, you're in the realms of scientific computing, and you will have to bite the bullet and learn a decent programming language (Fortran or C++ probably), and you will end up using it. On the other hand, the effort should pay off, since you will become more productive, be able to experiment with new algorithms more easily, etc.

    Also, most computational scientists are not CS majors, they are probably mathematicians, physicists, or something similar initially, who realise they need to learn to program, and have the mindset and ability to do so. You should care about the code just as much as the answer. What happens when your paper is submitted to a journal and you need to make corrections? You will need to dig out the code, make modifications to it, debug it, and rerun it, 6 months after you thought you'd seen the back of it. If you haven't spent time making the code easy to read in the first place, you will come unstuck at this point.

    Welcome to Scientific Computing. It's not the same as computer science, but you still get to play with big shiny computers and do lots of wonderful stuff with them.

  70. Stupid questions by Anonymous Coward · · Score: 0

    English is the best language to learn it!

    Those "the best language" questions are the most stupid ones.

  71. old xeon box? linux vs. xp by Fubari · · Score: 1

    In order to realize all possible performance from your hardware, I would suggest linux over XP.
    With xeons going 64-bit around 2005, it would have to be really old to be only 32 bit.
    And even if it was an ancient 32-bit only xeon, XP is still going to have issues using more than 3.5 GB of RAM.
    XP process management seems weak to me compared to the linux side of things.

    I don't have a favorite brand of linux to recommend; I would ask your professors and fellow researchers if they have a preference (because they are going to be your go-to support crew).

    In any event, I would try to max out the ram your specific motherboard can handle.
    And I would beg/buy/borrow/steal a modest SSD to run the OS on, you can probably get both for $100 or so.
    Keep your data sets on the slower spinning-rust drives.

    One especially insightful response I saw above was asking about what kind of computation you're running.
    The python guys are probably right.
    I suspect your problem with VB is that it will be single-threaded, and (I'm not a VB developer, I've just had to cope with it from time to time) not so generous with efficient data types.
    I've had some awful experiences trying to run multi-threaded processes on XP and Java.
    I think you'd get better results from ditching XP.
    Your actual language doesn't matter as much as some parallel capability does.

    Finally, the good news: almost anything is certain to be better than running VB in XP.
    The fact that you could implement your solution in VB suggests that it is not crazy complex.
    Doing it in raw C will be a pain because you'll have to code your own process management.
    I'd be very interested in seeing if numpy or perhaps "R" can do the math that you need.

    Do follow up and let us know what you end up doing.

    1. Re:old xeon box? linux vs. xp by Anonymous Coward · · Score: 0

      And I would beg/buy/borrow/steal a modest SSD to run the OS on, you can probably get both for $100 or so.
      Keep your data sets on the slower spinning-rust drives.

      If he's going to keep the data sets on the spindles then I see no reason at all to invest in a SSD. All calculation takes place in ram, it is loaded and written to spindles... Yeah the computer will boot in 15 seconds instead of 75, but how often is this thing going to be rebooted?

  72. Congratulations! You are a sysadmin! by Artagel · · Score: 1

    It sounds like you have control of the whole machine, which makes you the sysadmin. You don't only get to choose the programming language. You have to design a workflow. The programming language will fall out of you designing your plan of attack. You have to do so within the limitation of your advisor's budget, the assistance you can beg, etc. Take comfort in the fact that procedural languages are deep down 98% the same with different words for things, it is the libraries that get confusing. And read the library documentation like your life depends on it. It does.

  73. Scientific programming by Anonymous Coward · · Score: 0

    The lab I work in does a lot of what qualifies as "Scientific programming", including doing a lot of computationally expensive analyses.

    We use Perl as a prototyping language, and then convert the code to C++ once the prototyping is done. Perl is easier to write and develop in, but C++ is far more efficient for computationally expensive programs.

    Perl vs. Python is mostly a personal preference thing. And by personal preference, I mean the preference of your PI. Some labs use one, some labs use another, and the professors tend to have very strong opinions on which one should be used. So, use whichever your PI prefers. Also check to see what his feelings are on publicly available modules -- some PIs want you to write the code yourself, even if a version already exists in the wild.

  74. Get a pet programmer by Anonymous Coward · · Score: 0

    They're cheap and run on caffeine.

  75. VB.NET by Anonymous Coward · · Score: 0

    You said you already used VBA. While not a direct descendant, VB.NET would be faster than recoding in any other language. You will likely be able to copy and paste much of your code without much rewriting.

    Visual Studio Express is completely free by the way.

  76. google seo vn by Anonymous Coward · · Score: 0

    Do you have access to MATLAB or a similar analysis tool? Many universities have licenses, and overall it seems like it might be a good choice for you. These programs usually have a lot of built-in functionality that will be difficult to reproduce if you are not an experienced scientific programmer.

    I haven't done ANY programming in about 12 years, so it would almost be like starting from scratch
    http://googleseovn.blogspot.com/

  77. Standard Language in the Field by Anonymous Coward · · Score: 0

    Look at other people in your scientific specialty. What languages are they using? What about related specialties?

    For example, I don't know macrobiologists (wildlife conservation, etc.) that do much programming. What I saw was VBA, with S+ for statistics.
    Microbiologists love to use python, sometimes use perl, and occasionally branch out to other languages.
    In bioinformatics I saw python and matlab, with C or C++ for parallel, high-performance pieces.
    In proteomics I saw parallel code written in C, run exclusively on supercomputers.

    Science will require collaboration at some point, so use tools your collaborators will be familiar with. Otherwise you need a convincing reason why technology X meets your needs much better than Y.

  78. PHP by Anonymous Coward · · Score: 1

    PHP because it's fast, fun and scalable!

    *ducks for cover*

  79. Think about your software design before new lang by Anonymous Coward · · Score: 0

    There is often much to be gained from thinking about smarter ways to implement your algorithm. Do you have nested looping where parts can be unrolled? Do you recalculate values which you could store? Are you using strings where a number might work? Are you using Single precision numbers where an Integer would do? Have you looked for a library which might be smarter about computationally intensive logic? Have you built your application into an EXE rather than interpreting it?
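
    As a hedged illustration of "do you recalculate values which you could store?", here is a small Python sketch using the standard `functools.lru_cache` decorator. The function `expensive_score` and its integer `category_id` key are hypothetical; the point is only that repeated inputs are computed once and then looked up.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def expensive_score(category_id):
    """Hypothetical costly calculation, keyed by an integer id."""
    return sum(i * i for i in range(category_id * 1000))


def total_score(category_ids):
    """Sum scores over many records; repeated ids are served from the cache."""
    return sum(expensive_score(cid) for cid in category_ids)


if __name__ == "__main__":
    ids = [3, 5, 3, 3, 5, 8] * 1000  # 6000 records, only 3 distinct ids
    print(total_score(ids))
    # hits/misses show how many calls were answered without recomputing
    print(expensive_score.cache_info())
```

    The same idea carries over to VBA or any other language by keeping computed results in a dictionary or array indexed by the input.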

  80. Python, R and may be MySQL by pesho · · Score: 1

    You don't specify the scientific field. My experience is from biology and what I can recommend is Python (look at the numpy and BioPython modules) and R (www.cran.org), which is an excellent statistics and data mining tool (again on the biology side it has the bioconductor toolset). MySQL may also come in handy to store data depending on the project. I find myself writing some pieces in R, some in Python and using the Rpy2 python module to glue them together. MySQL can also be accessed from both python and R.

  81. Clojure by Anonymous Coward · · Score: 0

    It's lisp on the JVM, has a great statistical package already (Incanter), has pmap (parallel map), and ForkJoin support. Making something use all cores is as easy as changing (map fn data) to (pmap fn data) -- although that's a simplistic answer and some problems require other methods. You can also create an uberjar that bundles all dependencies into the jarfile and send that to the computing institute of your institution, rather than dealing with dependency hell like Perl (and to a much lesser extent Python) when you don't have root access.
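
    For readers who don't know Clojure, a rough Python analogue of the map-to-pmap swap, offered as a sketch rather than an equivalent: the standard `concurrent.futures.ProcessPoolExecutor` exposes a `map` method with the same call shape as the builtin. The function `correlation_score` and the toy data are invented for illustration.

```python
from concurrent.futures import ProcessPoolExecutor


def correlation_score(pair):
    """Hypothetical per-item work: a toy score for a pair of numbers."""
    x, y = pair
    return (x * y) % 1009


def score_serial(pairs):
    """Plain builtin map, one item at a time on one core."""
    return list(map(correlation_score, pairs))


def score_parallel(pairs, workers=4):
    """Same call shape as map(); the executor spreads items over processes."""
    with ProcessPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(correlation_score, pairs, chunksize=256))


if __name__ == "__main__":
    pairs = [(i, i + 1) for i in range(10_000)]
    assert score_parallel(pairs) == score_serial(pairs)
    print("serial and parallel results match")
```

    As with pmap, this only pays off when each item costs enough to outweigh the overhead of shipping it to another process.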

    Being on the JVM you also have access to just about all java libraries, Scala libraries, and possibly Jython libraries (I haven't seen nor tested). Interop is amazing, you don't have to know Java (I don't).

    If that doesn't work for you, I say Python, it's quite standard, but for the heavy lifting I really enjoy Clojure and its libraries. And yes, with parallelization, I got my bad code to kick the ass of highly optimized C code (single threaded) 63hrs in C to 5 hrs on the lab's overclocked queue for gene correlation.

    The laziness of Clojure also helps you deal with very large files that won't fit into memory.
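
    The same lazy-processing idea can be sketched in Python with generators, assuming a hypothetical input file holding one number per line. Only the current line is in memory at any time, so the file size is not limited by RAM. The names `read_values` and `running_mean` are illustrative.

```python
import os
import tempfile


def read_values(path):
    """Yield one float per non-blank line; only the current line is held in memory."""
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield float(line)


def running_mean(values):
    """Consume any iterable lazily and return its mean (0.0 if empty)."""
    count, total = 0, 0.0
    for value in values:
        count += 1
        total += value
    return total / count if count else 0.0


if __name__ == "__main__":
    # Write a small throwaway file so the sketch runs anywhere.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.writelines(f"{i}\n" for i in range(1, 101))
    try:
        print(running_mean(read_values(tmp.name)))
    finally:
        os.remove(tmp.name)
```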

  82. It's the algorithm, not the language you pick by Anonymous Coward · · Score: 0

    If you haven't written any code in 12 years, and aren't even sure which language is best suited for the project you're doing, there is a larger problem.

    The elephant in the room is that the algorithm you concoct to solve your problem is more likely to be the performance killer than the language or platform you pick.

    You'd be better off buying a data structures book or some other language-agnostic text book and learning to be a better coder in a language you already know, than starting at square 1 in a new language, thinking your efficiency and speed problems will magically go away.
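
    A small sketch of the point, under the assumption of a made-up task (does any pair of values sum to a target?). Both functions are in the same language; only the algorithm differs. The nested loop does on the order of n*n work, the set-based version on the order of n.

```python
import time


def has_pair_quadratic(values, target):
    """Check every pair of distinct positions: about n*(n-1)/2 comparisons."""
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            if values[i] + values[j] == target:
                return True
    return False


def has_pair_linear(values, target):
    """One pass with a set of values seen so far: about n set lookups."""
    seen = set()
    for value in values:
        if target - value in seen:
            return True
        seen.add(value)
    return False


if __name__ == "__main__":
    values = list(range(2000))
    target = -1  # no pair can sum to this, so both functions do their full work
    for func in (has_pair_quadratic, has_pair_linear):
        start = time.perf_counter()
        found = func(values, target)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__}: found={found} in {elapsed:.4f} s")
```

    The gap between the two timings grows with the input size, which no change of language or hardware will close for the quadratic version.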

  83. lol. by Grog6 · · Score: 1

    http://www.hardocp.com/article/2012/05/08/inside_mind_stuart/

    I saw it years ago, when everyone was wondering if it was real, lol.

    Reminds me, I need some more nyquil...

    --
    Truth isn't Truth - Guliani
  84. Matlab by Anonymous Coward · · Score: 0

    Because you want to do science instead of programming.

  85. data mining by Anonymous Coward · · Score: 0

    Type "data mining" in Wikipedia and there is a list of open-source and commercial applications for data mining. Python has a great collection of free libraries for research in data mining, if you google "Python data mining".

  86. PDL by swm · · Score: 3, Informative

    Perl Data Language
    The power of Perl + the speed of C

    1. Re:PDL by Roger+W+Moore · · Score: 4, Funny

      The power of Perl + the speed of C

      ...and the readability of machine code?

  87. autotools by Anonymous Coward · · Score: 0

    As someone who regularly has to waste time compiling scientific software.
    Learn autotools and set up proper build environments, scientists re-invent the wheel far too many times in new and surprisingly retarded ways.

    Sometimes I'm amazed at how little reflection goes on.
    At some point developers doing scientific code should go, "this is silly, maybe we should just re-do the build process".

    Some scientific applications use the configure script to launch scripts that download and patch code and chain-fire a series of other scripts to do all sorts of nasty things, but nothing beats REQUIRING one to install into the source directory, which is actually very common.

    What language you choose, who cares.. use what's the accepted standard in your field, and if that's Java, change fields.

  88. Real men... by Anonymous Coward · · Score: 0

    write code on bare metal... assembler or machine code.

    Or, if you are thinking Inception-esque....

    Outer layer - Python (UI, controls and getting things started).
    Down one - C/C++ (The place where control spends 90% of its time grinding away). Also handles most/all of your intensive I/O
    Down two - Assembler (To make a rocket-sled you need to bring rocket engines. Real rocket engines. Saturn F1's! Not some throw away, short lived JATO. Inline your code where things run slow in C/C++ - but only where things run slow. One slip up and you could be heading full throttle to the Gamma Quadrant.)
    Down three - Hand-tuned machine code. The areas where the Assembler just isn't cutting it...

    Properly done - the job will be so fast it will finish before it started :)

    Just like Inception where each level down becomes more bizarre, time also runs perceptually slower as events occur much, much faster. Those of you who have ever programmed assembler - even just for a class at college - know what I mean.

    But, if you don't need all that power/performance/maintenance - Python/C/C++/Fortran/R, etc. will still run a lot faster than VB/VBA.

  89. Matlab by Anonymous Coward · · Score: 0

    If you have access to a Matlab licence then I suggest using it. The syntax is almost the same as Octave's but it is faster and it can use multiple cores automatically. However, if you need to purchase a licence, then maybe some other options are better.

  90. Python by SunTzuWarmaster · · Score: 1

    I am a scientist who dabbles in data mining problems. I use Python with a healthy dose of C++ and the occasional Java. These are probably the three most common languages among the community. I see people using R and Matlab relatively frequently. A bunch of people in this topic have suggested Fortran, but I've never seen anyone use it seriously.

    I haven't run into anyone who doesn't use a minimum of two programming languages (Python/C++, Matlab/Java, etc.).

    Note that Kaggle.com (the data mining competition site) frequently posts their example solutions in Python. Failure to understand the Python solution starts you out at a healthy disadvantage.

  91. You know you want it bitches. by Dishevel · · Score: 0

    Perl. That is all. :)

    --
    Why is it so hard to only have politicians for a few years, then have them go away?
  92. APL by Squidlips · · Score: 1

    Back when Men were Men....

    1. Re:APL by Anonymous Coward · · Score: 0

      YES!!!

      And why type an entire keyword when a single Greek character will do? iota, rho, all useful operators in APL.

      (And I knew Women who programmed in APL in the 70s as well, so I'd generalize to "Back when software developers really knew their stuff...". "A Programming Language"... need I say more?)

    2. Re:APL by mjwalshe · · Score: 1

      And Magic was Real - "unless declared as integer" :-)

  93. PDL by Anonymous Coward · · Score: 0

    Check out PDL. I've been a happy user for years and have no plans of giving up. It works, is fast enough for Real Work and (at least in my environs) there is a ton of software written using it.

  94. SAS by Anonymous Coward · · Score: 0

    SAS

  95. R with C and Perl by WillAffleckUW · · Score: 1

    We tend to use R, C, some C++, and a lot of Perl.

    But then, we do real science.

    --
    -- Tigger warning: This post may contain tiggers! --
    1. Re:R with C and Perl by biodata · · Score: 1

      We use a lot of perl too. I mentioned python instead in the post above as I hear a lot of labs use it, but perl is faster and I prefer it personally.

      --
      Korma: Good
  96. Computationally intensive data mining by the+eric+conspiracy · · Score: 1

    Sounds like you are working on some sort of similarity search problem.

    You'll probably find most of your peers are working with C/C++.

    If that's the case I'd go for that language.

    1. Re:Computationally intensive data mining by Anonymous Coward · · Score: 0

      When it comes to data mining, ignore anyone with a low userid.

      Data mining has come a long way and is mostly statistical/machine learning-based now, meaning numerical routines and optimization for fitting models, not just nearest-neighbor. If you go the pure C/C++ route, you'll be rewriting standard stuff for days until you inevitably hit a brick wall, while everyone else is using the standard, modern methods. Being comfortable with writing C subroutines for a higher-level language will give you some serious edge, but foregoing the higher-level language almost puts you out of the running. It doesn't matter how good you are at programming; you can only program so fast, and everyone else will be skipping 80% to 100% of it entirely.

    2. Re:Computationally intensive data mining by the+eric+conspiracy · · Score: 1

      Seriously? Listen to an anon over a low id user?

      Most of the machine learning stuff I've seen is C/C++.

      And this stuff doesn't scale all that well to really large datasets.

      Plus this guy is an academic.

  97. Look at the frameworks. by DdJ · · Score: 1

    You are not going to write everything from scratch by yourself. You're just not. Not if you actually want to get anything done. You're going to reuse code.

    So: figure out what code you're most likely to reuse, what frameworks are useful in the field you're interested in, and let that suggest the language.

    If you don't know how to get started on that: asking the question of peers in the same scientific field will get you a more useful answer than asking the question on a wide-open generic technical forum.

    Another angle: look at what network databases you want to integrate with (eg. protein databases at nih.gov), and look for sample code showing how to access 'em. That'll give you a clue what other practitioners are doing.

  98. Bash by Anonymous Coward · · Score: 0

    I'm a bioinformatician and make use of bash almost exclusively. Throw in some GNU parallel and you're there.

  99. Use what your adviser/group members/colleagues use by allwheat · · Score: 1

    For scientific computing, you will be doing a lot of collaboration and very likely sharing codes with other scientific programmers, very few of whom enjoy learning new programming languages all the time. To simplify/enable collaboration, you should follow what the community uses. In physics, generally that means Fortran. Anything past Fortran90 is basically modern, it's really not too bad to learn and even has basic object-oriented stuff, though not as good as C++. F77 is mostly obsolete and a major pain in the neck, but you will see it around in older codes, as well as a lot of the libraries. There are C/C++/Python/f77/etc codes around, but most physicists use >F90, especially in high performance/parallel computational work. But there are subfields of physics with their own popular tools too. My advice is to go with whatever the majority of your colleagues are using, placing a very big premium on what your adviser and group members use, which is who you will collaborate with the most. What the majority in the field uses is usually suitable for the job anyway.

    It sounds like you're interested in parallel computing as well. Fortran is probably the best option then, mostly for the libraries, but you can still interface from C/C++ or whatever. Also, if you have a lot of computationally intensive stuff, you should try to get supercomputer access. Ask around, you should be able to work something out. You'll need to decide on OpenMP or MPI for parallel programming, depending e.g. on your memory, shared/distributed etc. Here's a quick rundown: http://www.dartmouth.edu/~rc/classes/intro_mpi/parallel_prog_compare.html
    Most scientific hpc (high performance computing/supercomputer/parallel) is on unix/linux.

    What field are you in exactly, and what is the nature of your data mining?

  100. Fortran 2003 + Python + C = profit by excelsior_gr · · Score: 1

    Especially for a beginner Fortran will make the most sense, IMHO. Here's why:
    - User-friendly syntax. Especially vector and matrix syntax and operations are very intuitive.
    - Strong typing will let you catch lots of errors at compile-time rather than let you hit your head against the wall at run-time.
    - Fast in quite a foolproof way (just remember to loop over columns first, or possibly even use the simpler Matlab-like syntax and let the compiler figure it out)
    - Usable with OpenMP and MPI
    - Massive availability of free code on the net (visit netlib). Old code also has very good chances to run out-of-the-box or with very minor changes.
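    To make the "loop over columns first" advice concrete: Fortran stores 2-D arrays in column-major order, so the elements of one column are adjacent in memory. The sketch below is pure Python, not Fortran, and the function names are made up for illustration. It only prints the order in which memory offsets are visited by the two loop nestings.

```python
# Column-major layout: element (i, j) of an nrows x ncols array lives at
# flat offset i + j * nrows (0-based). This is how Fortran lays out 2-D arrays.

def column_major_offset(i, j, nrows):
    """Flat offset of element (i, j), 0-based, in a column-major array."""
    return i + j * nrows

def offsets_column_first(nrows, ncols):
    """Outer loop over columns, inner loop over rows: contiguous access."""
    return [column_major_offset(i, j, nrows)
            for j in range(ncols) for i in range(nrows)]

def offsets_row_first(nrows, ncols):
    """Outer loop over rows, inner loop over columns: strided access."""
    return [column_major_offset(i, j, nrows)
            for i in range(nrows) for j in range(ncols)]

if __name__ == "__main__":
    print(offsets_column_first(3, 2))  # offsets increase one step at a time
    print(offsets_row_first(3, 2))     # offsets jump back and forth by nrows
```

    The first ordering walks memory sequentially, which is what caches like. The second jumps by a whole column length on every step, and that is the pattern the advice warns against.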
    C is a close second to many of these points. I can't recommend C++ and Java though, as all the clutter will slow you down especially at the beginning. I also like Python a lot, but there's a catch, which brings me to my next point:

    Do you have a plan of your program? How well do you know what you'll be programming? Is that going to change a lot along the way? Code changes are a lot harder to implement in Fortran and C than in Python. The abstraction level in Python really amazed me, as the interpreter would run anything I would throw at it. Mixing paradigms in Python makes scientific programming a breeze of fresh air. There are ways to make it quite fast, too! Mixing procedural and object-oriented programming is possible in Fortran as well, but by far not as versatile as in Python. In any case, if you decide to use the OO-paradigm, you have to make sure you have the whole program figured out before you start so that you can define your objects wisely. I have mixed experience with scientific OO in C++, so I can't really recommend it.

    Output is also a huge topic in scientific computing. It is a pain to make live graphs in Fortran (the intel compiler has a proprietary library that somewhat helps), but you can export the raw data and use gnuplot or another tool for visualization (even Excel for small graphs). For larger 3D+time datasets there is Paraview. These things are much more fun to do in C/C++. On the other hand, you can use the C-interoperability features of Fortran 2003 and combine them! Or use f2py or PyFort and combine Fortran and Python!

    1. Re:Fortran 2003 + Python + C = profit by mjwalshe · · Score: 1

      or go really old school use GINO-F which is how we did it back in the day :-)

  101. You mean Fortran 77? by mbkennel · · Score: 1


    Things have moved on. Fortran 95+ is almost as easy as Matlab, definitely easier than C++ and faster than both.

    And for heavily numerical algorithms it is better designed, beyond just the speed.

  102. The one language question by jgotts · · Score: 1

    If you want to do programming as a career, you need to be flexible enough to be able to pick up any language, so use whatever language you feel comfortable enough using to write maintainable code.

  103. Re:fortran of LaTeX by The_Wilschon · · Score: 1

    LaTeX is a great thing to learn, but it is most emphatically NOT a remotely reasonable choice for writing number crunching code...

    --
    SIGSEGV caught, terminating

    wait... not that kind of sig.
  104. Why not hire someone? by Anonymous Coward · · Score: 0

    When I was in grad school I had your mindset. Now that I'm out, I (thankfully) have a very different one. You're talking about this project taking you weeks. Why not have the lab contract with an actual programmer (maybe even a grad student in the CS department) to write this for you? It'll get done faster, it'll be easier to extend/modify, plus you can do research in the meantime which will move you closer to your goal of graduating.

    1. Re:Why not hire someone? by allwheat · · Score: 1

      seconded. If possible this is a nice route.

  105. "Scientific Computing" is over-broad by FellowConspirator · · Score: 2

    The problem with this question is that "scientific computing" is an over-broad term. The truth is that certain languages have found specific niches in different aspects of scientific computing. Bioinformatics, for example, tends to involve R, Python, Java, and PERL (the prominence of each depends largely on the application). Big-data analytics typically involves Java or languages built on Java (Scala, Groovy). Real-time data processing is generally done in Matlab. Pharmacokinetics, some physics, and some computational chemistry are often done in FORTRAN. Instrumentation is generally controlled using C, C++, or VB.NET. Visualization is done in R, D3 (JavaScript), or Matlab. Validated clinical biostatistics are all done in SAS (!).

    Python is a nice, simple-to-learn start, very powerful, and the NumPy package is important to learn for scientific computing. R is the language of choice for many types of statistical and numerical analysis. Those are a good place to start, if incomplete. From there, I'd look at the specific fields of interest and look at what the common applications and code-base are for those.

    With regard to the OS, that's pretty easy: Linux (though OS X is a reasonable substitute). Nearly all scientific computing is done in a UNIX-like environment.

  106. Python, R, or Matlab by Anonymous Coward · · Score: 0

    Matlab (closed/paid) if you want a GUI interface to work and plot in. A lot of academics that do matrix operations use matlab as the operations are fast out of the box (you can be just as fast in other languages if you have the supporting libraries appropriately compiled). Many of the matrix operation libraries are only available for matlab (i.e. tensor toolbox, many kernel method routines).

    R (open/free) if you want a function oriented language that is most similar to what scientists often use/see. It generally (changes dramatically by your field) has more mature packages, packages install more easily without sudo/admin, and is closest to a language that "just works" for most scientists. Data frames are part of the core package, so it is easy to hold all your data together in a "logical" framework.

    Python (open/free) if you want an object oriented language that is close to what programmers want to see. It generally has fewer mature packages (though this is changing quickly), and I tend to run into more IT roadblocks installing packages because it installs at the machine level (not user level as with R), but it is by far the most modern programming language. Data frames are implemented as additional packages to compete with R (I believe pandas is your ticket for that).

    Any of these points can be nit-picked and you will see a lot of people disagree. However, scientists generally want to code quickly with mature packages (even if the cost is slightly longer run times) and, having recently been in the same position you are in, I find these to fit the bill.

  107. MATLAB by Anonymous Coward · · Score: 0

    I'd look into matlab. I'm a current undergrad studying CE and we use it for everything.

  108. QT by Anonymous Coward · · Score: 0

    If you're going to have to do any sort of GUI work, or just want a simple standard library like Java's, C++ with QT is an excellent choice.

    1. Re:QT by burdickjp · · Score: 1

      Python +PyQT or PyGTK is a similar, and probably easier, answer.

  109. One word by vikingpower · · Score: 1
    --
    Religous speak to God. Insane are spoken to by God. When all shut up, one can finally hear Shostakovich in peace
  110. Python / C++ / Matlab by Anonymous Coward · · Score: 0

    A ridiculous amount of engineering is done within Matlab. I don't think it's a good idea because it's generally as slow as python and proprietary/expensive, but knowing it is probably in your best interest.

    Python, as others have said, has caught on for a lot of scientific computing tasks. I'd recommend the Anaconda bundle of scientific-minded packages.

    A lot of legacy code is done in Fortran, but I don't like the poor standardization among Fortran compilers. Too many support their own extensions and that can cause trouble. C++ with the Eigen numeric/linear algebra library goes a long way to replacing fortran for me.

  111. A genuine suggestion by Anonymous Coward · · Score: 0

    I would consider Python. I read your whole post (unlike some commenters, it seems), and Python is your best bet. It's easy to learn, easy to implement, very effective, and it's very fast...

  112. Also, it is fast by Sycraft-fu · · Score: 2

    In part, this is because Intel has a compiler for it. On commodity hardware (as in desktop, laptop), you will generally get the best performance running an Intel CPU and using an Intel compiler. That means C/C++ or FORTRAN, as they are the only languages for which Intel makes compilers. C++ is easy to see, since so much is written in it but why would they make a FORTRAN compiler? Because as you say, serious science research uses it.

    When you want fast numerical computation on a desktop, FORTRAN is a good choice. We have a few researchers here who use it, and they all use the Intel Fortran Compiler because they want fast computation, but they don't have the money to buy bigass systems for every grad student. What they get out of the IFC and a regular Intel desktop chip is pretty impressive.

    Compilers matter, and Intel makes some damn good ones. So if your research calls for lots of performance on little budget, that can influence language choices. Heck same thing on supercomputers. That is not my area of expertise, but it isn't as though all compilers for a given supercomputer will be equally good. If I were to bet, I'd say the FORTRAN compilers are some of the better ones.

    1. Re:Also, it is fast by Anonymous Coward · · Score: 0

      Compilers matter, and Intel makes some damn good ones. So if your research calls for lots of performance on little budget, that can influence language choices. Heck same thing on supercomputers. That is not my area of expertise, but it isn't as though all compilers for a given supercomputer will be equally good. If I were to bet, I'd say the FORTRAN compilers are some of the better ones.

      Uh, not as much as you think. It's not an order of magnitude difference. Here are some benchmarks from a very good Fortran compiler. GNU compilers are nearly always good enough. What you may run into is that there are SIMD instructions that exist on Intel chips that generic compilers may not have access to. On the other hand, you can make use of them from gcc, so it's just a combination of factors.

      (Also, blind use of optimization flags results in troubles, once you accidentally start throwing IEEE math out the window... depends on your application, of course.)

    2. Re:Also, it is fast by Anonymous Coward · · Score: 0

      What you may run into is that there are SIMD instructions that exist on Intel chips that generic compilers may not have access to. On the other hand, you can make use of them from gcc, so it's just a combination of factors.

      It is one thing to have access to them, it is another if the compiler is good at optimizing and autovectorizing your code when possible, and a language can make that easier or harder for the compiler depending on what assumptions and conventions the language enforces.

    3. Re:Also, it is fast by Anonymous Coward · · Score: 0

      Does the Intel FORTRAN compiler deliberately emit suboptimal code for non-Intel chips, like the Intel C Compiler does? If so, and if you have anyone using AMD chips, you can pick up more speed by patching the binary to never use the substandard code paths.

      http://www.agner.org/optimize/blog/read.php?i=49

      https://github.com/jimenezrick/patch-AuthenticAMD

  113. no single answer here. by niftymitch · · Score: 1

    The best compiler support for numbers will commonly be Fortran.

    Python belongs on the list because slow functions can be coded in C
    or another native language for speed. It is also a rich and portable prototyping
    language.

    There is value in asking your advisor.

    A linux distro like Centos is well regarded, almost any programming language
    can be downloaded. Switching to Redhat for product support has a small learning curve.

    R is a statistically rich environment that you should be aware of. Python bindings for R exist so
    again Python.

    SUMMARY: Python and R. R may be all you need.... R makes charts and graphs, slices dices....
    runs on many platforms even WindowZ

    --
    Truth is stranger than fiction, but it is because Fiction is obliged to stick to possibilities; Truth isn't. Mark Twain.
  114. monty by burdickjp · · Score: 1

    Python Python Python Python Python It already does almost everything you could ask for, and is growing in acceptance and userbase. It is a modern language with modern language structure. It is designed to be human readable and consistent.

  115. Keep it simple by seyfarth · · Score: 1

    Your best bet is to use C. It is highly efficient. If possible use computational code like the Atlas BLAS package. This code will run circles around your own code no matter what language you use. You already know C and moving to C++ is a major problem. All the other languages are distractions from your purpose.

    If possible run multiple, independent processes rather than writing parallel code. That can be a major ordeal.

    If your goal is to process data as opposed to learning elaborate programming techniques, keep simplicity in mind. C is a very powerful language and you can reach maximal efficiency for many problems using Atlas BLAS and multiple processes. If your goal is to get a degree in CS, ignore what I've suggested.
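    The "multiple, independent processes" idea is language-neutral. Here is a rough sketch of it in Python rather than C, using only the standard library. The names split_into_chunks and sum_of_squares are placeholders invented for this example, and the per-chunk work is a stand-in for whatever the real data mining routine does. It assumes the chunks really are independent, so no worker needs to talk to another.

```python
from multiprocessing import Pool

def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous pieces of nearly equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for k in range(n_chunks):
        end = start + size + (1 if k < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def sum_of_squares(chunk):
    """Hypothetical stand-in for the real per-chunk computation."""
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    # One worker process per core on a four-core machine.
    with Pool(processes=4) as pool:
        partial_results = pool.map(sum_of_squares, split_into_chunks(data, 4))
    print(sum(partial_results))
```

    Each worker gets its own chunk and returns one number, so there are no locks and no shared state to get wrong. That is the simplicity the advice above is after.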

    --
    Ray Seyfarth, ray.seyfarth@gmail.com, http://rayseyfarth.blogspot.com
  116. Matlab will fall, SciPy will rise by steveha · · Score: 2

    If you are working in academia, then you probably have access to Matlab.

    On the other hand, you definitely have access to SciPy, given that it's free.

    I predict that Python with SciPy/NumPy will completely displace Matlab within a few years.

    I say that even though I am working in one industry, digital signal processing, that is really married to Matlab and will be one of the last places to make the switch.

    Because Matlab was purpose-built for scripting with matrices, it has some nice syntactic sugar for that. In every other way, Python as a language is far superior.

    I was able to attend the SciPy conference a couple of years ago, and one thing I heard there: people like that Python works as a universal language. Sysadmins can use Python to do admin tasks; the web site guys can use Python (with Django) to make web sites; the science guys can use SciPy... it's one language that is flexible enough to do anything you might need, and it's much easier to learn than other really flexible languages like Lisp.

    Because Matlab has been around a long time and has man-centuries of work invested in it, it has very complete and well-debugged libraries available for it. SciPy is playing catch-up here. But the basics are already solid, and if SciPy will work for you, you should choose it because it is the future.

    There was a time, not that long ago, when people spent $30 to get a web browser. Now people expect web browsers to be free. I predict in the near future the same thing will happen with Matlab vs. SciPy.

    SciPy has the advantages of being free and open, as well as the advantage of being free as in beer. And Python is just a better language than the Matlab language. Mark my words: Matlab will fall and Python/SciPy will rise.

    --
    lf(1): it's like ls(1) but sorts filenames by extension, tersely
    1. Re:Matlab will fall, SciPy will rise by Anonymous Coward · · Score: 0

      One thing that will prevent Matlab from falling for quite some time in many engineering disciplines is Simulink. There are alternatives - but I don't think any are as good. Add in the fact that you can link into Matlab, a powerful environment itself and Matlab is staying put for quite a while. I do hope that a freeware alternative arrives, but I don't think it's as close as you predict.

    2. Re:Matlab will fall, SciPy will rise by ericcc65 · · Score: 1

      Speaking as someone who used to use Matlab exclusively and almost got a job with the Mathworks let me say that I hope you're right. I love that scipy gives me a general programming language, and I love that it's free.

      But there is one major obstacle before that happens...toolboxes. I'm in the DSP realm too and the signal processing package of scipy just doesn't hold a candle to all the numerous toolboxes that you can get with Matlab. There are some BASIC functions that are missing from scipy and there are a TON of extra functions in the Matlab toolboxes that aren't anywhere close to being implemented in scipy. The communications toolbox alone has so much that scipy doesn't offer it's not even funny.

      I guess I should stop complaining and start contributing. But I honestly don't know that I'm good enough of a programmer to feel like I could contribute something. I guess I'm pretty sure I could implement a few algorithms without major bugs, so maybe I should pitch in. I don't know that it would be the prettiest or most optimal, but you've got to start somewhere.

  117. Of course, I have to add something by Anonymous Coward · · Score: 0

    There is no single language that does all -- so learn multiple. To get back in the game, you might try this new fad, MOOCs. I did (mostly for laughs, alongside much heavier courses involving programming) udacity's programming 101 and it teaches python. Might give it a whirl. Coursera also has lots of relevant courses you can use to help you brush up.

    I've found that python is convenient to throw things together quickly (as is perl, but that gave me a headache, as are things like shell (for scripting, bourne shell, not bash, nor csh), sed, awk, and so on). Still, even with lots of scientific libraries available, it's not something I'd rely on for everything. My usual "tinkering for fun" language is C++, by the by. It's good for speed but to get to it you may need to know more computer internals than you care for. No extant language is suitable as your sole window into programming, so knowing more is an asset, not a burden.

    Oh, don't be afraid to throw code away and re-do it, possibly in a different language. After a bit of practice you'll see why. Also learn how to use a source control system. Regardless of language, that's useful. Try a few (for example, git, mercurial, svn, given in alphabetical order).

    You mentioned OSes, and personally I would pick linux (or, say, FreeBSD, or PC-BSD if you want shiny GUIs) as a platform as it gives easy access to quite a lot of free software. In fact I ran (and run, when I'm not slacking off on /.) those MOOCs on a tiny core linux booted off a usb key, including python and octave, even postgresql.

    You're not a CS major, but since I mentioned MOOCs, if you're going to deal with serious amounts of data then you may want to follow (over time, not all at once) a few more courses. A course in algorithms is useful (and awesome), as is intro to relational algebra and databases. It teaches you SQL, and that's useful to know too.

    Already mentioned are julia, R, octave and a bunch more. My impression with octave was that brushing up your linear algebra is rather useful. You may also want to take a look at pspp and scilab. Which you'll pick to actually use probably depends on who you're working with. Depending on just what you're doing, there's also data visualisation programs you may want to look into.

  118. bioinformatics and computational biology by Anonymous Coward · · Score: 0

    Perl is still in wide use.

    Do not use Perl for this. I've been using Perl for 15-20 years, and I love it for "scripting", text processing, etc., but using it for scientific computing sounds like an exercise in masochism.

    A lot of bioinformatics and computational biology uses Perl, so if you're working in those areas you're going to run into a lot of it.

    1. Re:bioinformatics and computational biology by ebno-10db · · Score: 1

      Truth is stranger than fiction. Even as a Perl lover, I would never have thought of using it w/ number crunching. Still doubt it would be my first choice.

    2. Re:bioinformatics and computational biology by Mike+Buddha · · Score: 1

      When searching and sorting DNA sequences, it's just [ACGT]* in a text file. Why wouldn't Perl be ideal?
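      For what it's worth, the same kind of text processing is just as short in Python. This is a minimal sketch using the standard re module. The helper names is_dna and find_motif are made up here, and real sequence files (FASTA and friends) need more careful parsing than this.

```python
import re

# A sequence is valid if it consists only of the four bases (may be empty).
DNA = re.compile(r"[ACGT]*")

def is_dna(seq):
    """True if seq contains nothing but A, C, G and T."""
    return DNA.fullmatch(seq) is not None

def find_motif(seq, motif):
    """0-based start positions of every occurrence of motif, overlaps included."""
    # A zero-width lookahead lets overlapping occurrences all be reported.
    pattern = re.compile("(?=" + re.escape(motif) + ")")
    return [m.start() for m in pattern.finditer(seq)]

if __name__ == "__main__":
    reads = ["GATTACATTA", "ACGT", "CCATTA"]
    print(sorted(r for r in reads if is_dna(r)))
    print(find_motif("GATTACATTA", "TTA"))
```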

      --
      by Mike Buddha -- Someday the mountain might get him, but the law never will.
    3. Re:bioinformatics and computational biology by ebno-10db · · Score: 1

      I think of that as text processing rather than number crunching.

  119. Fortran, Matlab by Anonymous Coward · · Score: 0

    For easy but slow number crunching, use MATLAB/octave.
    For simulation (weather, fluids, materials, FEM), use fortran.
    For statistical analysis, use R (I think--I've never used it).
    For general purpose, use python.
    For communication with hardware (sensors, IO cards collecting data) use C/C++.

    You'll probably have to make some compromise, because your work will cross boundaries. Maybe interfacing with the hardware won't actually require using C. Or maybe it will, but the libraries you need are for a different language.

  120. Solved by piRSqrd · · Score: 1
    I recently had to answer this same question for myself and my group. I reduced the problem to evaluating the trade offs of the following dimensions:
    • 1. Ease of reading/programming/maintaining by non-professional programmers.
    • 2. Accessibility.
    • 3. Well supported/documented libraries.
    • 4. Quickness.

    #1 reduced the field of choices (IMO) to Matlab/Octave, R/S+, SAS, Perl, Python, and Julia.
    #2 gives preference to Python, R, Julia, Perl, or Octave (your situation may not be as limiting).
    #3 led me to many searches that all indicated that R and Python have a rich set of libraries and lots of community support.
    As for #4, Julia's website (http://julialang.org/) shows benchmark information that indicates that Python is pretty quick.
    My conclusion was that I couldn't really go wrong between R or Python. However, I chose Python because it was quicker, I like the syntax better, I like the libraries better (NumPy, SciPy, Pandas, Matplotlib) and it seems to play nicer with everything else. This is what worked for me and how I went about deciding.

    --
    I put the 'Physics' in 'Physical Attraction'
  121. Investigate Center for Open Science, framework by myvirtualid · · Score: 1

    In addition to the excellent comments previously made, consider investigating the Center for Open Science, specifically their information for developers, and the associated Open Science Framework (note: will display only if cookies are enabled; I've no idea what value they provide in this context and will be contacting them about that).

    They may not have anything that can help you. Or they might. Or you might be able to help them. Or not. YMMV, etc.

    Worth taking a peek, anyway.

    --
    I'm here EdgeKeep Inc.
  122. Look for what's already there in your area by mike449 · · Score: 1

    The best environment is one that already has the stuff for your application, so that you just cobble together calls to code written by somebody else. Perl + CPAN was the winning combination for many years. These days it seems to be Python+numpy/scipy/scikits.

  123. Depends, but Matlab/Python are mostly great by istudyplanets · · Score: 1

    I would say it largely depends on what people in your field use. I use Matlab on a desktop for data analysis and Fortran/Python for HPC number crunching (astronomy/planetary science). Recent releases of Matlab have seen heavy optimization in number crunching and the parallel processing toolbox is incredibly simple to use. The plotting and graphing tools are second to none and very intuitive if you want to visualize multi-dimensional datasets. For integration of visualization, editing and debugging in one scientifically-oriented IDE, it can't be beat. Plus it sounds like you're familiar with GNU Octave. Python is a better language in my opinion, but lacks some of the 'do-science-straight-out-of-the-box' feel that Matlab is good at. Python obviously has the advantage of being free. The best scientific package is the Enthought Python Distribution which integrates their Canopy IDE with numpy, matplotlib and other great python modules. Free licenses are available to student/academic users.

  124. Scientist here: IDL, Matlab, SQL, C++, Perl. by Remus+Shepherd · · Score: 1

    Where I'm coming from: I'm a satellite physicist working as a contractor for the USGS on the Landsat program. I work very closely with NASA.

    Almost all the scientific programming we do -- and by 'we' I mean USGS and NASA -- is either in IDL/ENVI or Matlab. They're the de facto standards for scientific processing. We do need to know SQLPlus to get our data out of the databases, and we need rudimentary C++ skills in order to make prototype code for the IT coders to turn into an operational release. Sometimes it's easier to code something in C++ than IDL or Matlab, so it's nice to be able to jump straight to that when warranted. Add Perl for text manipulation (which always turns out to be useful in some way) and that's all the programming I've done for the past ten years. Many scientists in the building swap out ARCGIS or ERDAS for IDL/ENVI. (Matlab doesn't seem to be swappable; you either need to use it or you never touch the stuff.)

    I've dabbled in Php when they asked me to prototype a web site but that never went far. I've done a little Flash programming that they eventually decided to hire out for. (I did a fine job, but they wanted the application to go bigger.) In the early days of my career FORTRAN was everywhere, you couldn't get away from it. There are still some FORTRAN programs in-house that I could fiddle with if they asked me to, although I'd blanch at the prospect.

    All that said, what you need depends on what your role is. If you're a scientist like me then these self-taught languages might be enough. If you're a science-oriented IT person, you'll need more -- most importantly strong C++ skills, at least around here. And different disciplines will have different needs; I worked briefly for NIH (National Institutes of Health) and they still had COBOL programs.

    I know of one person in two organizations (USGS and NASA) who knows Python, and he's an IT guy not a scientist. He's also the only person I know who has ever used Hadoop. I have never met anyone who knew R. Visual Basic is used occasionally here and there for prototyping, and almost immediately switched out with C++ as soon as management decides to support the project.

    --
    Genocide Man -- Life is funny. Death is funnier. Mass murder can be hilarious.
  125. VB is too slow for you? C++ then... by bobbied · · Score: 2

    I suspect that VB is NOT your problem here. But, if you have a VB program that is too slow, then I'm going to suggest you do the following:

    1. Profile your program and see if you can figure out what's taking up all the processing time. It may be possible to change the program you already have slightly and get the performance you need. It would be a shame to go through all the trouble to learn a new language and recode the whole thing if replacing some portion of your code will fix it. Do you have a geometric solution implemented when a non-geometric solution exists?

    2. Consider adding hardware - It's almost ALWAYS cheaper to throw hardware at it than to re-implement something in a language you are learning.

    3. Rewrite your program in VB - This time, looking for ways to make it perform faster (you did profile it right? You know what is taking all the time right?) Can you multi-thread it, or adjust your data structures to something more efficient?

    4. Throw hardware at it - I cannot stress this enough, it's almost ALWAYS easier to throw hardware at it, unless you really have a problem with geometric increases in required processing and you are just trying to run bigger data sets..

    5. If 1-4 don't fix it, then I'm guessing you are in serious trouble. If you really do not have a geometric problem, You *MIGHT* be able to learn C/C++ well enough to get an acceptable result if you re-implement your program. C/C++ will run circles around VB when properly implemented, but it can be a challenge to use C/C++ if your data structures are complex.

    6. Throw hardware at it - seriously.

    Unless you really just have a poorly written VB program or you are really doing some geometric algorithm with larger data sets (In which case, you are going to be stuck waiting no matter what you do) getting better hardware may be your only viable option. I would NOT recommend trying to pick up some new language over VB just for performance improvement unless it is simply your only option. If you do decide to switch, use C/C++ but I would consider that a very high risk approach and the very last resort.
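    To show what "profile first" looks like in practice, here is a small sketch in Python with the standard cProfile module. It is not VB, and the two functions are toy stand-ins invented for the example: one does quadratic work, the other gets the same answer from a closed form. The point is only that a profiler report names the hot spot before anything gets rewritten.

```python
import cProfile
import io
import pstats

def slow_pairwise(n):
    """Deliberately quadratic: counts the pairs (i, j) with i < j one by one."""
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            count += 1
    return count

def fast_pairwise(n):
    """Same answer in constant time from the closed form n*(n-1)/2."""
    return n * (n - 1) // 2

def profile_report(func, *args):
    """Run func(*args) under cProfile and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)  # top 5 entries only
    return buffer.getvalue()

if __name__ == "__main__":
    print(profile_report(slow_pairwise, 2000))
```

    If the report shows one routine eating nearly all the time, and a cheaper algorithm exists for it, fixing that routine will usually beat a port to another language.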

    --
    "File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
  126. Matlab by Anonymous Coward · · Score: 0

    Matlab

  127. C. Obviously. by RandCraw · · Score: 3, Insightful

    You know C. C is simple, as fast as any alternative, it's straightforward to optimize (aside from pointer abuse), and you always know what the compiler/runtime is doing. And threading libraries like pthreads or CUDA are best served via C/C++. Why use anything else?

    Another thought: scientific libraries. If you need external services/algorithms then your chosen language should support the libraries you need. C/C++ are well served by many fast machine learning libs such as FANN, LIBSVM, OpenCV, not to mention CBLAS, LinPACK, etc.

  128. What does everyone else do? by Anonymous Coward · · Score: 0

    What do the other grad students in your field do? What does your advisor do? You can accomplish your goals with one of many different tools or languages, and the truth is that there are several good alternatives, many of which have been mentioned here.

    If you choose the same tool as your colleagues, your life will be much easier.

  129. Metric by drakesword · · Score: 1

    ... that is all

  130. nothing beats c++ by Piotrecles · · Score: 1

    I work in gambling and we write everything in c++. It's as fast as anything else, and the new standard makes everything a lot easier. Threads especially. You also have tons of libraries at your fingertips: GSL, good RNGs, whatever you can think of. I think C is probably the worst thing to write in if you're just coming back to it. You'll end up spending more time with memory management than you will actually getting stuff done. Go with C++ and use standard containers.

  131. R, Matlab/Octave or Python with Pandas or Numpy/Sc by joeblog · · Score: 2

    My experience at this comes from being a MooC addict where some of the courses are in Python, others in R, and others in Matlab or its GNU counterpart Octave.

    Of these Python is my favorite since it's the language I'm most familiar with. Furthermore, you can "bolt" R to Python with the Pandas library, and you can "bolt" Matlab/Octave with the Numpy & Scipy libraries.

    A big drawback, however, is speed. The big advantage of domain specific mini-languages over "kitchen sink" languages was brought home to me by writing a Python script to simulate the popular (in statistics courses) Monty Hall problem and the same script in R. While my Python script took several seconds to simulate a couple of thousand Monty Hall game turns, the R script would give the percentage for millions the instant I hit the enter key.
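
    For readers who want to try the experiment described above, here is a minimal sketch of a Monty Hall simulation in plain Python. It is not the poster's script; the function name `monty_hall_win_rate` and its parameters are made up for illustration, and it uses the shortcut that switching wins exactly when the first pick was wrong.

```python
import random


def monty_hall_win_rate(trials, switch, seed=None):
    """Return the fraction of games won over `trials` simulated rounds."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)   # door hiding the prize
        pick = rng.randrange(3)  # contestant's first choice
        # The host always opens a goat door that is neither the pick nor
        # the car, so switching wins exactly when the first pick was wrong.
        if switch:
            wins += pick != car
        else:
            wins += pick == car
    return wins / trials
```

    Calling `monty_hall_win_rate(100_000, switch=True)` should land near 2/3, and `switch=False` near 1/3. How long it takes will depend on the machine and the Python version.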

    More complicated problems ended up with weird bugs in R scripts I couldn't figure out, whereas (because of my better familiarity with Python's "mutable list" problems) I tended to get correct -- albeit slower -- answers from my Python programs.

    Re Octave: whereas R has overtaken commercial versions of S, I've written off Octave as a lame "freeware" version of Matlab -- lots of features are missing, the documentation is frustrating (it seems to only be used by universities, so "gurus" on stackoverflow etc automatically assume any question is some student trying to cheat at homework) so I'm not a fan. But if I knew Octave as well as Python, I might like it.

    R, on the other hand, has an obvious speed advantage for the problems it's aimed at, and probably a better selection of specialist libraries for statistical problems. But it's full of strange quirks for non-specialists.

    --
    If it works, it's obsolete
  132. Fortran by hooiberg · · Score: 2

    I have worked for almost a decade in scientific computing, and it is Fortran everywhere. Make sure you get up to the new standard. Contemporary Fortran is not the same as Fortran77. Many problems typically associated with Fortran are things of the past. ;) Next is C. Moreover, Fortran is a fairly easy language to learn. Avoid all object oriented stuff. For scientific computing, this is never used, and even shunned to a great degree. Avoid c++ and C# and all that stuff. When you work in SciCom, you will never see that anyway.

  133. Re:More details? (R-is pretty strong) by EngineeringStudent · · Score: 1

    Best is meaningless without a measure of goodness. (from Optimization) You are going to get a slew of candidate bests but folks aren't often going to articulate what makes each one best. There will be conflicting or even mutually exclusive rubrics.

    The goal of the language might include:
    - inexpensive (starving college student budget)
    - employable (typically used and valued in your post degree career)
    - fast enough (not every grad student needs to run on a supercomputer to get their job done)
    - great breadth and depth of libraries

    IMO the "R" language does some of these really well.
    - It imports into JMP, SAS, and Python so you can wrapper it for your job.
    - It is engineered and maintained by stats/math grad students so it is wide, deep, and mostly correct
    - It is open source so it is free

    Personally I use MatLab, which was taught in school and it hurts for the following reason:
    - where I work is JMP-dominant, so it is pulling teeth to stay in the $5k/yr CAL.
    - nobody else here speaks the language (statistically speaking) so I have to do extensive hand-holding to share the code
    - If I am not connected by VPN to the work CAL server, I can't turn on my software

    As long as I am not doing CFD I find the interpreted language is good enough. Computers today are much better than the supercomputers of 15 years ago. We have smart-phones with better CPU's than a bleeding edge Pentium II yesteryear.

    I particularly like RStudio as an IDE.
    http://www.rstudio.com/

  134. a fair point... Re:old xeon box? linux vs. xp by Fubari · · Score: 1
    That is a fair point r.e. reboot vs. data access.
    I was thinking of what could give the op a performance boost while staying on a ramen budget.
    *shrug* without knowing more it is really hard to say.

    And I would beg/buy/borrow/steal a modest SSD to run the OS on, you can probably get both for $100 or so. Keep your data sets on the slower spinning-rust drives.

    If he's going to keep the data sets on the spindles then I see no reason at all to invest in a SSD. All calculation takes place in ram, it is loaded and written to spindles... Yeah the computer will boot in 15 seconds instead of 75, but how often is this thing going to be rebooted?

  135. LabView by Hillgiant · · Score: 1

    Leave programming to the programmers. If you want to get science done, use LabView.

    --
    -
  136. C? Python? Fortran? No, no, no... by Anonymous Coward · · Score: 0

    J is your language. Not because it's easy nor fast to learn... in fact it is a language for bravehearted people, but a breath of fresh air for the mind.
    Otherwise, take the "library" approach: choose the library that will ease your task and a language (not C) you can best work with that library.

  137. Re:Depends... On the Data... by Anonymous Coward · · Score: 0

    In general for flat out speed, toss interpreted languages out (Perl, Python, Java, etc.) the door. You'll want something that compiles to machine code,

    a) Python can be compiled to machine code.
    b) Java compiles to machine code automatically with the JIT compiler that's been built into the JRE/JVM since... forever. The reason you don't use Java for scientific computation is its lack of some IEEE floating-point types/semantics.

  138. Depends on the level of flexibility you need by mitkey · · Score: 1

    If you need to be very flexible, which is typically when you are doing research from scratch -- devising/changing your algorithms often, visualizing the data, etc., I'd suggest MATLAB. It allows you to program and evaluate stuff very quickly. If you are able to vectorize the problem you are solving, it is also very fast, since it uses highly optimized vector/matrix handling libraries.

    Once you know what you want to do and how, you might want to implement your stuff in other languages, as MATLAB is cumbersome, if you for example need to process text or perform networking ... or actually do anything that cannot be vectorized. In such a case one choice would be Python, that has lots of libraries for everything.

    As for C/C++ (or even Fortran :-o), I would avoid these unless you need to address a bottleneck that cannot be solved by use of an optimized library. And even in such case, I would only rewrite the bottleneck in it, nothing more, and interface with higher-level languages. Programming in C/C++ is literally a minefield for beginners. Updating/refactoring your code in C/C++ takes much more time than in higher programming languages, as you need to take care of many issues related to low-level programming (compared to Python or Java, even C++ is a low level language). Actually I'm surprised that so many people recommend it.

  139. Use what you already know by Linzer · · Score: 1

    You did a fair amount of C? Just refresh your knowledge of C, and you'll be back to business in a few days.

    --
    Gravitation is a theory, not a fact.
  140. Re:fortran of LaTeX by MouseTheLuckyDog · · Score: 1

    Why not? It's turing complete!

  141. Re:VB is too slow for you? C++ then... by excelsior_gr · · Score: 1

    6. Throw hardware at it - seriously.

    This must be the most lame piece of advice on the whole page. If we were having this discussion back in 1990, then yes, "throwing hardware at it" could be the way to go. But now PCs are thankfully in the 64bit era so programs are allowed to allocate a lot of memory (of which there is always plenty in a modern system), SSDs offer fast I/O, your standard CPU has 4 cores or more and the GHz race has stagnated. And this is a very typical PC that you can get for less than 1000 bucks. So what is he going to do to "throw more hardware at it"? Get more memory? This is not going to help unless his program is paging. Should he get more cores or distribute the computation over a network? And he is going to do this with VB how, exactly? Oh, and he never mentioned VB, he mentioned VBA, which is interpreted. By Excel. Yikes.

    Using the right algorithm is correct, but it's not going to help much in this case. He'll get a moderate speedup whereas picking the right language will reduce the run-time by orders of magnitude.

  142. Re:fortran of LaTeX by excelsior_gr · · Score: 1

    I would love to see it done though ;-)

  143. Re:C. Obviously. by GiganticLyingMouth · · Score: 2

    It should also be noted that as of C++11 threading is part of the C++ standard library (so you usually won't have to use pthreads or any other platform-specific threads directly).

  144. Don't be silly.. by h8sg8s · · Score: 1

    Don't be silly, you can write FORTRAN in any language..

    --
    Organization? You must be joking..
  145. Learn a commercial language by Anonymous Coward · · Score: 0

    You are more likely to end up in IT than in science, so use a language that is used in IT.

  146. Scattered Opinions by Anonymous Coward · · Score: 0

    Obviously, it depends on what you're doing, but without knowing that, some general comments:

    If I were completely hands-on ignorant of all programming languages, and could pick exactly two to achieve extremely high proficiency in, they'd probably be C++ and Python. A third might be a modern Fortran.

    The C++/Python combination gets you Scipy, Numpy, and C mastery for free, and access to extensive user communities and libraries for both, including good linear algebra packages. (Adding Fortran into the mix will aid the linear algebra situation as well.) If you know C/C++ very well, and are comfortable with thinking about operations at the hardware level of whatever you're using, then you will also be able to pick up CUDA pretty quickly if you happen to find yourself working on something easily parallelizable. Your post indicates you may be interested in parallelization, but be warned that CUDA/GPU processing is not a magic wand; it comes attached to a specific hardware model which is fucking fantastic for some applications, and only so-so for others. But if you think you want to go that route, then you'll end up learning C/C++ anyway, is my guess.

    (Anyone wishing to dispute the point about CUDA is invited to survey the literature and point me to a package for sparse vector-matrix multiply that shows the same performance gain over a CPU svmm package as one can readily achieve for the dense versions. No prior assumptions over the type of sparsity are allowed. A package for tensor manipulation of arbitrary order; bonus points for efficient sparse tensor handling. Seriously, help a guy out.)

    I might also urge some caution with Fortran. I know modern versions of Fortran support recursion, but when I learned it, I learned on F77. F77 compilers did not UNIVERSALLY support recursion, and versions before that generally did not. I have very little detailed knowledge about how many modern Fortran libraries use recursion, which old ones have been upgraded to support it, etc. An expert would know. But it seems like something that could bite a fellow in the ass.

  147. depends Fortran or Posibly Python by mjwalshe · · Score: 1

    If it's really heavily compute-intensive I would say use Fortran; it's a relatively simple language to pick up compared with C++ or C.

  148. Quantifying Java performance by Latent+Heat · · Score: 1
    I tried various Java profilers and they seemed to offer too much "instrumentation overhead" for what I was doing.

    For Java, I use System.nanoTime(). For C++, I use the Windows-specific QueryPerformanceCounter() call. So

    The technique profilers claim to use is to calibrate the overhead of something like System.nanoTime() with a loop and then subtract the estimated overhead from your instrumented code. The overhead to the nanoTime() calls can even be larger than the execution time of code segments you are trying to measure, but if your estimate is accurate, this works.
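
    The calibrate-and-subtract idea above can be sketched in a few lines. The poster works in Java and C++; this version uses Python's `time.perf_counter_ns()` as a stand-in for `System.nanoTime()`, and the helper names `timer_overhead_ns` and `timed_ns` are hypothetical. Taking the median of many back-to-back clock reads is one reasonable way to estimate the overhead, not necessarily what any particular profiler does.

```python
import statistics
import time


def timer_overhead_ns(samples=10_000):
    """Estimate the cost of one back-to-back pair of clock reads."""
    deltas = []
    for _ in range(samples):
        start = time.perf_counter_ns()
        end = time.perf_counter_ns()
        deltas.append(end - start)
    # The median is less sensitive to scheduler hiccups than the mean.
    return statistics.median(deltas)


def timed_ns(func, overhead_ns):
    """Time one call of func and subtract the calibrated overhead."""
    start = time.perf_counter_ns()
    func()
    end = time.perf_counter_ns()
    # Clamp at zero: the estimate can exceed a very short measurement.
    return max(0, end - start - overhead_ns)
```

    As the comment says, the correction is only as good as the overhead estimate, so very short code segments are better timed in a loop and averaged.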

    I am doing plain add-subtract-multiply-divide in a mix of scalar and looped operations on arrays -- if you are doing largely trig or calls into a numeric library, you are timing loops and library calls, not the intrinsic performance of your language.

    I am also doing a lot of the OO version of using global variables. Java is supposed to do "escape analysis" where if you allocate inside a method and don't let a reference pass outside, the JIT is supposed to recognize that as a local-context stack allocation, but I am not using an advanced Java version or the right set of JVM flags to get that to work.

  149. two obvious options by cas2000 · · Score: 1

    1. get back in to Fortran - especially if you'll be working with other researchers on existing code.

    2. python, with the scipy and numpy libraries.

  150. Python Spyder by srobert · · Score: 1

    Python is being suggested frequently here. If you do go with that, I'd strongly suggest taking a look at the Spyder IDE:
    http://code.google.com/p/spyderlib/
    It's especially useful for scientific work and entirely cross-platform. I even have it running under FreeBSD.

  151. Re:VB is too slow for you? C++ then... by bobbied · · Score: 1

    But, my point is that if you have a properly written program, even in VB, the performance gain from recoding into C/C++ is not going to be all that great considering the effort involved. If time is money (and it usually is) then throwing hardware at the problem is a cost effective solution that has been used for decades to get less than efficient solutions to market.

    Please note the ORDER of what I suggest. Always fully evaluate your program's performance weakness and KNOW what is causing the bulk of the problem in your system. This is ALWAYS first. KNOW how your solution scales and why your performance is what it is. I'm just guessing here, but I'll be willing to BET that the issue is not his choice of tools (visual basic) but either how it was coded and/or the nature of the problem. VB is not the fastest solution for data processing out there, but it's not a total disaster in performance either.
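
    The "know what is causing the bulk of the problem" step applies in any language. As a rough illustration in Python (not the original poster's VBA), the standard-library `cProfile` module can show where the time goes. The function `slow_distance_sum` is a made-up hot spot and `profile_report` is a hypothetical helper; only `cProfile.Profile`, `runcall`, and `pstats.Stats` are real library calls.

```python
import cProfile
import io
import pstats


def slow_distance_sum(points):
    """Hypothetical hot spot: all-pairs squared distances, O(n^2)."""
    total = 0.0
    for x1, y1 in points:
        for x2, y2 in points:
            total += (x1 - x2) ** 2 + (y1 - y2) ** 2
    return total


def profile_report(func, *args, top=5):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call trees come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()
```

    Printing the report for a realistic input size shows which functions dominate the run time, and that is the information you need before deciding between a rewrite, a better algorithm, or more hardware.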

    If you are seriously suggesting that VB is orders of magnitude slower than C++, I'm going to object. (And I'm an old C programmer with decades of experience who hates VB.) If you use VB properly, it's not great, but it's not a total dog either. You should get *some* improvement but not 10X better. Further, I'm going to claim that a novice C++ programmer is extremely unlikely to be able to punch out performant C++ code that has any kind of data structures to process. So the situation is we have some performance improvements possible, but we also have a novice programmer.

    Both of these issues tell me that the least risky way to take a reasonably well written VB program and improve its performance is to throw HARDWARE at it. Throwing programming resources who don't know C++ at it to convert it is a way to spend a lot of money/time and get nothing to show for it.

    I hate throwing hardware at problems too. I've seen it done many times and it seems a waste. But I've also seen projects flounder because they were hesitant to re-spin the processor card and add that extra memory or faster processor where we spent many hours wringing out a few more bytes here and making that interrupt routine a few cycles shorter there. Of course it was really expensive to change custom hardware, so sometimes you just have to make it fit. Off the shelf hardware is CHEAP, and often it makes the most sense when you consider how much programming effort costs. I could be wrong, but in this case, I'd recommend trying to fix the VB code first, then throw hardware at it.

    --
    "File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
  152. Re:what the rest of your team .. "goooo TEAM!" by Anonymous Coward · · Score: 0

    "...seriously question the quality of your graduate program." Please elaborate; specifically, is this a simple personal opinion, or do you claim some professional authority in evaluation of graduate programms?

    The following is nothing but personal opinion -informed by thirty five years of observation of real-world technical problem solving.

    "Teams" have their uses... ;
    they are useful for 'doing', but it takes smart Individuals to figure out what needs to be done; especially when 'what' has never been done before.

    Team => implementation / production
    Individual => Creativity

    This is not to say that Individuals cannot work toward their respective goals in a mutually supportive manner...

  153. Julia is built for Scientific Computing by AeiwiMaster · · Score: 1

    Take a look at Julia. http://julialang.org/
    It is almost as fast as C, which makes it much faster than Matlab, Octave, R and Python.

  154. try Fortran 2008 by Anonymous Coward · · Score: 0

    and look into the implementation of co-arrays, functional programming, and most of the standard OOP/OOD functionality. The one thing they are talking about implementing in the 2015 standard is programming by contract. Take a look at "Scientific Software Design: The Object-Oriented Way" by Damian Rouson, Jim Xia, and Xiaofeng Xu. Much of the 2003 and 2008 standards are implemented in gcc-4.8.0 and 4.8.1. You can also look at the features and compatibility charts:

    http://fortranwiki.org/fortran/show/Fortran+2008
    http://fortranwiki.org/fortran/show/Fortran+2008+status
    http://fortranwiki.org/fortran/show/Fortran+2003
    http://fortranwiki.org/fortran/show/Fortran+2003+status

  155. Java by maroberts · · Score: 1

    Reasonable performance - better than Python, Perl, PHP, not much worse than C/C++ or Fortran
    Object oriented, readable and easy to learn quickly.
    Modern day language
    Widely understood in Educational field.
    Can test your code on your Android phone :-)

    --

    Donte Alistair Anderson Roberts - hi son!
    Karma: Chameleon

  156. For earth science by Anonymous Coward · · Score: 0

    Fortran, plus some python.

  157. Re:VB is too slow for you? C++ then... by confused+one · · Score: 1

    he said he was using VBA, which is not fully compiled code and is probably a set of Excel macros. VBA in Excel isn't particularly fast and is single threaded. If, however, he moves his code to one of the compiled versions of VB then he will see a performance boost and be able to spawn multiple threads.

  158. Mathematica by Anonymous Coward · · Score: 0

    MATLAB and Maple are not worth paying for. You can use free alternatives, but if you are going to be paying, pay for Mathematica.

  159. Analog patch panels by Anonymous Coward · · Score: 0

    Last post!

  160. Fortran77? by Anonymous Coward · · Score: 0

    Are you a Neandertal? Current Fortran standard is from 2008, and there are at least another two since F77. At least if you're trying it for performance use the latest version, or are you still using Linux 1.x.x or Windows 2?

  161. use what others in your field use by Anonymous Coward · · Score: 0

    use what others in your field use. or better yet: https://en.wikipedia.org/wiki/Root.cern C++ interpreted and compiled. It's great.

  162. Python along with C/C++ or FORTRAN by amicitas · · Score: 1

    I do scientific programming for a living (Fusion Scientist) and have extensively used a lot of different languages in my research including:
    Python
    IDL
    MATLAB
    FORTRAN
    C/C++
    Ruby

    When working on my own project my favorite setup is to use Python along with Scipy/Numpy. When I need extra speed I use Cython, and also use Cython to interface with libraries that are written in C or C++. For interaction with my codes I use ipython (assuming I need command line interaction) or QT (assuming I need GUI interaction). For libraries written in FORTRAN I use f2py. For plotting I use matplotlib.

    This setup works very well for me. It is fast and powerful, almost completely platform independent, and has excellent mathematical and scientific library support. It is extremely easy to integrate C/C++ or FORTRAN code into a Python project which can be extremely useful. It is also very straightforward to do basic multithreading and parallelization. Interactive debugging is very easy and can really help both in the development and in finding problems with scientific calculations. Plotting support is fantastic and easy.
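
    As a rough idea of what basic parallelization looks like in Python, here is a minimal sketch using the standard-library `concurrent.futures` module to spread independent chunks of work over four workers (matching the four cores in the original question). The functions `sum_of_squares`, `split`, and `parallel_sum_of_squares` are made up for illustration, and the sketch assumes the work splits into independent pieces.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def sum_of_squares(chunk):
    """CPU-bound work on one independent slice of the data."""
    return sum(x * x for x in chunk)


def split(data, parts):
    """Cut data into at most `parts` nearly equal contiguous slices."""
    size = max(1, -(-len(data) // parts))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_sum_of_squares(data, workers=4, executor_cls=ProcessPoolExecutor):
    """Map chunks across workers, then combine the partial sums.

    ProcessPoolExecutor uses separate processes, so CPU-bound chunks can
    run on different cores. ThreadPoolExecutor is a drop-in alternative
    for quick experiments, but threads share one interpreter lock.
    """
    with executor_cls(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, split(data, workers)))
```

    On Windows, code that starts a process pool should be called from under an `if __name__ == "__main__":` guard, because worker processes re-import the main module.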

    I would say that the next best option is MATLAB. This has good support, an excellent mathematical and scientific library and good plotting tools. I don't particularly like the language for large and complicated projects, and it does require a license, which can make it difficult or impossible to share codes between institutions.

    Working directly in C/C++ or FORTRAN is fine for certain kinds of large projects, but is inconvenient for working on lots of small projects or numerous related calculations. Doing something simple like creating a plot requires a significant amount of programming, and debugging can be very time consuming.

    I would stay away from IDL; while a nice language in many respects, it is quite out of date at this time and is no longer well supported in terms of staying current. Ruby does not have sufficient support in terms of math/science/plotting libraries at this point to work well for scientific programming.


    At the end though it does really matter what kinds of projects you will be working with and what your final goals are. It also matters who else you will be working with to make sure that code can be easily shared.

  163. Obviously by Anonymous Coward · · Score: 0

    Haskell

  164. Hire someone by Anonymous Coward · · Score: 0

    If you can't code it yourself, then hire someone. Having said that, if you want speed, C is close to your best bet, although you could argue that Fortran and Lisp are also good choices. They are all quite old languages. They can all be optimized heavily. Notice I didn't say anything about python or java or anything with a ++ in it. Python and Java are interpreted languages. They run slow because they can't be compiled into native assembly that the computer runs natively. Yelp about Java byte code all you want, it's still dog slow compared to a language like C. Oh, and there is no good reason to use C++. Numerical recipes don't benefit from object orientation, and may suffer from it. C is more deterministic. The single best thing you can do though, is use the fastest algorithms available. Languages aside, algorithms win more than anything else. Compiler or not can't compare to algorithms. High clock speeds can't compare to algorithms. Many years ago I wrote a program in a language called REXX (an interpreted language), to compute exponents to very large values. Example: 123456789.123456789 ^ 123456789.123456789. (Yes, a 9 digit -pre-decimal- exponent). On a 40 MHz cpu with 2 MB of ram, it would give the (correct!) answer in about 1 second, with a good algorithm. With a crappy algorithm, just doing the non-decimal part of the exponent on a quad-core 2.66 GHz processor could take years (like multiplying a value by 123456789.123456789 in a simple loop 123456789 times: stupid).
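
    The poster does not say which algorithm the REXX program used. As one standard illustration of the point, the sketch below compares the simple multiplication loop with exponentiation by squaring for an integer exponent, counting multiplications in each. The function names are hypothetical, and the non-integer part of the exponent is not handled.

```python
def power_naive(base, exponent):
    """Multiply in a loop: `exponent` multiplications."""
    result, mults = 1, 0
    for _ in range(exponent):
        result *= base
        mults += 1
    return result, mults


def power_by_squaring(base, exponent):
    """Square-and-multiply: roughly 2 * log2(exponent) multiplications."""
    result, mults = 1, 0
    while exponent > 0:
        if exponent & 1:   # low bit set: fold the current power into result
            result *= base
            mults += 1
        exponent >>= 1
        if exponent:       # square only while more bits remain
            base *= base
            mults += 1
    return result, mults
```

    For an exponent of 123456789, which fits in 27 bits, the second version needs a few dozen multiplications where the loop needs over a hundred million. A gap like that is the poster's point: the algorithm matters far more than the language or the clock speed.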

  165. Perl works great for my project by Anonymous Coward · · Score: 0

    I've spent a few man hours using Perl to create my truss app, works great. I was a little worried about matrix manipulation but sure enough there is a module for everything. I've been using Perl since 1998.
    My app is hosted here: http://design.medeek.com/calculator/calculator.pl

  166. Re:VB is too slow for you? C++ then... by Anonymous Coward · · Score: 0

    VB != VBA

  167. try labview by Anonymous Coward · · Score: 0

    try labview. It's designed for scientists who want to write software but don't care about how computers work. Also, you can use multiple cores with very little effort. Downside: it costs money. But I think it's quite cheap for students, and evaluation is free.

  168. Re:VB is too slow for you? C++ then... by excelsior_gr · · Score: 1

    Read the summary and my last comment again, carefully, and the comment by user "confused one" right below. He is using VBA not VB. Visual Basic for Applications. Not the same thing. And yes, a re-write from VBA in any compiled language will get him at least a speedup of one order of magnitude, maybe two. Add to this your advice on algorithms and he won't know what hit him.

    It's not that I don't like updating the hardware because of cost or whatever, it's that I can't add more hardware any more. For a single threaded application in VBA there is nothing you can do to make it faster anymore hardware-wise. Again, algorithmic optimization is to the point but "more hardware" isn't. Even massively parallel applications hit a wall at some point where the data distribution and communication costs start to outweigh the speedup you get by the extra processors. BTW, I found out that VB can also be parallelized natively as well, but the re-write alone from VBA to VB will do the trick.
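
    The "wall" described above can be pictured with a toy model: Amdahl's law plus a coordination cost that grows with the number of workers. This is a sketch under stated assumptions, not a measurement. The linear overhead term and the name `modeled_speedup` are invented for illustration, and real communication costs depend on the application.

```python
def modeled_speedup(workers, parallel_fraction, overhead_per_worker=0.0):
    """Amdahl's law plus a hypothetical linear coordination cost.

    Run time is normalised so that one worker takes 1.0. The overhead
    term is an assumption for illustration, not a measured quantity.
    """
    serial = 1.0 - parallel_fraction
    parallel = parallel_fraction / workers
    overhead = overhead_per_worker * (workers - 1)
    return 1.0 / (serial + parallel + overhead)
```

    With 95% of the work parallelizable and an assumed overhead of 0.01 per extra worker, the modeled speedup is about 3.1 at 4 workers, about 3.9 at 16, and falls to about 1.4 at 64. With zero overhead the function reduces to plain Amdahl's law.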

  169. code by Anonymous Coward · · Score: 0

    Assembly

  170. Design problem by Anonymous Coward · · Score: 0

    A different language will not make bad design decisions go away. I have at times written things in a sloppy but expedient way, then gone back and spent the time to re-write it with some thought and it will perform 100's of times faster.

    I would go so far as to say that proper algorithm design will make a larger impact than a change to any of the languages listed here. Even in the case where a compiler does a particularly bad job at one or more types of operations, these tend to be well known and work-arounds exist.
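
    A small, common example of this kind of design fix is replacing repeated list scans with a hash-based lookup. The sketch below is illustrative only (the function names are made up and it is not taken from the original poster's code); both functions return the same result, but the first does work proportional to the product of the two lengths.

```python
def common_items_quadratic(a, b):
    """O(len(a) * len(b)): each `in` test scans the whole list b."""
    return [x for x in a if x in b]


def common_items_linear(a, b):
    """O(len(a) + len(b)): build a hash set once, then probe it."""
    lookup = set(b)
    return [x for x in a if x in lookup]
```

    On inputs with tens of thousands of items, the gap between the two can easily be larger than the gap between an interpreted and a compiled language. It requires the items to be hashable.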

    Maybe the solution is to have someone from CS come review your code? Maybe they can sort out the problem in an afternoon and save you the hassle of re-writing the thing.

  171. Re:C. Obviously. by Teckla · · Score: 1

    You know C. C is simple, as fast as any alternative, it's straightforward to optimize (aside from pointer abuse), and you always know what the compiler/runtime is doing. And threading libraries like pthreads or CUDA are best served via C/C++. Why use anything else?

    This is just nonsense, and to see it constantly repeated and modded up is just sad.

    C is only simple in the same way a written alphabet with only two letters is simple: sure, you only have to remember the letters A and B (simple!), but actually using it is not simple.

    For crying out loud, in C, you can't even do A = B + C; without having a very good chance of invoking undefined behavior. Why? Because in C, overflow or underflow on signed values has undefined behavior!

    Access beyond the end of an array and damage data elsewhere in the system (making it often really hard to find)? No problem!

    Laboriously managing your own memory (and probably leaking it)? No problem!

    What, real strings? Heck no, real men like to take the risk of overflowing the strings and their buffers!

    C is filled with literally hundreds of mine fields just waiting to trap the unwary, and often forces you to write a lot of code that would only be a few lines in a higher level language.

    C is not simple to use. C is not simple to use.

  172. Javascript/NodeJS: speed is overrated/pointless by fygment · · Score: 1

    Javascript is the language of the UI of the future: the browser.
    Javascript is fast enough: within about 2x of C/C++ run time (on par or faster for some tasks).
    Javascript is ... lacking libraries BUT it can call C/C++ routines or the latter can be converted to JS/NodeJS using Emscripten/LLVM.

    Speed only makes sense in real-time apps, say day-trading or control systems. If you are crunching numbers, what difference does an hour or a day or a week make? Seriously, if you are sitting around idle while your numbers are crunching, you are a waste of space. You should be:
    planning the next experiment,
    reviewing/refactoring your code for correctness and efficiency,
    writing the paper in which your results will be published (esp. intro, background, experimental set-up/procedure),
    setting up the website on which you will publish the pre-print so others can review your work and maybe prevent you from publishing foolishness,
    fleshing out your next steps to follow your results,
    thinking of your next great hypothesis.

    Screw speed, it will come with faster processors. Code, make your code correct, and make it accessible and shareable. Javascript/Nodejs will do that in spades.

    --
    "Consensus" in science is _always_ a political construct.