Slashdot Mirror


Julia Language Seeks To Be the C For Numerical Computing

concealment writes in with an interview with a creator of the (fairly) new language Julia designed for number crunching. Quoting Infoworld: "InfoWorld: When you say technical computing, to what type of applications are you specifically referring? Karpinski: It's a broad category, but it's pretty much anything that involves a lot of number-crunching. In my own background, I've done a lot of linear algebra but a fair amount of statistics as well. The tool of choice for linear algebra tends to be Matlab. The tool of choice for statistics tends to be R, and I've used both of those a great deal. But they're not really interchangeable. If you want to do statistics in Matlab, it's frustrating. If you want to do linear algebra in R, it's frustrating. InfoWorld: So you developed Julia with the intent to make it easier to build technical applications? Karpinski: Yes. The idea is that it should be extremely high productivity. To that end, it's a dynamic language, so it's relatively easy to program, and it's got a very simple programming model. But it has extremely high performance, which cuts out [the need for] a third language [C], which is often [used] to get performance in any of these other languages. I should also mention NumPy, which is a contender for these areas. For Matlab, R, and NumPy, for all of these options, you need to at some point drop down into C to get performance. One of our goals explicitly is to have sufficiently good performance in Julia that you'd never have to drop down into C." The language implementation is licensed under the GPL. Lambda the Ultimate has a bit of commentary on the language, and an R programmer gives his two cents on the language.

27 of 204 comments

  1. Sage by donaggie03 · · Score: 5, Informative

    I use Sage quite a bit. It's basically a wrapper for almost all the mathematics software available. http://www.sagemath.org/ While you still need to drop down to C for great performance, it solves a lot of the interoperability issues discussed. In other words, take the example from the summary: from Sage, you can call Matlab commands and then immediately use the results with R commands. Sage works through a web browser, and it's based on Python, which is a plus.

    --
    Three days from now?? Thats tomorrow!! ~Peter Griffin
  2. Not This Again by eldavojohn · · Score: 3, Informative
    You know, I bet that if you spent as much time learning new languages as you did bitching about duplication in Dart, Julia, $NEW_LANGUAGE then you'd have a pretty powerful array of tools to use in programming. From one of the authors of Machine Learning for Hackers' blog:

    In my opinion, the new code in Julia is easier to read than the R code because Julia has fewer syntactic quirks than R. More importantly, the Julia code runs much faster than the R code without any real effort put into speed optimization. For the sample text I tried to decipher, the Julia code completes 50,000 iterations of the sampler in 51 seconds, while the R code completes the same 50,000 iterations in 67 minutes, making the R code more than 75 times slower than the Julia code.

    That certainly caught my attention!
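
The "more than 75 times slower" figure follows directly from the two timings quoted above. A minimal arithmetic check, using only those numbers (the variable names are illustrative, not from the blog post):

```python
# Timings quoted in the blog excerpt: 51 seconds for Julia, 67 minutes for R.
julia_seconds = 51
r_seconds = 67 * 60  # 4020 seconds

# Ratio of R runtime to Julia runtime for the same 50,000 iterations.
ratio = r_seconds / julia_seconds

print(f"R / Julia runtime ratio: {ratio:.1f}")
```

The ratio comes out a little under 79, which is consistent with the "more than 75" claim.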

    The XKCD comic you cite is correct for some standards but software languages are much more complex than standards and, in fact, many of them implement common sets of core standards. Once you get specific enough, you're not talking about a standard but rather a specific implementation of how to accomplish something.

    --
    My work here is dung.
    1. Re:Not This Again by AchilleTalon · · Score: 3, Insightful

      Anyway, this language isn't for you. That's a specialized language for numerically intensive applications and if you have a clue about what it is, you would find this language very easy to learn. It is almost the same as all the tools (Matlab/Octave) currently used in these fields and for teaching. You aren't expected to write a Web browser in Julia.

      The performance/ease of use is the appropriate balance in this field. Usable in almost no time.

      --
      Achille Talon
      Hop!
  3. FORTRAN? by TWX · · Score: 5, Interesting

    From wikipedia: "FORTRAN is a general-purpose, procedural, imperative programming language that is especially suited to numeric computation and scientific computing." Sounds to me like unless there's a particular weakness in FORTRAN that doesn't lend itself to workarounds or repair in newer versions of the language, there's already a numeric computation and scientific programming language that's well documented, mature, and widely distributed.

    --
    Do not look into laser with remaining eye.
    1. Re:FORTRAN? by Anonymous Coward · · Score: 5, Interesting

      Second that. Modern FORTRAN kicks some serious butt and has a huge user and support base. Language snobs dismiss it as antiquated but they're usually referring to versions of the language that haven't been used since the 1980's. There are good reasons that current HPC developers use mostly FORTRAN and C, like good support for parallelization, global memory functions for clustering, and efficient compilers.

      It's great that people make new tools and share them with others, but many times that effort could be put into making existing good tools even better.

    2. Re:FORTRAN? by Anonymous Coward · · Score: 3, Insightful

      From wikipedia: "FORTRAN is a general-purpose, procedural, imperative programming language that is especially suited to numeric computation and scientific computing."

      Sounds to me like unless there's a particular weakness in FORTRAN that doesn't lend itself to workarounds or repair in newer versions of the language, there's already a numeric computation and scientific programming language that's well documented, mature, and widely distributed.

      Yes, you have to be concerned when someone talks about numerical computing and mentions C but skips FORTRAN.

      The GNU FORTRAN compiler is quite good (and free), but the $$$ compilers (such as the Intel FORTRAN compiler) are needed to get the speed you need to outperform other languages. Simply put, it costs serious money to do high-end professional FORTRAN development - something that is hard to take if you are coming from a background where compilers are free.

    3. Re:FORTRAN? by Anonymous Coward · · Score: 3, Insightful

      People use FORTRAN largely because it's tried and tested. We know how to code it, we know the quirks, we know how to cope with rounding issues and so forth. Similarly C is a powerhouse because fundamentally it's a simple language and it's very, very well understood. That counts a lot more for speed, in many cases. Nowadays with vast parallelisation becoming cheap, it is generally better to stick with code that you know works and trade off individual thread efficiency for strength in numbers.

      Note there's a subtle difference between people using legacy code because they're lazy and the infrastructure is there and people using legacy code because they know it works, even if there is a potentially better alternative.

      Air traffic control in the UK still runs on computers using Pentium hardware and probably won't change for at least another 10 years because when you're developing critical systems you need to know that what you're doing is robust.

    4. Re:FORTRAN? by Bill_the_Engineer · · Score: 3, Informative

      Yes, and it is used extensively. There are even C libraries that include FORTRAN code to do numerical work quickly.

      --
      These comments are my own and do not necessarily reflect the views or opinions of my employer or colleagues...
    5. Re:FORTRAN? by ahoffer0 · · Score: 4, Interesting

      I attended SC11 (sc11.supercomputing.org) last year. FORTRAN is still the workhorse of (large-scale) numerical computing. C/C++ are popular. So are MATLAB and R. There was even a NumPy tutorial and some sessions on emerging languages like Chapel. But FORTRAN was king.

      I thought this was an interesting thread about FORTRAN v. C -- http://www.physicsforums.com/showthread.php?t=169974

      Off-topic: When it came to programming, the general drift of the conference was not toward new languages, but toward adding meta-information via compiler directives.

    6. Re:FORTRAN? by gl4ss · · Score: 4, Informative

      well.. apparently this "language" has modern fortran built in.

      just open the link. there's some stats there. I don't envision huge popularity for this though.. unless he integrates/develops it into a full-blown Matlab competitor (that javascript does mandelbrot almost as fast is kind of peculiar though..). not as peculiar as "pi sum" bench being faster on javascript and julia than c++.

      "The library, mostly written in Julia itself, also integrates mature, best-of-breed C and Fortran libraries for linear algebra, random number generation, FFTs, and string processing. "
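
For readers who have not seen the "pi sum" benchmark mentioned above: it is a tight floating-point loop over a partial sum of 1/k^2, which converges to pi^2/6. The sketch below is a Python rendering under assumptions; the repeat count (500) and term count (10000) are recalled from Julia's published micro-benchmarks rather than taken from this thread, and the function names are made up for illustration.

```python
import math


def pisum_loop(repeats=500, terms=10000):
    """Recompute the partial sum of 1/k^2 `repeats` times; return the last total."""
    total = 0.0
    for _ in range(repeats):
        total = 0.0
        for k in range(1, terms + 1):
            total += 1.0 / (k * k)
    return total


def pisum_builtin(terms=10000):
    """Same partial sum, delegated to the built-in sum() over a generator."""
    return sum(1.0 / (k * k) for k in range(1, terms + 1))


if __name__ == "__main__":
    # The partial sum approaches pi^2 / 6 from below as `terms` grows.
    print(pisum_loop(), pisum_builtin(), math.pi ** 2 / 6)
```

A loop this small mostly measures per-iteration interpreter or JIT overhead, which may be why the rankings on it look odd next to heavier benchmarks.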

      --
      world was created 5 seconds before this post as it is.
    7. Re:FORTRAN? by plopez · · Score: 4, Informative

      Ummmm..... Fortran has supported free form format since ummmm..... 1993. All modern Fortran compilers I know of support both the old and the new formats. Cast off your old prejudices and learn what a modern programming language Fortran '08 really is.

      BTW, I believe using the spelling "FORTRAN" has been deprecated since the 70's. It is now "Fortran".

      --
      putting the 'B' in LGBTQ+
    8. Re:FORTRAN? by MeanGene · · Score: 3, Informative

      Hey - at least Fortran does not rely on whitespace!

  4. Re:The "C" for some field? by BenoitRen · · Score: 3, Insightful

    As well as maintainability, readability, etc.

    You're funny. Both of these depend largely on the programmer writing the code.

  5. It's the packages stupid! by Hatta · · Score: 4, Interesting

    What will make or break this language is the availability of addon packages for it. A lot of people who use R don't do much coding themselves. They read in data, preprocess it a little bit, and then apply one of the packages found in CRAN.

    CRAN is like CPAN, but for R instead of Perl. And we can expect similar behavior from them. Perl probably wouldn't be anyone's first choice for a project these days, but the size and scope of CPAN makes it really really easy to benefit from the work of others. This is a lot of inertia, and a big reason why Perl is still used when newer languages have significant advantages.

    There's so much software, particularly academic software, implemented in R that I just don't see it going away. e.g. the entire Bioconductor suite is implemented in R. Just about any bioinformatics paper you pick at random will refer to, if not contain, R code.

    How much work are we going to have to reimplement if we want everyone to use the one true numerical programming language? And if we don't want that, isn't it just contributing to fragmentation?

    --
    Give me Classic Slashdot or give me death!
    1. Re:It's the packages stupid! by danfromsb · · Score: 3, Insightful

      Absolutely right. It is important to recognize that both Matlab and R are much more than just languages. I would also throw Mathematica into the mix too, while it is a bit slower than Matlab, its numerical capabilities have continued to grow and it incorporates a fine statistics package alongside a quality plotting and graphics package (not to mention its symbolic roots and recent introduction of dynamic gui manipulation).

      For julia to be successful it needs robust integration with quality addon packages, starting with graphics and plotting. It also needs good documentation. One thing that annoys me to no end with Python (and numpy, scipy, pylab, matplotlib) is that you have to look at 3 or 4 different websites to look up API and examples. In my mind Mathematica does this right: a single documentation library which incorporates API reference, tutorials, and common functions grouped together. At the bottom of every page it lists related functions and tutorials so it is easy to discover new API calls in the language.

  6. Numerical Python by Dr.+Tom · · Score: 5, Informative

    Robust, mature, fast, easy to use, side-by-side with .m it wins hands down, really no comparison, use Python.
    Cython if you need to make it faster for the 5% of code that is too slow.

    import numpy
    import pylab
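
As a concrete illustration of the comment above, here is a minimal NumPy sketch covering the two areas named in the story summary, linear algebra and statistics. The matrix and the sample values are made up for the example; only standard NumPy calls (`numpy.linalg.solve`, `ndarray.mean`, `ndarray.std`) are used.

```python
import numpy as np

# Linear algebra: solve the system A x = b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)

# Statistics: mean and sample standard deviation (ddof=1) of a small data set.
samples = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
mean = samples.mean()
std = samples.std(ddof=1)

print(x, mean, std)
```

Both halves run in the same session on the same array type, which is the interoperability point the summary raises about Matlab versus R.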

  7. Re:The "C" for some field? by Lunix+Nutcase · · Score: 4, Insightful

    If you aren't writing readable and maintainable code in C that's the programmer's fault. You aren't forced to use obscure abbreviations, bizarre inline hacks, etc. There is plenty of readable and well-maintained code that has been running non-stop longer than your modern languages have existed.

  8. Re:Version 3f670da0 by ifrag · · Score: 4, Funny

    What is that, a hash of the source code?

    Careful... wouldn't want to give the Mozilla devs any ideas.

    --
    Fear is the mind killer.
  9. APL? by crow · · Score: 3, Insightful

    The other obvious language to come to mind is APL. Anyone looking to write a numerical processing language should have some APL experience.

    Yes, it is a pain to learn all the symbols. Programs are incredibly dense, making them difficult to understand and debug, but there are also a lot of cool things you can do with the language. In building a new language, there's a lot of good stuff there to incorporate.

  10. Re:The "C" for some field? by arth1 · · Score: 3, Informative

    Never seen a successful language named after a person. Probably never will.

    There aren't that many, and the few there are seem to have been fairly successful. Off the top of my head, I can think of Pascal, Ada, Eiffel, Haskell and Ruby.

    At least Pascal had a huge impact. p-code was the frontrunner for bytecode. I'd say it dwindled because of Borland who played fast and loose with the standards, reducing its main strength of being incredibly strict and compatible for a visual IDE and proprietary extensions (Delphi). This gave it a short term flare, but probably helped kill it in the long run.

  11. Re:Fortress by dougmc · · Score: 4, Informative

    The weakness of FORTRAN is that it entirely misses out of 50+ years of research and innovation in programming languages.

    OK, maybe the original version of Fortran, the one made 50+ years ago, missed out on "50+ years of research and innovation in programming languages", but you are aware that Fortran has been updated since then, right?

    Fortran now includes a great number of the improvements to programming languages made since then. But don't take my word for it -- check out Wikipedia's page on it. I picked Fortran 90 as a starting point, but there's been many versions of Fortran made since the first, with new features (often coming from other languages) being added all the time.

    And not only is Fortran still being actively developed, but the library of well tested and optimized numerical computing code already written in it is massive.

    I'm not saying that there's not room for a new language, and certainly, Fortran doesn't have all the features of some new languages, but your claim that Fortran "entirely misses out of 50+ years of research and innovation in programming languages" is completely and utterly wrong.

    I should also mention that they stopped calling it FORTRAN in all caps back in 1990 or so when Fortran 90 came out. Now it's just Fortran. But even the venerable FORTRAN 77 benefited greatly from programming language developments available at the time.

  12. Re:The "C" for some field? by jythie · · Score: 3, Interesting

    It could be argued that Ada not only had a huge effect (many language features we now think of as standard came from Ada), but is still in use in many places, so I would call it pretty successful.

  13. Re:The "C" for some field? by swan5566 · · Score: 5, Insightful

    The problem is that a lot of researchers and scientists who write these things aren't trained in good programming practices, and most of the good programmers don't have the background to do a lot of the advanced math stuff properly.

    --
    In debates about Christianity, there are two groups: those looking for answers, and those looking to just ask questions.
  14. Re:The "C" for some field? by Vegemeister · · Score: 3, Insightful

    The problem with C++ is that it has too many features, and too many ways to do the same thing. You can write complex application programs in C++ while only knowing a small subset of the language. The problem occurs when someone comes along to maintain your code and knows a completely different subset.

  15. Not fast at all by loufoque · · Score: 3, Informative

    They want to design a language for speed, but they already made choices in the language that hamper speed dramatically, like dynamic typing. Dynamic typing adds overhead to every function call; it's fine if your functions do a lot of work, not so much if they do relatively little and are called very often.
    It looks like if you want to write fairly low-level code, you'll still need to write it in C there...

    It also looks like their approach to parallelization is very heavy-weight and, albeit usable in clusters, it will yield both poor scalability on large systems and poor performance on simple multi-core systems.

    There is already a high-level, dynamic and accessible language for numerical computing, it's MATLAB. It wraps a lot of high-performance libraries, using them without the user even noticing it. Code in MATLAB can easily be faster than in C for some constructs because C compilers, unlike MATLAB, do not recognize some patterns and replace them by optimized library calls. For this reason, MATLAB is great when you're coding with high-level constructs, but suffers from poor performance when using low-level constructs (such as accessing data element by element) for the same reasons as pointed out above.

    A new language for high-performance numerical computing should allow both the high-level programming of MATLAB and the possibilities of a low-level statically compiled language like C. The best contender for this is C++, which has tons of high-level and fast libraries for transcendental functions, linear algebra, statistics, image processing, signal processing, etc.

    As for FORTRAN, it's great for writing one thing well and fast, but it doesn't have any mechanisms for more high-level programming or code re-use, which means it is annoying to maintain, extend, or to even guarantee consistencies between the different subroutines of a large application. It also relies a lot more on what the compiler will do, while with C/C++ there is more control on what happens with regards to vectorization, parallelization or data transfers, which can be critical for heterogeneous systems.

    1. Re:Not fast at all by Baron+von+Leezard · · Score: 3, Insightful

      Dynamic typing doesn't add any overhead when you can determine which specific method you need when generating code — which, in a dynamic language with a JIT, is very late, meaning that you can most of the time. Julia uses tons of small method definitions that call other small methods and so on, even for basic things like adding two integers, but the compiler is smart enough to compile addition into a single machine instruction. The notion that dynamic languages are slow because of their dynamism is very outdated in light of modern compiler techniques.
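
The argument in this sub-thread, per-call type lookup versus resolving the method once when the types are known, can be shown with a toy model. This is only an analogy in Python and not a description of how Julia's compiler works; the method table and the names `add_dynamic` and `specialise` are hypothetical.

```python
from typing import Callable, Dict, Tuple

# Hypothetical method table: argument types -> specialised implementation.
METHODS: Dict[Tuple[type, type], Callable] = {
    (int, int): lambda a, b: a + b,
    (float, float): lambda a, b: a + b,
    (str, str): lambda a, b: a + " " + b,
}


def add_dynamic(a, b):
    """Look the method up on every call: the overhead the parent comment describes."""
    return METHODS[(type(a), type(b))](a, b)


def specialise(type_a: type, type_b: type) -> Callable:
    """Resolve the lookup once, as a compiler can when argument types are known."""
    return METHODS[(type_a, type_b)]


# Resolved ahead of the hot loop; later calls skip the table lookup.
add_ints = specialise(int, int)

print(add_dynamic(2, 3), add_ints(2, 3), add_dynamic("dynamic", "dispatch"))
```

When the argument types at a call site can be inferred, the second form is all that remains at run time, which is the reply's point about late code generation.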

  16. Re:The "C" for some field? by jedwidz · · Score: 5, Funny

    Then again, most of it is written by biologists.

    You mean 'evolved by biologists'. They aren't strong believers in intelligent design.