Julia Language Seeks To Be the C For Numerical Computing
concealment writes in with an interview with a creator of the (fairly) new language Julia, designed for number crunching. Quoting InfoWorld:
"InfoWorld: When you say technical computing, to what type of applications are you specifically referring?
Karpinski: It's a broad category, but it's pretty much anything that involves a lot of number-crunching. In my own background, I've done a lot of linear algebra but a fair amount of statistics as well. The tool of choice for linear algebra tends to be Matlab. The tool of choice for statistics tends to be R, and I've used both of those a great deal. But they're not really interchangeable. If you want to do statistics in Matlab, it's frustrating. If you want to do linear algebra in R, it's frustrating.
InfoWorld: So you developed Julia with the intent to make it easier to build technical applications?
Karpinski: Yes. The idea is that it should be extremely high productivity. To that end, it's a dynamic language, so it's relatively easy to program, and it's got a very simple programming model. But it has extremely high performance, which cuts out [the need for] a third language [C], which is often [used] to get performance in any of these other languages. I should also mention NumPy, which is a contender for these areas. For Matlab, R, and NumPy, for all of these options, you need to at some point drop down into C to get performance. One of our goals explicitly is to have sufficiently good performance in Julia that you'd never have to drop down into C."
The language implementation is licensed under the GPL. Lambda the Ultimate has a bit of commentary on the language, and an R programmer gives his two cents on the language.
You mean, ignored by almost every developer in the field in lieu of more "business-friendly" languages that add bloat and inefficiency?
Why does the Oblig always miss the <a> tag? Even in Chrome, select + goto takes more time than a simple click.
Slashdot, fix the reply notifications... You won't get away with it...
I use Sage quite a bit. It's basically a wrapper for almost all the mathematics software available. http://www.sagemath.org/ While you still need to drop down to C for great performance, it solves a lot of the interoperability issues discussed. In other words, take the example from the summary: from Sage, you can call Matlab commands and then immediately use the results with R commands. Sage works through a web browser, and it's based on Python, which is a plus.
Three days from now?? Thats tomorrow!! ~Peter Griffin
In my opinion, the new code in Julia is easier to read than the R code because Julia has fewer syntactic quirks than R. More importantly, the Julia code runs much faster than the R code without any real effort put into speed optimization. For the sample text I tried to decipher, the Julia code completes 50,000 iterations of the sampler in 51 seconds, while the R code completes the same 50,000 iterations in 67 minutes, making the R code more than 75 times slower than the Julia code.
That certainly caught my attention!
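The quoted speedup can be checked with quick arithmetic. This is a minimal Python sketch that uses only the two timings from the quote above; the variable names are made up for illustration.

```python
# Timings quoted above for 50,000 iterations of the sampler.
julia_seconds = 51
r_seconds = 67 * 60  # 67 minutes expressed in seconds

# Ratio of the R runtime to the Julia runtime.
speedup = r_seconds / julia_seconds
print(f"R took about {speedup:.1f} times as long as Julia")
```

The ratio comes out a little under 79, which is consistent with the "more than 75 times" claim.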
The XKCD comic you cite is correct for some standards but software languages are much more complex than standards and, in fact, many of them implement common sets of core standards. Once you get specific enough, you're not talking about a standard but rather a specific implementation of how to accomplish something.
My work here is dung.
What is that, a hash of the source code?
From wikipedia: "FORTRAN is a general-purpose, procedural, imperative programming language that is especially suited to numeric computation and scientific computing." Sounds to me like unless there's a particular weakness in FORTRAN that doesn't lend itself to workarounds or repair in newer versions of the language, there's already a numeric computation and scientific programming language that's well documented, mature, and widely distributed.
Do not look into laser with remaining eye.
What will make or break this language is the availability of addon packages for it. A lot of people who use R don't do much coding themselves. They read in data, preprocess it a little bit, and then apply one of the packages found in CRAN.
CRAN is like CPAN, but for R instead of Perl. And we can expect similar behavior from them. Perl probably wouldn't be anyone's first choice for a project these days, but the size and scope of CPAN makes it really really easy to benefit from the work of others. This is a lot of inertia, and a big reason why Perl is still used when newer languages have significant advantages.
There's so much software, particularly academic software, implemented in R that I just don't see it going away. For example, the entire Bioconductor suite is implemented in R. Just about any bioinformatics paper you pick at random will refer to, if not contain, R code.
How much work are we going to have to reimplement if we want everyone to use the one true numerical programming language? And if we don't want that, isn't it just contributing to fragmentation?
Give me Classic Slashdot or give me death!
How can they expect me to commit to this new language unless I have seen what kind of facial hair they have? I want huge beards on their front page.
Robust, mature, fast, easy to use, side-by-side with .m it wins hands down, really no comparison, use Python.
Cython if you need to make it faster for the 5% of code that is too slow.
import numpy
import pylab
Beard Ref
Also is this named after the same Julia that worked with fractals? Julia Ref
He effected a bored affect.
This may seem petty, but one of the biggest sources of relief to me in changing from Matlab to R and Numpy was finally leaving behind that damned operator syntax where element-wise operations need to have an extra dot prepended. That is to say, if I have an array t of times and an array x of distances, I want to be able to get the corresponding array of speeds using x / t. In Matlab and Julia I must instead use x ./ t.
It seems like no big deal, but it is unbelievable how many Matlab bugs I wrote due to that little difference. True linear algebraic operations are so rare, at least to me, that I am far happier giving them the special operators and reserving the usual operators to work element-wise.
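For readers who have not used both conventions, here is a small NumPy sketch of the behaviour described above. The sample values are made up, and the `@` operator shown for the linear-algebra case is a later addition to Python (3.5 and up); `np.dot` is the older spelling.

```python
import numpy as np

# Hypothetical sample data: distances and the times taken to cover them.
x = np.array([10.0, 20.0, 30.0])
t = np.array([2.0, 4.0, 5.0])

# In NumPy the ordinary operators work element by element.
speeds = x / t      # array([5., 5., 6.])
products = x * t    # element-wise product, not a dot product

# True linear-algebra operations get the special spelling instead.
dot = x @ t         # inner product: 10*2 + 20*4 + 30*5 = 250.0
```

In Matlab and Julia the roles are reversed for the operators where the two meanings differ: the element-wise quotient above would be written `x ./ t`.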
I also must have named arguments and default values. It's a pity, because otherwise it looks to have decent syntax, good speed and nice parallelization. For now, I'm sticking with R, numpy and C.
Working in a tangential field, I've always felt one of the major choke points for doing numerical work in C or Java is the speed of development for programmers who don't strongly specialize in these languages already. While I understand this may be a niche, I'm curious (and perhaps someone can inform me) of the ease of development in Julia, as well as the speed of development. While this seems to be a main concern according to the summary, is this actually achieved, and if so, how?
Matlab and R give you a lot of power with a relatively small and simple command set. While they are both specialized to particular branches of mathematics and have less than optimal performance, they allow almost anyone with mediocre programming knowledge to build sufficient programs.
Matlab is not a programming language, per se, but a numerical computing environment heavily geared towards linear algebra and its applications. You use the language to write scripts and interact with it. Having said that, I don't get the purpose behind this language and what it has to do with C. You don't "choose" Matlab over C. That makes no sense. The basic work flow is that one uses Matlab to simulate an algorithm/process. If it needs to be turned into a real product or something, then you can do it yourself in C or use Simulink, an accompanying package, to help. Does Julia or Matlab or R run on a signal processor? That is silly. The people behind Julia are obviously missing some basics. Maybe they should actually ask someone how Matlab is used in the real world.
The weakness of FORTRAN is that it entirely misses out on 50+ years of research and innovation in programming languages. My gripe with Julia is that it seems to be based on Common Lisp, which itself is pretty old at this point. Fortress seems like a better Fortran replacement to me, since it is actually based on modern functional programming languages. I mean, really, what's the point of releasing a new language based on outdated tech when better alternatives are available?
The other obvious language to come to mind is APL. Anyone looking to write a numerical processing language should have some APL experience.
Yes, it is a pain to learn all the symbols. Programs are incredibly dense, making them difficult to understand and debug, but there are also a lot of cool things you can do with the language. In building a new language, there's a lot of good stuff there to incorporate.
Forgot to mention the irony that one of the principal architects of Common Lisp was Guy Steele, who is now developing Fortress.
The Matlab Statistics Toolbox seems pretty good to me, though I don't use R, and I don't do a ton of statistics. Can anybody comment on what makes it frustrating (besides trying to use the output of the code to produce a publication-quality figure)?
He once inserted random mutations into his code, just so he could have the experience of debugging.
If you read the article, JavaScript is competitive with Julia in most of the benchmarks.
So why yet another language?
It even looks vaguely like JavaScript, so why bother?
I'm not a lawyer, but I play one on the Internet. Blog
Exactly! Slashdot wastes enough of my time. Must I be forced to highlight, right click, and select an option from a menu? It's not like typing behind is a lot of work. See: http://xkcd.com/927/
When our name is on the back of your car, we're behind you all the way!
Can anyone here who uses or has used Lush compare?
Achille Talon
Hop!
"Yes, it is a pain to learn all the symbols."
And it's impossible to enter a lot of them on most keyboards. That makes the language useless for almost everyone.
This looks like it might be a nice language for general-purpose use, too. It's got a nice blend of features borrowed from other languages such as Haskell-style data structures, Perl-style regular expressions, first-class functions, and of course powerful numerical manipulations. I might have to try it out next time I get fed up with Perl.
Looks neat, I'd like to try it.
The website mentions how to call C functions from Julia. Is there any way to do the opposite? I'd like to try a Julia library from a C program.
C : GeneralPurposeProgramming :: Fortran : Numerical Computing
The title of the post is off, or misleading, or ignorant.
The core Julia implementation uses the MIT License, not the GPL. See the license information here.
In other words, at the very least, the dynamic languages will have to do a table lookup and then call the function, as opposed to just calling the function.
Maybe someone should have a word with Intel/AMD. After all they seem to be running out of ideas on what to do with all those transistors.
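The "table lookup, then call" step described above can be modelled in a few lines. This is only an illustrative sketch: the function names and the `DISPATCH` table are hypothetical, and because Python is itself dynamic it shows the shape of the extra step rather than measuring its cost.

```python
# Hypothetical handlers for two argument types.
def square_int(n):
    return n * n

def square_float(x):
    return x * x

# Static-style call: the target function is fixed at the call site.
direct = square_int(4)

# Dynamic-style call: find the target from the runtime type, then call it.
DISPATCH = {int: square_int, float: square_float}

def square(value):
    handler = DISPATCH[type(value)]  # the extra table lookup
    return handler(value)

looked_up = square(4)
```

Both calls return the same value; the second simply pays for one lookup keyed on the argument's runtime type before the call happens.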
The "CPP" file, for those interested: https://github.com/JuliaLang/julia/blob/master/test/perf/perf.cpp
-- A change is as good as a reboot.
Why does the Oblig always miss the <a> tag? Even in Chrome, select + goto takes more time than a simple click.
Because if you don't know the xkcd strips by number, there's a card you might be expected to hand in as you leave.
PlusFive Slashdot reader for Android. Can post comments.
In other words, at the very least, the dynamic languages will have to do a table lookup and then call the function, as opposed to just calling the function.
.Net and the JVM also do that.
"(Did we mention it should be as fast as FORTRAN?)"
They want to design a language for speed, but they already made choices in the language that hamper speed dramatically, like dynamic typing. Dynamic typing adds overhead to every function call; it's fine if your functions do a lot of work, not so much if they do relatively little and are called very often.
It looks like if you want to write fairly low-level code, you'll still need to write it in C there...
It also looks like their approach to parallelization is very heavy-weight and, albeit usable in clusters, it will yield both poor scalability on large systems and poor performance on simple multi-core systems.
There is already a high-level, dynamic and accessible language for numerical computing, it's MATLAB. It wraps a lot of high-performance libraries, using them without the user even noticing it. Code in MATLAB can easily be faster than in C for some constructs because C compilers, unlike MATLAB, do not recognize some patterns and replace them by optimized library calls. For this reason, MATLAB is great when you're coding with high-level constructs, but suffers from poor performance when using low-level constructs (such as accessing data element by element) for the same reasons as pointed out above.
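The contrast between high-level constructs and element-by-element access can be sketched as follows. NumPy stands in for MATLAB here, which is an assumption made for illustration; the function names are hypothetical.

```python
import numpy as np

def dot_loop(a, b):
    """Element-by-element dot product: one interpreted step per element."""
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total

def dot_vectorized(a, b):
    """Same result from a single high-level call into compiled library code."""
    return float(np.dot(a, b))

a = np.arange(1000, dtype=np.float64)  # 0.0, 1.0, ..., 999.0
b = np.ones(1000)

# Both approaches agree on the answer; only the route to it differs.
same_answer = dot_loop(a, b) == dot_vectorized(a, b)
```

The two functions return the same number. The loop version does its work one element at a time in the interpreter, while the vectorized version hands the whole array to the library in one call, which is the pattern the comment above credits for MATLAB's speed on high-level code.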
A new language for high-performance numerical computing should allow both the high-level programming of MATLAB and the possibilities of a low-level statically compiled language like C. The best contender for this is C++, which has tons of high-level and fast libraries for transcendental functions, linear algebra, statistics, image processing, signal processing, etc.
As for FORTRAN, it's great for writing one thing well and fast, but it doesn't have any mechanisms for more high-level programming or code re-use, which means it is annoying to maintain, extend, or to even guarantee consistencies between the different subroutines of a large application. It also relies a lot more on what the compiler will do, while with C/C++ there is more control on what happens with regards to vectorization, parallelization or data transfers, which can be critical for heterogeneous systems.
The C/C++ benchmarks are intentionally written in C; the only reason it's a C++ file instead of C is so that we can use C++'s complex template in the Mandelbrot benchmark. Otherwise the whole thing would just be done in C. The clock_now function is only used to time other code, so its performance is irrelevant.
Stefan Karpinski
The physical world, and the hardware in the computer, are stateful, not stateless. There is a finite amount of storage, which can be overwritten.
The idiomatic programming model for a functional language isn't like this.
In a functional language, to ensure you get fast code, you have to have both a mental model of the program and a much more complex mental model of the transformations that your functional compiler might (or might not!) apply. This is often exceptionally hard.
A human, like a numerical programmer, has some clever knowledge about how best to order and arrange things to map to an efficient implementation in a stateful world.
Take, for example, a production-level SVD algorithm. You could probably express an SVD method in a functional way. Would it be fast, with low memory usage and no needless temporaries? (In high performance computing these always go together.) Well, maybe, but you'd have to really massage things in light of a particular implementation's optimizer and quirks. That isn't something scientists have the desire to do.
In practice, the capabilities of imperative but data-parallel languages map best to their users' knowledge and capabilities, and to existing technology for quality execution.
In other words, at the very least, the dynamic languages will have to do a table lookup and then call the function, as opposed to just calling the function.
.Net and the JVM also do that.
Not true. In Java, methods are virtual by default, so there is some truth there. However, the JIT in the JVM will frequently be able to infer the true actual type and inline or at least do away with the vtable lookup. In .Net, the default is non-virtual methods, but even when methods are virtual, the same facts about type inference hold. In fact, these are reasons for why rather OOP-heavy designs might perform faster in .Net/Java than in C++ (unless you compile your C++ with profile-guided optimizations) - so many possible inferences and optimizations are only possible with the runtime binding information.
Why don't you tell him yourself? He is here, posting comments. What great achievements have you made to call this person an idiot?
Fanboy Status: Apache Flex, C#, Eclipse, KDE, Pirate Party, Ron Paul, Slackware, Windows 7
I do not know how I read Julia Language as Julian Assange and wondered why he was associated with C for numerical computing. I had to read the title again. Doh!
Senthil
But the biggest pluses of Matlab are being able to plot your results, and work interactively from the command line.
This. Precisely this. If Julia can reasonably rapidly acquire plotting capabilities that approach those of R (and hence are much better than those of MATLAB) then it may have some traction. If it remains the case that you have to dump your results out of Julia and use another tool to plot them, then you are better off using R/MATLAB/Octave/IDL/SciLab/Yorick for everything except the most number-crunchingly intense tasks. And you would probably do _those_ in a traditional compiled language anyway.