Julia Language Seeks To Be the C For Numerical Computing
concealment writes in with an interview with a creator of the (fairly) new language Julia designed for number crunching. Quoting Infoworld: "InfoWorld: When you say technical computing, to what type of applications are you specifically referring? Karpinski: It's a broad category, but it's pretty much anything that involves a lot of number-crunching. In my own background, I've done a lot of linear algebra but a fair amount of statistics as well. The tool of choice for linear algebra tends to be Matlab. The tool of choice for statistics tends to be R, and I've used both of those a great deal. But they're not really interchangeable. If you want to do statistics in Matlab, it's frustrating. If you want to do linear algebra in R, it's frustrating. InfoWorld: So you developed Julia with the intent to make it easier to build technical applications? Karpinski: Yes. The idea is that it should be extremely high productivity. To that end, it's a dynamic language, so it's relatively easy to program, and it's got a very simple programming model. But it has extremely high performance, which cuts out [the need for] a third language [C], which is often [used] to get performance in any of these other languages. I should also mention NumPy, which is a contender for these areas. For Matlab, R, and NumPy, for all of these options, you need to at some point drop down into C to get performance. One of our goals explicitly is to have sufficiently good performance in Julia that you'd never have to drop down into C."
The language implementation is licensed under the GPL. Lambda the Ultimate has a bit of commentary on the language, and an R programmer gives his two cents on the language.
You mean, ignored by almost every developer in the field in lieu of more "business-friendly" languages that add bloat and inefficiency?
Why does the Oblig always miss the <a> tag? Even in Chrome, select + goto takes more time than a simple click.
Slashdot, fix the reply notifications... You won't get away with it...
I use Sage quite a bit. It's basically a wrapper for almost all the mathematics software available. http://www.sagemath.org/ While you still need to drop down to C for great performance, it solves a lot of the interoperability issues discussed. In other words, take the example from the summary: from Sage, you can call Matlab commands and then immediately use the results with R commands. Sage works through a web browser, and it's based on Python, which is a plus.
Three days from now?? Thats tomorrow!! ~Peter Griffin
In my opinion, the new code in Julia is easier to read than the R code because Julia has fewer syntactic quirks than R. More importantly, the Julia code runs much faster than the R code without any real effort put into speed optimization. For the sample text I tried to decipher, the Julia code completes 50,000 iterations of the sampler in 51 seconds, while the R code completes the same 50,000 iterations in 67 minutes, making the R code more than 75 times slower than the Julia code.
That certainly caught my attention!
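For anyone checking the arithmetic on those timings, here is a quick sketch that uses only the two figures quoted above (51 seconds and 67 minutes). The variable names are made up for illustration.

```python
# Timings quoted above: 50,000 sampler iterations in each language.
julia_seconds = 51
r_seconds = 67 * 60  # 67 minutes expressed in seconds

# Ratio of the R runtime to the Julia runtime.
slowdown = r_seconds / julia_seconds
print(f"R takes about {slowdown:.1f}x as long as Julia")
```

So "more than 75 times" is, if anything, slightly conservative for that one run.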
The XKCD comic you cite is correct for some standards but software languages are much more complex than standards and, in fact, many of them implement common sets of core standards. Once you get specific enough, you're not talking about a standard but rather a specific implementation of how to accomplish something.
My work here is dung.
What is that a hash of the source code?
From wikipedia: "FORTRAN is a general-purpose, procedural, imperative programming language that is especially suited to numeric computation and scientific computing." Sounds to me like unless there's a particular weakness in FORTRAN that doesn't lend itself to workarounds or repair in newer versions of the language, there's already a numeric computation and scientific programming language that's well documented, mature, and widely distributed.
Do not look into laser with remaining eye.
As a biologist who has worked with R daily for the last 12 years (I think the first R version I used was 0.65, and I used S-Plus before that), this seems a dream come true. I'm fed up with using impractical and ugly languages such as C and Java whenever R is too slow for the job, which in my case is quite frequently.
What will make or break this language is the availability of addon packages for it. A lot of people who use R don't do much coding themselves. They read in data, preprocess it a little bit, and then apply one of the packages found in CRAN.
CRAN is like CPAN, but for R instead of Perl. And we can expect similar behavior from them. Perl probably wouldn't be anyone's first choice for a project these days, but the size and scope of CPAN makes it really really easy to benefit from the work of others. This is a lot of inertia, and a big reason why Perl is still used when newer languages have significant advantages.
There's so much software, particularly academic software, implemented in R that I just don't see it going away. e.g. the entire Bioconductor suite is implemented in R. Just about any bioinformatics paper you pick at random will refer to, if not contain R code.
How much work are we going to have to reimplement if we want everyone to use the one true numerical programming language? And if we don't want that, isn't it just contributing to fragmentation?
Give me Classic Slashdot or give me death!
How can they expect me to commit to this new language unless I have seen what kind of facial hair they have. I want huge beards on their front page.
Robust, mature, fast, easy to use, side-by-side with .m it wins hands down, really no comparison, use Python.
Cython if you need to make it faster for the 5% of code that is too slow.
import numpy
import pylab
Beard Ref
Also is this named after the same Julia that worked with fractals? Julia Ref
He effected a bored affect.
This may seem petty, but one of the biggest sources of relief to me in changing from Matlab to R and Numpy was finally leaving behind that damned operator syntax where element-wise operations need to have an extra dot prepended. That is to say, if I have an array t of times and an array x of distances, I want to be able to get the corresponding array of speeds using x / t. In Matlab and Julia I must instead use x ./ t.
It seems like no big deal, but it is unbelievable how many Matlab bugs I wrote due to that little difference. True linear algebraic operations are so rare, at least to me, that I am far happier giving them the special operators and reserving the usual operators to work element-wise.
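To make the distinction concrete, here is a minimal sketch in plain Python (lists only, so nothing needs installing). The data values and the helper names `elementwise_divide` and `matmul` are hypothetical. In NumPy the plain `/` and `*` operators on arrays are already element-wise, and the matrix product is a separate operation, which is the convention argued for above.

```python
# Hypothetical sample data: distances x covered in times t.
x = [10.0, 20.0, 45.0]
t = [2.0, 4.0, 9.0]

def elementwise_divide(a, b):
    """Divide matching entries, like x ./ t in Matlab or Julia."""
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    return [ai / bi for ai, bi in zip(a, b)]

def matmul(a, b):
    """True linear-algebra product of two matrices given as lists of rows."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("inner dimensions must agree")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

speeds = elementwise_divide(x, t)
print(speeds)  # [5.0, 5.0, 5.0]

m = [[1, 2], [3, 4]]
matrix_square = matmul(m, m)                             # linear-algebra m * m
elementwise_square = [[v * v for v in row] for row in m]  # entry-by-entry m .* m
print(matrix_square)       # [[7, 10], [15, 22]]
print(elementwise_square)  # [[1, 4], [9, 16]]
```

The two "squares" of the same matrix differ, which is exactly the kind of silent bug a missing dot produces.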
I also must have named arguments and default values. It's a pity, because otherwise it looks to have decent syntax, good speed and nice parallelization. For now, I'm sticking with R, numpy and C.
Working in a tangential field, I've always felt one of the major choke points for doing numerical work in C or Java is the speed of development for programmers who don't strongly specialize in these languages already. While I understand this may be a niche, I'm curious (and perhaps someone can inform me) of the ease of development in Julia, as well as the speed of development. While this seems to be a main concern according to the summary, is this actually achieved, and if so, how?
Matlab and R give you a lot of power with a relatively small and simple command set. While they are both specialized to particular branches of mathematics and have less than optimal performance, they allow most anyone with mediocre programming knowledge to build sufficient programs.
Matlab is not a programming language, per se, but a numerical computing environment heavily geared towards linear algebra and its applications. You use the language to write scripts and interact with it. Having said that, I don't get the purpose behind this language and what it has to do with C. You don't "choose" Matlab over C. That makes no sense. The basic work flow is that one uses Matlab to simulate an algorithm/process. If it needs to be turned into a real product or something, then you can do it yourself in C or use Simulink, an accompanying package, to help. Does Julia or Matlab or R run on a signal processor? That is silly. The people behind Julia are obviously missing some basics. Maybe they should actually ask someone how Matlab is used in the real world.
The weakness of FORTRAN is that it entirely misses out on 50+ years of research and innovation in programming languages. My gripe with Julia is that it seems to be based on Common Lisp, which itself is pretty old at this point. Fortress seems like a better Fortran replacement to me, since it is actually based on modern functional programming languages. I mean, really, what's the point of releasing a new language based on outdated tech when better alternatives are available?
The other obvious language to come to mind is APL. Anyone looking to write a numerical processing language should have some APL experience.
Yes, it is a pain to learn all the symbols. Programs are incredibly dense, making them difficult to understand and debug, but there are also a lot of cool things you can do with the language. In building a new language, there's a lot of good stuff there to incorporate.
Forgot to mention the irony that one of the principal architects of Common Lisp was Guy Steele, who is now developing Fortress.
The Matlab Statistics Toolbox seems pretty good to me, though I don't use R, and I don't do a ton of statistics. Can anybody comment on what makes it frustrating (besides trying to use the output of the code to produce a publication-quality figure)?
He once inserted random mutations into his code, just so he could have the experience of debugging.
If you read the article, JavaScript is competitive with Julia in most of the Benchmarks.
So why yet another language.
It even looks vaguely like JavaScript, so why bother?
I'm not a lawyer, but I play one on the Internet. Blog
Exactly! Slashdot wastes enough of my time. Must I be forced to highlight, right click, and select an option from a menu? It's not like typing behind is a lot of work. See: http://xkcd.com/927/
When our name is on the back of your car, we're behind you all the way!
Someone tell the idiot who created "julia" about Fortran. We've been using it for decades for "Number Crunching".
Can anyone here who uses or has used Lush compare?
Achille Talon
Hop!
"Yes, it is a pain to learn all the symbols."
And it's impossible to enter a lot of them on most keyboards. That makes the language useless for almost everyone.
This looks like it might be a nice language for general-purpose use, too. It's got a nice blend of features borrowed from other languages such as Haskell-style data structures, Perl-style regular expressions, first-class functions, and of course powerful numerical manipulations. I might have to try it out next time I get fed up with Perl.
Visit the
Looks neat, I'd like to try it.
The website mentions how to call C functions from Julia... Is there any way to do the opposite? I'd like to try a Julia library from a C program.
C : GeneralPurposeProgramming :: Fortran : Numerical Computing
The title of the post is off, or misleading, or ignorant.
AFAIK Ruby will always be slower than C or C++ because you have to be able to dynamically modify classes. However, if you don't dynamically modify any classes in your program, then it seems like the vtable can get pulled into cache and you'll be OK. The real problem is that you have to call all functions through a vtable (or some equivalent) in the first place. Plain old C (or C++ without any virtual functions, which is kind of boring) doesn't have this problem.
In other words, at the very least, the dynamic languages will have to do a table lookup and then call the function, as opposed to just calling the function.
If the language gives you the ability to mark up functions with some kind of optimization keyword (like C's restrict and const) then maybe they'll have something. Of course, what kind of weird runtime problems might they have with a dynamic language where *some* of the functions are nailed down and others are all loosey-goosey?
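The lookup-then-call point above can be illustrated with a small sketch in Python, used here as a stand-in for any dynamic language. The `Shape` and `Square` classes and the `square_area` function are hypothetical. It shows why the lookup cannot simply be skipped: the method can be swapped while the program runs.

```python
class Shape:
    def area(self):
        return 0.0

class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side

def square_area(side):
    """Plain function standing in for a statically bound C call.
    (Python itself still resolves the name at run time.)"""
    return side * side

s = Square(3.0)

# Dynamic call: "area" is looked up on the instance and its class at call time.
dynamic_result = s.area()

# Because the lookup happens at call time, the class can be modified on the fly,
# and the very same call site now reaches different code.
Square.area = lambda self: -1.0
patched_result = s.area()

print(dynamic_result, patched_result, square_area(3.0))
```

A compiler that wants to call `area` directly has to prove, or assume and later check, that nobody performs the reassignment in the middle.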
The core Julia implementation uses the MIT License, not the GPL. See the license information here.
I couldn't find a reference for formatting comments on Slashdot after an initial search.
They all fail pretty horribly.
The "C++" listing they provide for their benchmarks is written in C, not C++.
That and it appears designed to reduce the performance of the C/C++ code while increasing the complexity. E.g.
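#include <stddef.h>    /* NULL */
#include <sys/time.h>  /* gettimeofday, struct timeval */
/* Wall-clock time in seconds, returned as a double. */
double clock_now()
{
    struct timeval now;
    gettimeofday(&now, NULL);
    return (double)now.tv_sec + (double)now.tv_usec/1.0e6;
}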
Because converting integer values to doubles won't hurt, right?
Why does the Oblig always miss the <a> tag? Even in Chrome, select + goto takes more time than a simple click.
Because if you don't know the xkcd strips by number, there's a card you might be expected to hand in as you leave.
PlusFive Slashdot reader for Android. Can post comments.
"(Did we mention it should be as fast as FORTRAN?)"
As an economics grad student, this sort of thing would be useful. Most of us already know Matlab, so the similarities in syntax are a plus. On the other hand, Matlab is slow for some things, and vectorizing using bsxfun and matrix tricks makes code hard to read.
But the biggest pluses of Matlab are being able to plot your results and work interactively from the command line. It looks like Julia has a command line but I couldn't find any reference to plotting tools.
For people who recommend C, the problem is that there are fixed costs associated with working in C that are only justified for a professional programmer (i.e. the majority of commentators here). These fixed costs include
1) Learning syntax
2) Learning to use whatever IDE you use
3) Learning to use whatever numerical/statistical libraries you need (if they exist)
4) Linking/building libraries (this can be extremely time consuming unless you are very experienced or the libraries are extremely common)
and this doesn't even touch on how you would go about replicating the functionality of an interactive command line, or plotting tools.
These costs are high enough that most professors and PhD students prefer to buy better computing hardware rather than leave Matlab.
They want to design a language for speed, but they already made choices in the language that hamper speed dramatically, like dynamic typing. Dynamic typing adds overhead to every function call; it's fine if your functions do a lot of work, not so much if they do relatively little and are called very often.
It looks like if you want to write fairly low-level code, you'll still need to write it in C there...
It also looks like their approach to parallelization is very heavy-weight and, albeit usable in clusters, it will yield both poor scalability on large systems and poor performance on simple multi-core systems.
There is already a high-level, dynamic and accessible language for numerical computing, it's MATLAB. It wraps a lot of high-performance libraries, using them without the user even noticing it. Code in MATLAB can easily be faster than in C for some constructs because C compilers, unlike MATLAB, do not recognize some patterns and replace them by optimized library calls. For this reason, MATLAB is great when you're coding with high-level constructs, but suffers from poor performance when using low-level constructs (such as accessing data element by element) for the same reasons as pointed out above.
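As a rough analogy only (this is Python, not MATLAB, and the function names are invented), the two styles described above look like this. The first version touches every element from interpreted code; the second hands the loop to built-ins that, in CPython, are implemented in C. The sketch makes no timing claim, since that depends on the interpreter and the data.

```python
import operator

def dot_elementwise(a, b):
    """Low-level style: access the data element by element in the interpreter."""
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total

def dot_highlevel(a, b):
    """High-level style: let built-in routines run the loop."""
    return sum(map(operator.mul, a, b))

a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]

low = dot_elementwise(a, b)
high = dot_highlevel(a, b)
print(low, high)  # 32.0 32.0
```

Both give the same answer; the difference is only in who executes the inner loop.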
A new language for high-performance numerical computing should allow both the high-level programming of MATLAB and the possibilities of a low-level statically compiled language like C. The best contender for this is C++, which has tons of high-level and fast libraries for transcendental functions, linear algebra, statistics, image processing, signal processing, etc.
As for FORTRAN, it's great for writing one thing well and fast, but it doesn't have any mechanisms for more high-level programming or code re-use, which means it is annoying to maintain, extend, or to even guarantee consistencies between the different subroutines of a large application. It also relies a lot more on what the compiler will do, while with C/C++ there is more control on what happens with regards to vectorization, parallelization or data transfers, which can be critical for heterogeneous systems.
c++ is fast, but doesn't have nice slicing like python and fortran.
Fortran is fast and has slicing, but it is a little too clumsy to be useful for large programs with polymorphism and pointers.
Python has polymorphism and a nice syntax, but it is dynamic, and that makes it slow at certain tasks.
I would love to have a compiled language with python-style slicing and a modern syntax, and Julia seems to deliver.
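For readers who have not used it, this is roughly what Python-style slicing looks like on an ordinary list. It is a minimal sketch with made-up variable names; NumPy arrays accept the same notation and extend it to several dimensions.

```python
data = list(range(10))       # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

first_three = data[:3]       # [0, 1, 2]
every_other = data[::2]      # [0, 2, 4, 6, 8]
last_two = data[-2:]         # [8, 9]
reversed_copy = data[::-1]   # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

# Slice assignment replaces a whole range in one statement.
data[2:5] = [0, 0, 0]
print(data)                  # [0, 1, 0, 0, 0, 5, 6, 7, 8, 9]
```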
The physical world, and the hardware in the computer, is stateful, not-stateless. There is a finite amount of storage, which can be overwritten.
The idiomatic programming model for functional languages isn't like this.
In a functional language to ensure you get fast code, you have to both have a mental model of the program, and a much more complex mental model of the transformations that your functional compiler might (or might not!) apply. This is often exceptionally hard.
A human, like a numerical programmer, has some clever knowledge about how best to order and arrange things to map to an efficient implementation in a stateful world.
Take, for example, a production-level SVD algorithm. You could probably express a SVD method in a functional way. Would it be fast, and have low memory usage, no needless temporaries? (in high performance computing these always go together) Well maybe but you'd have to really massage things in light of a particular implementation's optimizer & quirks. That isn't something scientists have the desire to do.
In practice, the capability of imperative, but data-parallel languages best map to their user's knowledge and capabilities and existing technology for quality execution.
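A toy version of the temporaries argument, sketched in Python with hypothetical function names: the same update y = a*x + y written in a stateless style that allocates a new result, and in a stateful style that overwrites existing storage. A real SVD is far more involved; this only shows the shape of the trade-off.

```python
def axpy_functional(a, x, y):
    """Stateless style: build and return a new list, leaving the inputs untouched."""
    return [a * xi + yi for xi, yi in zip(x, y)]

def axpy_in_place(a, x, y):
    """Stateful style: overwrite y, allocating no new list."""
    for i in range(len(y)):
        y[i] += a * x[i]

x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]

fresh = axpy_functional(2.0, x, y)  # y is unchanged, fresh is a new list
y_before = list(y)                  # copy kept only to show y was untouched
axpy_in_place(2.0, x, y)            # y itself now holds the result

print(fresh, y, fresh is y)
```

Whether a functional compiler turns the first form into the second is precisely the transformation the programmer would have to predict.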
I do not know how I read Julia Language as Julian Assange and wondered why he was associated with C for numerical computing. I had to read the title again. Doh!
Senthil
"The really, really short answer is that you should not. The somewhat longer answer is that just because you are capable of building a bikeshed does not mean you should stop others from building one just because you do not like the color they plan to paint it. This is a metaphor indicating that you need not argue about every little feature just because you know enough to do so. Some people have commented that the amount of noise generated by a change is inversely proportional to the complexity of the change."
http://bikeshed.com/
Well, yes and no. A C-programmer ought to be able (with some effort) to write C code that's readable and maintainable to *another* C-programmer.
The resulting code however will often *not* be readable, let alone maintainable, to anyone but a C-programmer. Someone unfamiliar with C might believe they can read the code and understand what the code does, but they won't really. This becomes painfully obvious as soon as someone who isn't a C-programmer tries to modify a C program. For example a scientist, or a graduate student. The learning curve for C is a lot steeper than the one for e.g. Fortran or Basic.
This is the main reason why scientists use Fortran rather than C or C++: you can take the code "at face value", and small modifications will likely work.
Doing nonlinear optimization assignments right now and I am longing for the oasis that is C compared to the chaos and sluggishness of MATLAB. Also octave-symbolic just doesn't like me. :'(
Who is this Julia, and how do I meet her?