Numerical Computing in Java?

Posted by Cliff on Monday September 20, 2004 @07:35AM from the calculating-coffee dept.

Nightshade queries: "I work for a department in a big financial company that uses equal amounts of C++ and Java. For a variety of reasons, we've decided that Java is the future of the group because of all the benefits of the language (it's so easy to use compared to C++, we can use Eclipse, Ant, jUnit, etc). The problem is that we do a lot of numerical computing and Java has no operator overloading! Languages like C# have operator overloading and because of this companies like CenterSpace have popped up with some nice looking numerical libraries. Try to find numerical packages for Java and it'll be pretty tough. What have people done in terms of numerical computing in Java? We currently use the Jama and Colt libraries for matrices and complex numbers, but these have awkward interfaces without operator overloading and are incomplete (no support for things like symmetric matrices) so we're looking for better solutions. So should we bite the bullet and switch to C#? Should we use a pre-processor like JFront? What have other people done?"

No Operator Overloading is a BAD THING by digerata · 2004-09-20 07:45 · Score: 4, Insightful

I always thought that Sun's decision to leave operator overloading out of Java was a huge mistake. IIRC, their argument was that it could lead to confusing code if programmers changed the meaning of operators, so that + really means -. If you ask me, that argument is ridiculous. A programmer could just as easily create a method called add() and have it perform like subtract.
All it does is make us type more and take a few hundred milliseconds longer to read a line of code like
CrazyObjectNumber a = new CrazyObjectNumber(...);
CrazyObjectNumber b = new CrazyObjectNumber(...);
CrazyObjectNumber c = (a * b) + 53;
Line 3 ends up being:
CrazyObjectNumber c = ((a = a.multiply(b)).add(53)).clone();
Which one is easier to read?
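
For comparison, the clone() in the method-call version is only needed if the arithmetic methods change their receiver. A minimal sketch, assuming a hypothetical immutable wrapper (the class name Amount and its long-based representation are illustrative and do not come from any library):

```java
/** Hypothetical immutable numeric wrapper; every operation returns a new object. */
final class Amount {
    private final long value;

    Amount(long value) {
        this.value = value;
    }

    /** Returns a new Amount; neither operand is modified. */
    Amount multiply(Amount other) {
        return new Amount(Math.multiplyExact(value, other.value));
    }

    /** Returns a new Amount holding this value plus the given addend. */
    Amount add(long addend) {
        return new Amount(Math.addExact(value, addend));
    }

    long value() {
        return value;
    }

    public static void main(String[] args) {
        Amount a = new Amount(6);
        Amount b = new Amount(7);
        // Equivalent of (a * b) + 53, with no reassignment and no clone().
        Amount c = a.multiply(b).add(53);
        System.out.println(c.value()); // 6 * 7 + 53 = 95
        System.out.println(a.value()); // a is unchanged: 6
    }
}
```

With immutable operands the call chain is still wordier than the operator form, but it reads left to right and needs no defensive copy.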

--

1;
1. Re:No Operator Overloading is a BAD THING by cft_128 · 2004-09-20 11:58 · Score: 4, Funny
  
  On top of that, someone could come along and change the code and forget to update the comment to reflect the change. Then you simply have more obfuscated code.
  And that is why I never, ever comment my code.
  
  --
  Underloved Movies and Pub Quiz: donotquestionme.org
2. Re:No Operator Overloading is a BAD THING by TheLink · 2004-09-20 13:40 · Score: 2, Insightful
  
  Think of it as something like CRC. If the code isn't consistent with the comments then you know you're supposed to fix something. Either the comment is broken or the code is.
  
  Sure it may not be easy to figure out which is broken but that's better than figuring whether the 10 lines of Java are correct or not (and which 10 lines to focus on) (if they're wrong you don't have a quick "checksum").
  --
  
3. Re:No Operator Overloading is a BAD THING by gnovos · 2004-09-21 13:12 · Score: 2, Insightful
  
  On top of that, someone could come along and change the code and forget to update the comment to reflect the change. Then you simply have more obfuscated code.
  
  And that is why I never, ever comment my code.
  
  That was moderated as funny, but in reality it is an excellent idea. People still have to understand what your code does, so if you write your code in such a way that it is perfectly clear what you are doing, and with variable, class and method names that clearly indicate their function, there is absolutely no reason to comment code and every reason not to.
  
  Comments can very easily grow out of date but the code itself NEVER can. That is the nature of code, after all.
  
  --
  "Your superior intellect is no match for our puny weapons!"
Use the right tools for the right job... by Chilles · 2004-09-20 07:51 · Score: 4, Insightful

Sure it might be easier on the administration side to use just one tool. But in the end a language is just that, a tool. You don't see carpenters throwing away all their tools except the hammer just to keep their tool-shed clean...
I hate overloading by Anonymous Coward · 2004-09-20 07:56 · Score: 3, Insightful

I hate operator overloading because it hides what's actually happening - a function call. When you are actually debugging code it's difficult to see what is going on.

I also dislike "virtual" inheritance for the same reason.

I just don't think OO programming is the greatest thing since sliced bread. That's a very unpopular view.
Java Hurt by Anonymous Coward · 2004-09-20 07:58 · Score: 3, Interesting

Check out the writings of Dr William Kahan, one of the men behind the IEEE floating-point standard.
Read "How JAVA's Floating-Point Hurts Everyone Everywhere".

http://www.cs.berkeley.edu/~wkahan/

For speed, Fortran is still best. Most engineering codes are in Fortran.
1. Re:Java Hurt by Too+Much+Noise · 2004-09-20 08:30 · Score: 3, Interesting
  
  For speed, Fortran is still best. Most engineering codes are in Fortran.
  
  That does not compute, logically - erm ... maybe only if you meant development speed (not arguing program speed one way or another).
  
  Anyway, from the very paper you pointed to, C9x does complex math better than Fortran. Interesting - I wish there were some detail to it though.
These are not the languages you are looking for by Anonymous Coward · 2004-09-20 08:00 · Score: 5, Insightful

Step away from the "one language fits all" mentality. The type of problem you're trying to solve has already been solved, so you can forget about Java and C++.

Go get Matlab (or Mathematica or Mathcad/Maple). Matlab has a powerful scripting language that does exactly what you need, and you can download thousands of functions written for it. Or just hire me and I'll write a translator from Matlab to your favorite language. Oh wait: translators already exist, so nevermind.

Also, why are you trying to confuse yourself (and future maintainers) with operator overloading in C++? It's just a Bad Idea (TM). Don't do it.
1. Re:These are not the languages you are looking for by 4of12 · 2004-09-20 10:05 · Score: 3, Interesting
  
  One step further along that road: consider using Python to glue together old pieces.
  
  If Java was a step toward elegant simple expression away from C++, then Python is yet another wonderful step in that direction. The URL is for Bruce Eckel's site: he of the Thinking in {C++,Java} book series fame.
  
  You can glue together all those highly efficient numerical kernels written in FORTRAN, C or C++ with a nice object oriented scripting language. No need to go through more off-road stress testing of a new Java implementation of SomeOldAlgorithm with all the quirky corner cases that people have already hit using the crusty old code in languages no one wants to learn anymore.
  
  --
  "Provided by the management for your protection."
2. Re:These are not the languages you are looking for by Hast · 2004-09-21 07:53 · Score: 2, Interesting
  
  Matlab m-files can be transformed into both Fortran or C and compiled if you need more speed. Naturally it is also possible to use compiled libraries directly from Matlab.
  
  That said, Matlab's main strength is its near-infinite libraries (generally all numerical mathematical and engineering research is done in Matlab, so basically anything you could ever want is available) and prototyping speed. The m-files are interpreted at runtime and you can also "code" in interactive mode much like with Python (but Matlab is more complete in this area than the Python interactive stuff).
  
  Matlab isn't for realtime stuff though (which I assume is what you mean, because if you are just simulating, ms intervals aren't that bad). Typically you'd use Matlab to prototype and then either translate to C/Fortran or just redo based on the Matlab files. Since Matlab is a numerical system it's easy to make statistical checks on your data in order to ensure correctness. And that should be a major part of the development phase; I have tried to "just do the code first" a number of times and it seldom saves you time.
Jython by FullMetalAlchemist · 2004-09-20 08:07 · Score: 5, Interesting

You might want to try Jython and the Numerical Python for Jython.
I have not used either for a long time, but I use plain Python and Numerical Python a lot; it sure beats Matlab and Mathematica for most things. Right now I am using them to solve optimization problems with 10k+ s.t. constraints.
1. Re:Jython by anonymous+cowherd+(m · 2004-09-20 09:33 · Score: 2, Insightful
  
  Why use Jython when you could use regular Python? The only advantage Jython has (and I'll admit, this might be a big advantage depending on the size of the codebase) is that it compiles to Java bytecodes and allows you to access Java classes from Python.
  Other than that, I can't think of any reason why I'd use it, especially for numeric computation. Why be 2 versions behind in the language when you can have some very useful and elegant features like generator expressions now?
  
  --
  http://neokosmos.blogsome.com
Do BOTH! by cwensley · 2004-09-20 08:09 · Score: 2, Interesting

What I'd suggest is BOTH. I am a huge C# fan, and am very inexperienced when it comes to Java. I never got into Java because the code seemed very awkward and cumbersome to me (event handling, etc).

The dev teams working with Java are used to the quirks of the language, thus should be very familiar with how to use the library, even though it might not be the best it could be.

However, if you are looking to provide a tool for companies to use for development, I would recommend both. There is a need for this in both Java AND .NET, since each dev house will use their platform of choice. If you write a .NET version, then it can be used by any language supported by .NET (C#,VB,C++,Java.NET,Cobol,Python, etc). But it will not be usable from native Java.

Perhaps looking into a way to use the same code base, but compile it in both Java, and Java.NET would be the best option here, and give more choice to the customers!
Re:Can you elaborate? by Anonymous Coward · 2004-09-20 09:17 · Score: 4, Insightful

how about
a = sqrt(abs((b + c) * (d / e)));
vs
a = Math.sqrt(Math.abs(b.add(c).multiply(d.divide(e))));

for the small cases (such as this one) it doesn't make as much difference, but for complex equations it adds up quickly.
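
To make the comparison concrete with a real library type, here is a minimal sketch that writes the same formula once with primitive doubles and once with java.math.BigDecimal. The class and method names (ExpressionStyles, withDoubles, withBigDecimal) and the choice of MathContext.DECIMAL64 are assumptions made for illustration; BigDecimal.sqrt requires Java 9 or later.

```java
import java.math.BigDecimal;
import java.math.MathContext;

final class ExpressionStyles {
    /** Primitive version: a = sqrt(abs((b + c) * (d / e))). */
    static double withDoubles(double b, double c, double d, double e) {
        return Math.sqrt(Math.abs((b + c) * (d / e)));
    }

    /** The same formula with BigDecimal, where every operator becomes a method call. */
    static BigDecimal withBigDecimal(BigDecimal b, BigDecimal c, BigDecimal d, BigDecimal e) {
        MathContext mc = MathContext.DECIMAL64; // 16 significant digits, chosen arbitrarily
        return b.add(c)
                .multiply(d.divide(e, mc))
                .abs()
                .sqrt(mc); // available since Java 9
    }

    public static void main(String[] args) {
        System.out.println(withDoubles(1, 3, -8, 2));
        System.out.println(withBigDecimal(
                BigDecimal.valueOf(1), BigDecimal.valueOf(3),
                BigDecimal.valueOf(-8), BigDecimal.valueOf(2)));
    }
}
```

Note that the object version also forces a decision the operator form hides, namely the precision and rounding used for the division.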
Pathetic by kaffiene · 2004-09-20 09:47 · Score: 2

That's really sad. You have all the functionality you want but can't progress because your favourite syntactical sugar isn't there?

That's pathetic.

That's like saying, I've got this *great* idea but I can't code it up because I'm using C and it has brackets and not "BEGIN ... END" and I just can't live without them.

It's a different language - get used to it.
multiple options by BeerMilkshake · 2004-09-20 09:49 · Score: 2, Informative

1. try to do everything in one environment

This seems like low short-term risk because you reduce the number of technologies that have to work together, but you incur more long-term risk because of technology churn.

2. Combine libraries

A library implemented in Java/.NET can call a library implemented in .NET/Java using bytecode-IL translators such as IKVM.

Another way is to develop bindings, like we used to do to call C++ libraries from Ada, and such.

3. 'On the wire' integration

This is similar to (2) except that you have more processes.

Using something like CORBA, you can implement a service in, say FORTRAN, that calls the FORTRAN libraries. You can then implement your client in whatever (Java, PERL, C/C++, .NET, ...).

There are CORBA/.NET solutions, both OS and commercial, available (Borland Janeva, IIOP.NET, ...)
Re:java is the future by GuyverDH · 2004-09-20 10:03 · Score: 2, Insightful

If C / C++ is so last week, and Java is the future, then how can it be that Java wouldn't exist without C?

From Java's perspective, the two are tied together.

You (currently) can't have Java without C, while you can have C, without Java.

It's safe to assume that until another language comes around that can do things as well as C/C++, that Java will continue to be written in C.

So to say that Java is the future, while condemning C/C++ to the past is short-sighted at best, and ignorant at worst.

Where would all your lovely new "Java" versions come from, if not from dedicated, hard working C/C++ programmers?

--
Who is general failure, and why is he reading my hard drive?
Re: No, Operator Overloading is a BAD THING by digerata · 2004-09-20 10:06 · Score: 2, Informative

This was marked as insightful?
My f**k'd up example was in Java not C++. The problems you bring up aren't present in Java. We are discussing Java, not C++. You do not seem to be knowledgeable in Java, hence your statement, '...do braindead things like mutating objects in place...need to specify clone()'. Clone does not mutate any object. The need for it is to make a copy so that c won't be modified when someone modifies a.
I'm not going to get into another religious debate over something that has been argued since the dawn of time.

--

1;
Interfaces by GlobalEcho · 2004-09-20 10:17 · Score: 4, Interesting

[Disclaimer: Until recently I was a quant, and among other things was responsible for the coding quality of Bank One's quantitative libraries. I am no longer there, and do not speak for JPMorgan, who now owns the business.]

There are two main considerations you have with respect to libraries of numerical routines:
(1) Having access to quick, accurate, and reliable numerical analytic routines such as singular value decompositions, FFTs, and optimizers.
(2) Having convenient and standard ways within your organization of defining vectors and matrices, as well as simple operations (e.g. dot products) on them.

To address the first problem, I think you have to look first to the quality of the numerical routines you plan to use. Paying attention to their native language or available interfaces is foolish. Would you really trust a 5-year-old SVD written in Java over something from LAPACK or NAG? I sure wouldn't, and I would never guarantee a model calibration based on it!

Thus, your numerical analytic routines will come in some hoary library, and you will have to interface to it as best you can. In many cases you could use JNI or, if that makes you nervous, have the Java portion communicate with the library wrapped in a separate process using sockets or something. But the point is, quality is more important than interface here.

The other issue is standardization of vector and matrix encapsulations etc. Here I am less opinionated, but my thoughts are roughly as follows: there are probably lots of vector/matrix implementations out there, some of which must be good. You might as well just choose one with an API and implementation you are impressed with...it's not as though you will be expecting it to do your numerical math. Sure you won't get operator overloading (if you're in Java) but having done financial mathematics in C, C++ and MatLab, I can say with a fair degree of certainty that you will use overloading far less often than you might think.

You now have a convenient standard for manipulating objects, and a quality library. Write the glue and you're done.

Oh, and for those people recommending MatLab/Octave/Mathematica etc., let me just say that most of us in finance know about them and many use them for prototyping. Python, and (ugh) VB too. But ultimately one is often asked to create a library that gets handed off to internal developers for use in one or more custom apps, which are then distributed to anywhere from 5 to a couple hundred users, and run on portfolios of thousands of securities. Even if, say, your MatLab routine didn't need licensing for each workstation and took just a couple milliseconds to run, you're still looking at perceptible delays before the user sees results.

Modern financial applications are one of those few remaining arenas in which computers are Not Yet Fast Enough.
Horses for courses by barries · 2004-09-20 10:23 · Score: 3, Insightful

When choosing a language, choose one that does what you need. Don't choose a language because it's easy or pretty if it doesn't do what you need. Moreover, if you really are limited to a single language, you are forgoing the huge swaths of comp. sci. goodness that whatever language you're limiting yourself to doesn't support.

Any competent group generally needs to be able to handle a mix of languages, from C/C++/Java, Perl/Python/Ruby/etc, and the myriad of narrow languages (SQL, templating, shell & batch, HTML, lua, etc., etc.).

Perhaps you should use C, C++ or FORTRAN for the numerical portions and native Java for the general purpose portions.

- Barrie
Operator Overloading is evil, evil, evil by melquiades · 2004-09-20 10:44 · Score: 4, Insightful

Agreed that, for the single purpose of numerical computing, in certain well-controlled circumstances, operator overloading gives an arguable benefit in readability.

But dude, have you ever programmed in C++? Used STL? Blech! Blech^2! I know there are people who love these things, but the readability is unforgivable. Only Perl code could make it look good. Operator overloading brings out the worst in developers, encourages them to be waaaay too clever for anybody's good. In C++, the evil started with
cout << "Hello world!";
(what the hell were they thinking?!) and went downhill from there. Once you open the door to crap like that, the crap will come.

Years ago, I was at a forum with Josh Bloch and Gilad Bracha where a Java numerics guy berated them for not having overloading and asked them to add it. Bracha basically said "over my cold, dead body." I'm with him on that. The greater cause of readable Java trumps the minor benefits of overloading.
Use Fortran 95 by beliavsky · 2004-09-20 10:51 · Score: 2, Interesting

I am a quant who uses Fortran 95 for the things you mentioned -- it has built-in multidimensional arrays, including arrays with complex elements and operator overloading, and it's cross-platform if you write standard Fortran 95 and have compilers for needed platforms. You can compile code to DLLs for use in Excel etc.
Try JNI by miyako · 2004-09-20 10:52 · Score: 2, Insightful

Since you've said that your department has experience with both C++ and Java, have you thought about using the Java Native Interface? JNI basically allows you to use some native methods that you can write in C++ in your Java application. Sun has some good articles on their website about it, and after spending a couple hours with it, it's pretty easy.
This will allow you to make use of a lot of pre-existing C++ code, and to write code in C++ when it turns out to be better at a particular problem, while still using Java for the majority of your application.
I've used JNI extensively for graphics applications (which are heavy on math), where it's either much faster in C++ (yes yes, java is much faster than it used to be, but sometimes much faster still isn't quite fast enough), or just much easier to solve a given problem in C++, even though Java is the best choice for most of the application.
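
On the Java side, the approach described above might look like the following minimal sketch. The library name "fastdot", the class FastDot, and the method dotNative are all hypothetical; no native code accompanies this example, so it falls back to a pure Java loop whenever the library cannot be loaded.

```java
/** Sketch of a JNI entry point with a pure Java fallback. */
final class FastDot {
    // "fastdot" is a hypothetical library name; nothing ships with this sketch.
    private static final boolean NATIVE_LOADED = loadNative();

    private static boolean loadNative() {
        try {
            System.loadLibrary("fastdot");
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false; // library missing or not permitted: use the Java path
        }
    }

    // Would be implemented in C or C++ as Java_FastDot_dotNative (assumed, not provided here).
    private static native double dotNative(double[] x, double[] y);

    /** Dot product that uses the native routine only when it is available. */
    static double dot(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        if (NATIVE_LOADED) {
            return dotNative(x, y);
        }
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(dot(new double[] {1, 2, 3}, new double[] {4, 5, 6})); // 32.0
    }
}
```

Keeping the native call behind one small static method means the rest of the application never sees whether the C++ or the Java path ran.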

--
Famous Last Words: "hmm...wikipedia says it's edible"
Use Nice! by bonniot · 2004-09-20 12:04 · Score: 4, Interesting

You could use Nice, which has operator overloading, generates Java bytecode, and allows you to give a syntactically pleasing interface to existing libraries. For instance, supposing there is a Matrix.times(Matrix) method in the Jama package, you could declare in Nice:
import Jama.*;
Matrix `*`(Matrix, Matrix) = native Matrix.times(Matrix);
Then you can write m1 * m2 and that will call the times method.
You can also use Eclipse, JUnit and Ant with Nice. Don't hesitate to ask for help on the nice-info mailing list.

--
Watch great movie opening scenes!
Re: No, Operator Overloading is a BAD THING by Anonymous Coward · 2004-09-20 12:55 · Score: 2, Informative

Actually, it would seem that YOU'RE the one that doesn't know about Java. The operations you perform with A mutate A, and after the mutations, you clone it. In your example, you'll end up with

c.equals(a) == true

presuming that you have implemented equals correctly, of course
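
The disagreement hinges on whether multiply() and add() change their receiver. A minimal sketch of the mutating reading, assuming a hypothetical MutableNumber class whose arithmetic methods update the object in place and return it (the original CrazyObjectNumber was never shown, so this is only one possible interpretation):

```java
/** Hypothetical number type whose arithmetic methods mutate the receiver. */
final class MutableNumber implements Cloneable {
    private long value;

    MutableNumber(long value) {
        this.value = value;
    }

    /** Multiplies in place and returns this same object. */
    MutableNumber multiply(MutableNumber other) {
        value *= other.value;
        return this;
    }

    /** Adds in place and returns this same object. */
    MutableNumber add(long addend) {
        value += addend;
        return this;
    }

    long value() {
        return value;
    }

    @Override
    public MutableNumber clone() {
        try {
            return (MutableNumber) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class is Cloneable
        }
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableNumber && ((MutableNumber) o).value == value;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(value);
    }

    public static void main(String[] args) {
        MutableNumber a = new MutableNumber(6);
        MutableNumber b = new MutableNumber(7);
        // The line under debate, written against the mutating class.
        MutableNumber c = ((a = a.multiply(b)).add(53)).clone();
        System.out.println(c.equals(a)); // true: a itself now holds 95
        System.out.println(a.value());   // 95
    }
}
```

Under this reading the clone() only prevents later changes to a from showing up in c; it does nothing to preserve the original value of a.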
No religion wars, please by insac · 2004-09-20 19:36 · Score: 2, Insightful

Could we please try to list why "operator overloading" is such a troublesome feature?

The statement "since it is so easy to misuse" doesn't count: I'd like to know WHY it is so easy to misuse.

The statement "you'd better use other languages for mathematical calculus" doesn't apply either: I'm in a financial project and we use Java, and there are some pretty complicated expressions even in economics.

The statement "I used it in C++ and it was a mess" is also not appropriate as an answer to my question: if Java will ever consider this feature, there's no reason why it should copy the C++ style.

On the other side (the operator overloading fans):
The statement "I'm not going to" doesn't apply: your colleague could, and you would kill him after tracing a bug.

The statement "The expressive power of this feature is more important than the possible misuses" doesn't apply either: Java tries to avoid misuses by forcing programmers to behave properly and we should respect this philosophy (not meaning I'm against the feature.. only I'd like to have it without the major cons)

My opinion:
"Why is it so easy to misuse and so hard to maintain?"

1) At first glance you could not realize if the symbol "+" is a simple primitive "sum" or a more elaborate object operator

2) sometimes the notation is simply "out of this world" (ehm... meaning "not natural" :-)

Example: (let me write in pseudo-Java)
Vector v1=new Vector();
Vector v2=new Vector();
Vector v3=v1+v2; //ok, concatenation of the 2 vectors
Vector v4=v2-v1; //what the hell does that mean?

3) if we choose a C++-like implementation we could have a "operator+" (-/*) method that has its own implementation (possibly different from add() or any other method in the class)

4) if we choose a C++-like implementation we wouldn't have just one place to look at to understand the meaning of operators (they could be overloaded twice or more times in the class hierarchy)

Any other reason in your opinion?
Then, when we have all the reasons listed, we could consider whether there is a way (compatible with "Java guidelines") to add this feature without incurring all these misuse problems.

If we (or Sun :-) can't find such a way, or if it is not justified by the advantages (cleaner syntax in economics and mathematical expressions), then we would not ask for it...

Last thing:
you can vote for this feature (or stand against it) at this URL (registration needed) http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4905919

--
This message doesn't need a sig
No, operator overloading is a GOOD THING! by Anonymous+Brave+Guy · 2004-09-21 00:07 · Score: 2, Insightful
Here are a few reasons not to do it.
- Developer productivity, in terms of finished lines of code produced per day, is remarkably consistent across programming languages. If you insist that a trivial expression be written as five lines of crap, you just reduced your developer productivity to 20% of what it was. (Before anybody flames, please read the research. Google is your friend.)
- Replacing a simple and transparent expression with five lines of crap makes the code vastly harder to read. There is far more scope for introducing no-brainer bugs, and it will be far harder for anyone reviewing the code to identify and remove them.
- In many languages, using a consistent syntax to represent the same logical operations allows you to write generic code that can work on all types supporting that syntax. In most parts of the programming world, we call addition "+" and multiplication "*". Pointless diversity just hinders code reuse in one of the few areas where it actually is more than just a buzzphrase.
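
Java has no way to abstract over "+" itself, so the closest equivalent of the generic code described in the last point is an interface that each numeric type opts into. A minimal sketch, in which the Addable interface, the Fraction class, and the Sums helper are all hypothetical names invented for illustration:

```java
import java.util.List;

/** Hypothetical interface standing in for a shared "+" across numeric types. */
interface Addable<T> {
    T plus(T other);
}

/** Small immutable fraction, kept in lowest terms with a positive denominator. */
final class Fraction implements Addable<Fraction> {
    final long num;
    final long den;

    Fraction(long num, long den) {
        if (den == 0) {
            throw new IllegalArgumentException("Denominator must not be zero");
        }
        long g = gcd(Math.abs(num), Math.abs(den));
        if (den < 0) {
            g = -g; // dividing by a negative g flips both signs
        }
        this.num = num / g;
        this.den = den / g;
    }

    private static long gcd(long a, long b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    @Override
    public Fraction plus(Fraction other) {
        return new Fraction(num * other.den + other.num * den, den * other.den);
    }
}

final class Sums {
    /** Generic summation that works for any type implementing Addable. */
    static <T extends Addable<T>> T sum(T zero, List<T> items) {
        T total = zero;
        for (T item : items) {
            total = total.plus(item);
        }
        return total;
    }
}
```

The catch is that built-in types such as double and library types such as BigDecimal do not implement any such interface, so each one needs a wrapper or adapter before the generic code can touch it.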
We use high level languages instead of assembler because they let us work at varying levels of abstraction, keeping what we're doing relatively simple at each level and delegating the details to the levels below. That makes for more readable, less error-prone code. What you're advocating is the very antithesis of this approach; if you're going to be that clumsy, you might as well write in assembler. In fact, on reflection, that would be neater...
--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
cout << "Design error"; by Anonymous+Brave+Guy · 2004-09-21 00:12 · Score: 2, Insightful

The problem with using << for C++ I/O streams isn't really the use of operator overloading, it's the fact that it puts into code what should be data: the order of the terms to be output. As anyone who's worked with internationalised code much can tell you, that's a "D'oh!" mistake.

As for readability, I write serious maths software using C++. We already use complex matrix multiplication expressions and the like, which are hard enough to read already when you're constrained to a textual representation. From a numerical programmer's perspective, you can have my overloaded (and highly readable) operators over my code, dead body.

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
Re:Okay, mom.... by wirde · 2004-09-21 02:38 · Score: 2, Insightful

C is *not* a functional language
http://www.google.com/search?hl=en&lr=&ie=UTF-8&oi=defmore&q=define:functional+language

--
in GNUin GNUin GNUin GNUin GNUin GNUin GNUin GNUSegmentation fault
Jakarta Commons Math & other stuff by andyverbunt · 2004-09-21 06:04 · Score: 2, Informative

First, you might be interested in Jakarta Commons Math, which is about to release version 1.0 : http://jakarta.apache.org/commons/math/index.html

Secondly, I'd probably consider isolating all the formulas and then put them aside somewhere (XML, database, ...) in a human-readable format.
Then make a parser that can read that format (i.e. using the libraries you mentioned), substitute variables, and calculate a result. The advantages that I see:
1) you centralize all numerical stuff
2) in a readable format
3) so operator overloading (or the lack of) will only bother you in the parser
4) it will be easy to change or add formulas without having to recompile everything
5) easy to write tests (junit)
6) easier to change underlying math-libraries without affecting the rest of your code.
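
As a rough illustration of the parser idea, here is a minimal recursive-descent evaluator that reads a formula from a string and substitutes variables from a map. The class name FormulaEvaluator, the tiny grammar, and the use of double arithmetic are assumptions made for this sketch; a real version would delegate the arithmetic to whichever math library is in use.

```java
import java.util.Map;

/** Minimal evaluator for formulas such as "(b + c) * (d / e)". */
final class FormulaEvaluator {
    private final String src;
    private final Map<String, Double> vars;
    private int pos;

    private FormulaEvaluator(String src, Map<String, Double> vars) {
        this.src = src;
        this.vars = vars;
    }

    static double evaluate(String formula, Map<String, Double> variables) {
        FormulaEvaluator parser = new FormulaEvaluator(formula, variables);
        double result = parser.expression();
        parser.skipSpaces();
        if (parser.pos != parser.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return result;
    }

    // expression := term (('+' | '-') term)*
    private double expression() {
        double value = term();
        while (true) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    // term := factor (('*' | '/') factor)*
    private double term() {
        double value = factor();
        while (true) {
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }

    // factor := '-' factor | '(' expression ')' | number | variable
    private double factor() {
        if (accept('-')) return -factor();
        if (accept('(')) {
            double value = expression();
            if (!accept(')')) throw new IllegalArgumentException("Missing ')' at position " + pos);
            return value;
        }
        skipSpaces();
        int start = pos;
        while (pos < src.length() && isTokenChar(src.charAt(pos))) pos++;
        if (start == pos) throw new IllegalArgumentException("Expected a value at position " + pos);
        String token = src.substring(start, pos);
        char first = token.charAt(0);
        if (Character.isDigit(first) || first == '.') return Double.parseDouble(token);
        Double bound = vars.get(token);
        if (bound == null) throw new IllegalArgumentException("Unknown variable: " + token);
        return bound;
    }

    private static boolean isTokenChar(char ch) {
        return Character.isLetterOrDigit(ch) || ch == '.' || ch == '_';
    }

    private boolean accept(char expected) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == expected) {
            pos++;
            return true;
        }
        return false;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }
}
```

A formula stored as the text "(b + c) * (d / e)" can then be evaluated with FormulaEvaluator.evaluate(text, values), where values maps each variable name to its current number.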
not just operator overloading by jeif1k · 2004-09-21 20:02 · Score: 2, Insightful

In addition to the lack of operator overloading, there are other problems with Java for numerical computing. For example, it doesn't have "complex" and similar data types, and it has no means by which you can define them yourself efficiently either. Also, Java does not have true multidimensional arrays.
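
One common workaround for the multidimensional array point is to store a matrix in a single flat array and compute the index by hand, since a Java double[][] is an array of separately allocated row arrays. A minimal sketch, with the class name FlatMatrix and its methods invented for illustration:

```java
/** Dense matrix kept in one flat row-major array instead of a double[][]. */
final class FlatMatrix {
    private final int rows;
    private final int cols;
    private final double[] data;

    FlatMatrix(int rows, int cols) {
        if (rows <= 0 || cols <= 0) {
            throw new IllegalArgumentException("Dimensions must be positive");
        }
        this.rows = rows;
        this.cols = cols;
        this.data = new double[rows * cols];
    }

    /** Builds a matrix from values listed row by row. */
    static FlatMatrix of(int rows, int cols, double... values) {
        FlatMatrix m = new FlatMatrix(rows, cols);
        if (values.length != m.data.length) {
            throw new IllegalArgumentException("Expected " + m.data.length + " values");
        }
        System.arraycopy(values, 0, m.data, 0, values.length);
        return m;
    }

    double get(int row, int col) {
        return data[index(row, col)];
    }

    void set(int row, int col, double value) {
        data[index(row, col)] = value;
    }

    // Element (row, col) lives at offset row * cols + col.
    private int index(int row, int col) {
        if (row < 0 || row >= rows || col < 0 || col >= cols) {
            throw new IndexOutOfBoundsException("(" + row + ", " + col + ")");
        }
        return row * cols + col;
    }

    /** Returns the matrix product this * other as a new matrix. */
    FlatMatrix times(FlatMatrix other) {
        if (cols != other.rows) {
            throw new IllegalArgumentException("Dimension mismatch");
        }
        FlatMatrix result = new FlatMatrix(rows, other.cols);
        for (int i = 0; i < rows; i++) {
            for (int k = 0; k < cols; k++) {
                double aik = data[i * cols + k];
                for (int j = 0; j < other.cols; j++) {
                    result.data[i * other.cols + j] += aik * other.data[k * other.cols + j];
                }
            }
        }
        return result;
    }
}
```

The same trick is often used for complex data, by interleaving real and imaginary parts in one double[] so that no object is allocated per element.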

The C# language is considerably better for numerical computing than Java. However, C# implementations are still a bit behind Java implementations (although they seem to be catching up fast).

I would recommend sticking with C++ for now and waiting another year to switch to something else. C# will probably mature to the point where it is a reasonable choice.
Screw the language wars ... by ManikSurtani · 2004-09-21 20:52 · Score: 2, Interesting

Here's something that's potentially useful.

Jakarta's Commons Math library (http://jakarta.apache.org/commons/math/) has some interesting classes (including handling of complex numbers and lots of statistical stuff). I haven't used it in anger and hence do not know the extent of their support for the features you are looking for, but it is a good start. It is also designed to be a lot faster than Sun's math APIs.

And yes, they're all objects and there is no operator overloading. And I echo the earlier sentiment that this is a Good Thing in general.

--
-- Manik Surtani
I disagree completely with this statement by fenris_23 · 2004-09-22 05:57 · Score: 2, Insightful

From a mathematical point of view, the prefix notation represented by a function with arguments makes much more sense than the infix notation represented by operator overloading.

Operator overloading only makes sense in a small number of cases where the class you are developing only provides binary and unary operations. There are many more cases where a function should be ternary or more. In these cases, you have to abandon operator overloading and use the same functional notation anyway. Also, sometimes operator overloading doesn't even make sense. E.g. How do I overload the * operator for vectors?

In the end, a library like that which uses operator overloading is going to have to use an inconsistent representation for operations, which is just ugly, imo.
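
The vector question can be made concrete: at least three different products are plausible candidates for "*", and named methods keep them apart. A minimal sketch, assuming a hypothetical immutable Vec3 class (the method names dot, cross, and hadamard are chosen for illustration):

```java
/** Hypothetical immutable 3-vector with one named method per kind of product. */
final class Vec3 {
    final double x;
    final double y;
    final double z;

    Vec3(double x, double y, double z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    /** Scalar (inner) product. */
    double dot(Vec3 o) {
        return x * o.x + y * o.y + z * o.z;
    }

    /** Vector (cross) product. */
    Vec3 cross(Vec3 o) {
        return new Vec3(
                y * o.z - z * o.y,
                z * o.x - x * o.z,
                x * o.y - y * o.x);
    }

    /** Element-wise (Hadamard) product. */
    Vec3 hadamard(Vec3 o) {
        return new Vec3(x * o.x, y * o.y, z * o.z);
    }

    public static void main(String[] args) {
        Vec3 a = new Vec3(1, 2, 3);
        Vec3 b = new Vec3(4, 5, 6);
        System.out.println(a.dot(b)); // 1*4 + 2*5 + 3*6 = 32.0
        Vec3 c = a.cross(b);
        System.out.println(c.x + ", " + c.y + ", " + c.z);
    }
}
```

A single overloaded "*" would have to pick one of these three, and readers would need to look up which.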