OCaml vs. C++ for Dynamic Programming

← Back to Stories (view on slashdot.org)

OCaml vs. C++ for Dynamic Programming

Posted by timothy on Monday March 14, 2005 @11:35AM from the picking-a-fight dept.

jcr13 writes "OCaml is nearly as fast (or sometimes even faster) than C, right? At least according to the Computer Language Shootout [alternate] (OCaml supporters often point to these shootout results). My results on a real-world programming problem (optimizing a garden layout using dynamic programming) disagree. On one particular problem instance (a garden of size 7x3), my C++ implementation finished in 1 second, while the OCaml implementation was still running after 16 minutes. Bear in mind that my OCaml implementation was dramatically faster than my equivalent Haskell code. It seems that if you program using a functional style in OCaml (which I did, using map, filter, and other recursive structures in place of loops), it is quite slow. However, most of the shootout OCaml programs rely heavily on OCaml's imperative features (unlike Haskell, OCaml doesn't force you to be a functional purist). If you write OCaml code that is isomorphic to C code, it will be fast---what about if you use OCaml the way it was meant to be used?"

11 of 161 comments (clear)

Min score:

Reason:

Sort:

Re:Hmmm ... by crmartin · 2005-03-14 12:30 · Score: 2, Informative

Look down a little farther ... I commented specifically on his implementation.

But here's another hint: the fact that there's a fuinction that says "memoize" doesn't necessarily mean it works.
Re:Hmmm ... by Anonymous Coward · 2005-03-14 13:59 · Score: 1, Informative

How long does it take for one to learn that you should be humble enough to make sure the program you wrote even halts before publicly complaining and posting results?

Furthermore, looking at the ml code shows some questionable practices. He reinvents the map function, which isn't nessecary and could cause relative slowdowns if the ocaml library version has been optimized. It shouldn't take 16 minutes, though. I don't have an ocaml compiler installed at the moment, so I can't really verify and delve into why his program is going so slowly.

Certainly, his adventure is a testament to how difficult it can be to migrate from one language to another, if not a testatment to the poor performance of Caml.
Haskell code? by Pseudonym · 2005-03-14 16:00 · Score: 4, Informative

Can we see your Haskell code?

Haskell is not known for raw speed, but dynamic programming is probably the one thing it does well, thanks to lazy evaluation. You fill a CAF with unevaluated function calls, and the language engine does the rest. It won't be as fast as the hand-crafted C++ version, most likely, but if your O'Caml code is anything to go by, it might be able to be improved.

--
sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
Your problem(s) by Anonymous Coward · 2005-03-14 16:35 · Score: 0, Informative

Your problem is that you don't understand how OCaml was meant to be used. We do not have a variety of data structures and imperative features so you can go use lists when arrays are appropriate. This is not Haskell. Furthermore, you redefine numerous standard library functions... and not only that, your versions aren't tail recursive? Why would you do such a silly thing? No wonder your code is so long.

Basically what you need to do is sit down, stop being such a dumbass, read the manual, and don't pretend your false conceptions of how OCaml is "meant to be used" (your code is stylistically and algorithmically awful) are in any way relevant to its actual performance. If you wrote similarly stupid code in C++, it would be just as slow.

You can actually get a constant 8x speedup if you fix your memoization, but the real problem lies in the algorithm itself. Unfortunately, the code is unsalvageably bad.
It's always possible to tune your inner loops by Tom7 · 2005-03-14 17:00 · Score: 4, Informative

One of the joys of programming in ML is that you can write most of your code in a really nice, functional way, and (if necessary) put in the effort to write a tight inner loop, perhaps in an imperative style for speed. I don't see this as a disadvantage, and ML compilers often do a better job of optimizing such loops than C compilers, in part because of more information being available in the type system. (And if they don't, it's trivial to invoke C subroutines.) Also, if performance is really an issue, you might try mlton (which is for SML, very similar to Caml); its whole-program approach often produces significantly better code than O'Caml.

However, as an every-day ML user I find it very unlikely that your program would be a thousand times slower if you're using it "the way it's meant to be used." I am guessing that your implementation is asymptotically worse, since using map and fold correctly should really only be a constant factor slower than C, at worst. (mlton can often inline and optimize these into essentially the same code you'd write in C!) How about posting your code?
Re:My realworld results differ by ijones · 2005-03-14 17:54 · Score: 5, Informative

It's not actually the case that Haskell "forces" functional purity, at least not in the way the submitter seems to think. You can do things that are a LOT like non-pure functions, you just have to use Monads. You have the so-called "unsafe" functions, which perform side-effects in otherwise "pure" functions.
So you might ask, "If you're going to write code like that in Haskell, why not just use C++." The answer is because even when using Haskell in a non-idiomatic way, Haskell is still more beautiful :)
Monads are a means of threading "stateful" code in a very clean and predictable way through your programs. The parent's comment, "Very localized state (in one function) is easy enough, but anything more pervasive and you soon become more familiar with monads than you ever wanted to be," is sorta like saying, "You can write high-level code in C++, but you will soon become more familiar with objects than you ever wanted to be."
They are indeed a part of the language, and definitely a new concept, but monads aren't nearly as confusing as people seem to think, certainly not more confusing than objects, it's just a reputation issue that makes people think monads are confusing. Take it from a random-joe hacker like me. You don't need a PhD to perform IO in Haskell.
For instance, here's a basic implementation of 'cat' in Haskell:
import System.Environment(getArgs) main = do {a <- getArgs; lines <- mapM readFile a; putStr (concat lines);}

The code:
a <- b
is similar to assignment.
getArgs just reads in the command-line arguments as a list, so 'a' represents a list of the filenames.
readFile takes a file name, reads the contents, and returns it as a list of lines.
mapM means 'perform this computation once for each item in this list'
putStr is obvious, concat just takes a list of lists and turns it into a single list.
There's a paper, Tackling the awkward squad: monadic input/output, concurrency, exceptions, and foreign-language calls in Haskell about how to do these kinds of "real-world" things in Haskell.
There's also a very cool version control system called darcs that's written in Haskell, and recently an implementation of Perl 6 called Pugs in Haskell.

peace,

isaac
Re:OCaml is supposed to be faster? C++ is dynamic? by xenocide2 · 2005-03-14 18:11 · Score: 2, Informative

Ironically, the OCaml version posted uses a list rather than a hash table. I'm busy making dinner right now, but I'm certainly interested in seeing if that can be improved upon.

Perhaps this is a case of problem solving by public contradiction?

--
I Browse at +4 Flamebait
Open Source Sysadmin
Bad string manipulation by vanicat · 2005-03-14 20:40 · Score: 4, Informative

I've not read all the ocaml code, but I've seen at the very begining this:
let rec printCells cs = match cs with | [] -> "" | (c::rest) -> (printCell c) ^ (printCells rest);;
And I know that this program will be slow. The ocaml string concatenation operator copy both string each time it is called, and this concatenation work will take O(n^2) step.

You should use the Buffer module, or String.concat:
let rec printCells cs = String.concat "" (list.map printCell cs)
If there is a lot of those mistake, no wonder it is so slow...
Re:My realworld results differ by Anonymous Coward · 2005-03-15 06:27 · Score: 1, Informative

Ocaml doesn't support any ad-hoc polymorphism (overloading) whatsoever in functions.

This is an annoyance more than anything else... e.g. writing "+." for floating point addition instead of "+"...

Methods on the other hand can be overloaded, but not generic.

Actually, the language _has_ been extended recently to allow a restricted form of generic methods. For example, you can now do fold's using a syntax like this:

class ['a] list l =
object
val mutable _data = l
method fold : 'b . ('b -> 'a -> 'b) -> 'b -> 'b = fun f init = List.fold_left f init _data
end

Having to differentiate between HashTable.insert and SkipList.insert sort of defeats the purpose of abstract types, because no one thought to make module signatures themselves first-class

Modules are not first-class in the language proper, but they are first-class in the functor language. So you can abstract over Hashtable.insert and SkipList.insert by using:

module type DICTIONARY =
sig
type t
type key
type value
val create : unit -> t
val insert : t -> key -> value
end

module Foo (D : DICTIONARY) =
struct
let foo () =
let dict = D.create () in
D.insert dict ....
end

This isn't as powerful as what Alice does (the core language and functor language are more closely integrated in Alice), but, of course, you could get similar effects by using objects in OCaml...

Ocaml isn't seeing a lot of new work going into it -- the language definition seems to have become cast in stone.

Not quite: generic methods are relatively new (added, I think, last year). The functor language has also been extended a bit in the last little while...
Re:Email response by Fahrenheit+450 · 2005-03-15 06:45 · Score: 4, Informative

Okay... but here's my point: Every single example that shows how elegant Haskell and OCaml are uses lists. The 4-line Quicksort example for Haskell uses lists. All of the code that demonstrates easy reuse of functions and functions taken as arguments uses lists (like how easy it is to implement quite complicated algorithms using only map and filter, for example).

Or perhaps more correctly, "every single example that you've seen". For a real quick one, look at Jason Hickey's Intro to OCaml (pdf) and have a quick peek at his Red/Black tree implementation. Or even cooler (if you're into that sort of thing) is the ever famous One Day Compilers talk.

But what you're saying in #1 above is that in "production," speed-sensitive code, no one is using lists... this would mean that no one is using map, filter, or any other pieces of reusable primitive code. So, they are instead all using mutable data structures... I.e., they are programming with side-effects and loops (random access instead of recursion, even when ever element of an array/list needs to be accessed/processed).

No. What he's saying in that you should use the best data structure for the job. Your best bet would have been to use the Hashtbl module from the standard library, or if you wanted to stay in the purely applicative, the Map module (also in the standard library) would have been loads faster...

You are aware that there are more purely functional data structures (pdf) (OCaml implementations) than the list, don't you?

So... maybe you can re-write higher-order memoization code using more efficient data structures? I would love to see that code, and I'm sure the OCaml community would benefit from having that in their toolbox.

Here's a pretty neat example that uses arrays in a naive way, but you could certainly use, say, a map instead... And I'm pretty sure the OCaml community (by which I mean the people who would have helped you improve your code had you asked them) know about things like this.

I don't think I spread any falsehoods. I mean, my experiment was real, and the results are real, and the code is there for people to inspect and try on their own. I also talked in my /. post about OCaml code that is isomorphic to C being fast, but functional code perhaps not being fast.

Yes. And we inspected it, found it to be poorly written, and told you so. The "falsehood" here is that you claim that code written in a functional style is slow, when you really should have said "my code written in a naively functional style is slow". If I fill my gas tank with water, my car sure is slower than walking, therefore all cars are slower than walking... right?

Trust me... I am *dying* to use OCaml or Haskell for real-world programming. I have spent the past month or so exploring these languages and trying to apply them to real programming problems. Especially when shootout results showed that OCaml was sometimes faster than C, and when I discovered that OCaml was much faster than Hasell, I was really starting to think that OCaml was a possibility.

I put a link to tho OCaml mailing lists above. Use it. Ask questions of the list (you may want to start with the beginner's list). They can help you learn the language faster and better than google will.

However, the ONLY reason why I would want to use OCaml is to take advantage of the expressiveness of pure functional programmin

--
-30-
OCaml Evalation Is Strict, Not Lazy by j+h+woodyatt · 2005-03-15 11:34 · Score: 3, Informative

If you want to program in a functional style, and you need lazy evaluation, you're going to find the standard library that comes with the compiler somewhat limited.

I wrote some extensions for programming in OCaml in the functional style. Check out the OCaml NAE project, and look for the Core Foundation (Cf) package.

--
jhw