Can You Spare A Few Trillion Cycles?
rkeene517 writes "11 years ago I did a simulation of 29 billion photons in a room, which was published at SIGGRAPH 94. The 1994 image, called Photon Soup, is here.
Now computers are 3000 times faster and I am doing it again, only much better: with a smaller aperture, in stereo, with 3 cameras, with some errors fixed, and in Java.
The 1994 image took 100 Sparc Station 1's a month to generate.
I need volunteers to run the program for about a month in the background and/or nights. The program is pure Java." Read on for how you can participate in the project.
"The plan is to run the program on a zillion machines for a month and combine the results. All you have to do is run it and when the deadline arrives, email me a compressed file of the cache directory. So email me here and I'll send you the zip file. The deadline will be June 1st 2004.
The running program has a More CPU/Less CPU button. Every half hour it saves the current state of the film. The longer and more machines that run this, the cleaner and sharper the image gets. If you have a lot of machines, I can give instructions how to combine the results so you can send in a single cache directory.
Of course, you will get mention in the article if it gets published."
Emailed this to the editor, but something must've gone wrong.
The URL to the photo soup image is missing the 'www'. The image can be seen here (you may want to do a 'Save Target As', as the mime-type seems to be a bit off).
Don't sweat it, I got a copy of the linked image.
:-)
The original was silly, too -- it sent back the image as Content-Type: text/plain. I don't understand it at all, but since the server is toast now, it doesn't really matter either way.
Or FORTRAN. FORTRAN is more traditionally used in scientific computing anyways.
The link to the image should be http://www.cpjava.net/raytraces/DRUN.GIF (The www is necessary and was left out of the link in the article.)
People are already cracking jokes about how the fact that it's in Java will mean that it will run a lot slower than it could. While I love to pick on Java as much as the next person, I am curious how much it actually makes a difference for raytracing - does anyone know? My experience with numerically-intensive algorithms is that Java is 2-4x slower than C. You can get it within 2x of the speed of C if you ignore object-oriented programming and you're really good at Java optimization, but that's it. And it will run much slower on some architectures because Java guarantees certain floating-point operation semantics at the expense of speed.
If I were writing a new numerically-intensive program from scratch that I wanted to use for a cross-platform distributed computing project, I'd probably do it in Numerical Python (NumPy) - my experience has been that it can be within a factor of 2-3 of the speed of C, but it's much more concise, requiring half as many lines of code as Java or C to do the same thing. And these days Python is just as cross-platform as Java - it definitely runs great on Mac, Windows, and Unix.
Either I'm suffering deja vu, or this has been posted nearly verbatim before in a previous discussion of Java vs. C.
Astounding.
A much larger version of the SIGGRAPH `94 image "Photon Soup", clocking in at 840x560, can be found HERE.
Your site is returning gibberish on my Mozilla, and here's the wget output...
[snip]
Found www.cpjava.net in host_name_addresses_map (0x8074330)
Registered fd 3 for persistent reuse.
Length: 71,283 [text/plain]
[snip]
Apparently your server is sending out .gifs as plain text files and screwing up browsers.
Numerical Python is great, but not necessarily suitable for this task. It's good when you're performing the same operation on every item of a vector. When the vectors are long enough it does indeed approach the performance of C code. But in ray tracing every photon can take a different route depending on what it hits. I'm not so sure NumPy would perform nearly as well in this case.
Not really that slow, depends on what you're doing. At work we're using a CPU intensive Java based "optimizer" that runs a hybrid Genetic Algorithm. We also have a very similar version that was coded in C++. Chances are that the C++ coders sorta sucked, but the end result has been pretty much the same. The only difference was that our Java application actually DOES run on many platforms, including Windows, SuSE, and MacOS X, without a problem. And I can't say it enough times, it's just as fast as the C++ version that only runs on Windows .
Dude, is your C compiler that bad? I like Java a lot, and use it for compute intensive applications, but I think you're either pretty bad with a C compiler or trolling. If you're doing something CPU intensive in C, you need to use gcc -O2 (or -O3, depending), with -march=cputype. This will allow gcc to generate exactly the same code you just described, since it is not limited to 386 instructions. And if you need even more performance, you can just use Intel's C compiler for a lot of things (non-commercial is free as in beer), though it doesn't support some GNU extensions and I think has trouble with some things like the Linux kernel.
I added "www." to the URL, it works.
Try this...
My experience with numerically-intensive algorithms is that Java is 2-4x slower than C. You can get it within 2x of the speed of C if you ignore object-oriented programming and you're really good at Java optimization, but that's it. And it will run much slower on some architectures because Java guarantees certain floating-point operation semantics at the expense of speed.
The speed difference oft cited is about 20% on numerical apps. Check out http://www.idiom.com/~zilla/Computer/javaCbenchmark.html. He brings up "Benchmarking Java against C and Fortran for Scientific Applications" as well.
You have to remember that Java's speed disadvantage is mainly in the JVM startup and GUI areas. Although a good Java dev team can make Swing fly (check out JBuilder for instance).
Java, being Just-In-Time compiled, can even make runtime optimizations that your C/C++ application may not.
Sorry, but the 80386 has 32 bit stack and move operations. Generally, people compile their programs for the 80386, because almost all optimizations that can be done automatically for the Pentium do not harm performance on the 80386.
If your program has a noticeable performance benefit from using SIMD instructions, you can move the relevant functionality into a shared object, and distribute the program with several versions of it, and dlopen() the correct one at runtime. The absence of programs that actually bother doing this, can serve as an indicator as to how big the performance benefit from SIMD optimizations really is.
Does that explain it better?
Ray-tracing traces rays from the eye towards the lights and uses this to simulate how bright a light you see for each pixel. This is great for some things, but isn't realistic, as many lighting effects can't be simulated. Radiosity fills in some of the gaps, but there's still a lot missing. For example if you shine a light onto a mirror at 45 degrees (specular reflection), it won't make light shine onto a diffuse surface at 90 degrees to the mirror, which it would do in real life.
Photon simulation does the opposite of ray tracing: it traces the paths of photons leaving the light source and calculates where each photon would hit the view plane, if at all. It takes a lot of calculation because you have to send off millions of photons in all directions from the light sources. You can simulate all known light effects this way, just very very slowly. In the image you can see light shining from the light source onto the rear mirror and bouncing off onto the diffuse surfaces at the side.
Both are great for parallel processing because each photon/ray is pretty much independent of others.
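To put a rough number on why the forward approach is so slow, here is a minimal sketch. It assumes a point light at the origin that emits in uniformly random directions and a small circular aperture facing it. The geometry, the function names, and the numbers are all made up for illustration and are not taken from rkeene517's program.

```python
import math
import random

def random_direction(rng):
    """Return a uniformly distributed unit vector on the sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)

def count_aperture_hits(n_photons, distance=1.0, radius=0.05, seed=0):
    """Emit photons from a point light at the origin and count those that
    pass through a disc of the given radius, centred on the z axis in the
    plane z = distance. No bounces are simulated in this toy version."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_photons):
        dx, dy, dz = random_direction(rng)
        if dz <= 0.0:
            continue  # photon is heading away from the aperture plane
        t = distance / dz          # ray parameter where it crosses the plane
        x, y = dx * t, dy * t
        if x * x + y * y <= radius * radius:
            hits += 1
    return hits

def expected_fraction(distance=1.0, radius=0.05):
    """Solid angle of the disc as a fraction of the full sphere."""
    cos_half_angle = distance / math.sqrt(distance ** 2 + radius ** 2)
    return (1.0 - cos_half_angle) / 2.0

if __name__ == "__main__":
    n = 200000
    print("hits:", count_aperture_hits(n), "of", n)
    print("expected fraction:", expected_fraction())
```

With these assumed dimensions well under 0.1% of the emitted photons reach the aperture directly. Each loop iteration is independent of the others, which is what makes the job easy to split across many machines.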
Eh... .NET is natively compiled when it's run for the first time. It also optimizes for the platform (even CPU) it runs on.
Supplies!
The Great Win32 Computer Language Shootout
While Java is not "unacceptably slow" or "1000 times slower" as some claim, it is generally slower than C and much more resource intensive nevertheless.
Actually if one wants to write this kind of math-intensive app, pure Java is really not the best choice. Even pure C isn't. He should think about implementing some of the highly used routines in assembly (no joke). And since photon tracing can be done in parallel, one would find the SSE and 3DNow! families of instructions useful.
And finally, besides the CPU, you can also try to do the calculations in your GPU. You'll need a new-fangled PCI-X card in the future to do the calculations efficiently tho.
Take a look at this site BrookGPU
This presentation explains the problems with Java floating point.
Incidentally, C99 has very nice support for IEEE 754 (improved numerics support was, in fact, one of the biggest additions compared to the old C89 standard).
As founder of the Distributed Hardware Evolution Project, which is written in Java, I'd like to remind you all that the Just-In-Time compiler coupled with the real time profiling and dynamic on-the-fly optimisation that goes on in the Server VM makes the difference between C and Java minimal for code which is in the critical region. This is especially the case for code which is executed over and over again, such as with these distributed processing projects. In fact the guys at Sun are doing such a good job at exploiting the ever more complex characteristics of different processors that Java code is expected to run faster than C in the future. Also, during the weeks that you would spend debugging and porting your C code, your Java code has gone miles ahead doing useful stuff! If you would like to start your own Java distributed processing project, DistrIT might help.
1 Month on 100 sparcs? Peanuts! In my research simulations usually take (depending on the problem) up to 6 months on an average of 150 workstations (and some runs on large clusters). You wonder what I do? Spin glasses!
Spin glasses are systems in which the interactions between magnetic moments are in conflict with each other. These competing interactions make these systems extremely hard to simulate at low enough temperatures. If you have a Linux box sitting around idle which is fast enough, let me know and I will provide you with some samples to run. Current project: 100 - 300 samples, each takes ~ 10 days on a 2.4 GHz Xeon... For information on how to contact me, go to duamutef.ethz.ch. Of course your name will be mentioned if you compute a considerable number of samples!
You really should catch up on current technology. Java is only interpreted at program startup. It is compiled down to native code just like C++ is, except the compilation happens at runtime when there is more information available (e.g. which method calls are really virtual and which are not). So, yes, the first couple of seconds of your program are interpreted but the rest of the time the speed will be just as fast as C++, sometimes faster.
Hiya, this would be great. I'm currently about to extend DistrIT to make a system so that researchers can book CPU time from all the unused cycles of my Department's PCs. It could also be extended quite easily to do what you're saying, and have a central server where you get credits for donating your CPU cycles and then you can upload your processing task and get other people in the pool to run it.
Ok. My assembly is a little rusty, so bear with me. Let's say we have equivalent Java and C programs. They both have to run on a 386 or higher. (Bear with me. I haven't kept up with the MMX/SSE/SSE2 instructions, so I'll have to fake this a little.) Now, your C compiler will see that you want to store a 32 bit value, but has to generate code for a 386. So, it generates the code:
pop AX
STOSW 0x0005
pop AX
STOSW 0x0005
Even though the code may be running on a Pentium Pro (which is optimized for 32 bit code), it's still going to execute those 4 statements.
Now, the Java Hotspot compiler will start and notice the fact that you're running on a Pentium Pro. So when it converts the bytecode to machine code, it creates the following instructions:
pop EAX
STOD 0x0005
That's twice as fast as the C code!
Real code would tend to be running on modern processors, so this example is a little contrived. However, the JVM can (and will) use SSE instructions to do multiple calculations in one instruction, while the C code will be forced to generate non-SSE instructions to support the old Pentium Is out there.
Hotspot is also capable of analyzing the running code and regenerating even better assembly that would perform poorly in other circumstances. For example, let's say Hotspot notices that the bounds can't be exceeded on an array. Well, Hotspot will then recompile to remove the bounds checking.
Does that explain it better?!
"Now, your C compiler will see that you want to store a 32 bit value, but has to generate code for a 386. So, it generates the code:"
Compilers are quite capable of finding out or being told what architecture to compile for, and this includes the various x86 types. Sorry, but your argument is invalid.
The parent is funny because exaggeration is funny. Java isn't really slow anymore. Most of its slowness is attributed to load times, but that's the price for garbage collection and the like. When it comes down to it, Java can get things done. C is still faster in most cases, no doubt, but with today's processors, a mouse could starve on the difference.
This might have been a problem back in the day, but today most computers are fast enough to make flac/oggvorbis/mp3's quicker than you'll be able to transfer the .wavs to a faster computer, let alone compress them and then transfer them back.
Most (all?) universities already have an authentication-system in place that is used on campus-computers, both for local and remote login. This can be applied to grid-computing, too. That way you can punish those who abuse resources like you already do if somebody decides to convert a couple of your loginservers into a CS/HL/Tetrinet/*-gameserver..
At my university, the charge for CPU cycles on the high-performance clusters isn't really related to the actual cost of cycles, as these are dirt cheap today, but to the cost of administering the system. In the case of using desktop/login machines for grid computing this should already be covered by the "day job" of these computers.
The parent post is stolen, word-for-word from this post by SharpFang (651121).
It was stolen via the anti-slash.org database
Mod parent down.
He's mapping out millions of independent random photons. A handful of stray photons out of millions would have no visible effect. It's not a problem unless you submit a pretty massive number of bogus results. And if the effect was noticeable it could probably be traced back to your corrupt batch and simply tossed. Tossing data does not corrupt the result, it just makes the final image a tiny bit dimmer.
His argument isn't invalid.
rkeene517 is asking for volunteers to run the program. If rkeene517 does it in C/C++ then rkeene will have to compile for the different x86 types and ensure that volunteers download the right binary. From experience you end up getting a high number of people getting the generic x86 binary instead of the optimized one because in order to avoid zillions of support queries, you have a "If you are unsure, click on i386".
Of course you could bundle all the various binaries and add code or a binary that figures out what x86 it is and runs the relevant binary.
But that involves a step more than what you suggest.
His argument is quite valid.
Sure, you can configure compilers as narrowly as you like, but in most cases, compilation will be targeted at the lowest common denominator.
If you're compiling for yourself, you have the luxury of building for your own CPU. This isn't the case here.
Why do you think Linux binary RPMs, for years, were compiled for 386 chips? It's only recently that some (all?) distributions have started distributing 586-based RPMs.
The point is, Java can make this decision at run-time, and hence target the actual CPU. C++ code can not (without a lot of pain, at least).
Use SWT instead of Swing and you won't notice any difference (on mac and windows) and little difference on GTK (they're working on it ;)
Startup time is still an issue, but for most apps it's not much of a problem.
Everyone works on the whole picture at once, but simulates different randomly-emitted photons. All you need is a different random seed for each client, which is trivial to manage.
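A minimal sketch of that scheme, assuming each client's result is just a grid of per-pixel photon counts. The tiny film size, the function names, and the stand-in "simulation" are hypothetical; the real program's cache format is not described in this thread.

```python
import random

WIDTH, HEIGHT = 8, 6  # tiny hypothetical "film", for illustration only

def simulate_client(seed, n_photons):
    """Stand-in for one volunteer's run: every photon lands on a random
    pixel. A real client would trace each photon through the scene."""
    rng = random.Random(seed)  # the per-client seed is the only coordination needed
    film = [[0] * WIDTH for _ in range(HEIGHT)]
    for _ in range(n_photons):
        film[rng.randrange(HEIGHT)][rng.randrange(WIDTH)] += 1
    return film

def combine(films):
    """Sum the per-pixel photon counts from every submission."""
    total = [[0] * WIDTH for _ in range(HEIGHT)]
    for film in films:
        for y in range(HEIGHT):
            for x in range(WIDTH):
                total[y][x] += film[y][x]
    return total

films = [simulate_client(seed, 1000) for seed in (1, 2, 3)]
total = combine(films)
print(sum(map(sum, total)), "photons on the combined film")
```

Because the combination is a plain sum, leaving out one submission only lowers the counts, and running the same seed twice would just double-count the same photons instead of adding new information.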
Use JDK 1.4 or 1.5 on Windows and use the RC1 of the Blackdown JDK 1.4.2 on Linux and then revert your statement :-)
Over here on my machine pure Swing on Blackdown 1.4.2RC1 is faster than GTK2 but a tad slower than Qt.
On Windows there is no noticeable difference between Swing and native controls anymore, speed-wise.
Both use hardware acceleration functions of the underlying graphics card, so no big deal here anymore.
Now: assuming the LED is 100% efficient (close enough) and that your pupils are (eg.) 1 square cm at (eg) 1m from the source, you can calculate how many photons / second were entering your eye.
Of course we were young (hmm, 17?) and this is only an order of magnitude experiment, but most people could go down to somewhere around 1 - 10 photons / second.
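The calculation can be sketched as follows. The LED power and wavelength are assumptions, since the comment does not give them, and the LED is treated as radiating equally in all directions. Only the 1 square cm pupil and the 1 m distance come from the comment.

```python
import math

H = 6.62607015e-34  # Planck constant, J*s
C = 2.99792458e8    # speed of light, m/s

def photons_per_second_into_eye(power_w, wavelength_m, pupil_area_m2, distance_m):
    """Photon rate through a pupil facing an isotropic, 100% efficient source."""
    photon_energy = H * C / wavelength_m          # energy per photon, in joules
    emitted_per_second = power_w / photon_energy  # photons leaving the source
    # Fraction of the sphere of radius distance_m that the pupil covers.
    fraction = pupil_area_m2 / (4.0 * math.pi * distance_m ** 2)
    return emitted_per_second * fraction

# Assumed values: a 1 mW red LED at 650 nm; pupil of 1 cm^2 at 1 m.
rate = photons_per_second_into_eye(1e-3, 650e-9, 1e-4, 1.0)
print(f"{rate:.2e} photons per second")
```

Under these assumptions the rate is on the order of 10^10 photons per second, so getting down to a few photons per second means cutting the light output by roughly ten orders of magnitude.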
Photon Simulation: "forward ray tracing". Emit photons from light sources, have them bounce off scenery, and see where they hit the eye.
Ray tracing: "reverse ray tracing". Emit "sight rays" from the eye, have them bounce off scenery, and see where they hit light sources.
That's simplifying, of course.
Forward ray tracing was here first, but was quickly (all but) abandoned, since it is computationally way more intense (why, 10 years ago, it would have taken 100 Sparc Station 1's an entire month just to calculate one small image!). Imagine simulating a zillion photons just to discover that over 0.99 zillion had failed to reach the eye...
On the other hand, it is physically more correct, since effects like caustics are very difficult to do right in "reverse" ray tracing.
Sure, it's hard to do by hand. But I know of at least one compiler (Intel's C++ compiler) that can generate multiple versions of assembly code and dispatch on processor type at runtime. That compiler is much better than Visual Studio for numerical computation.
here
several things....
1) Programmer time goes for about $50/hr. That means that a programmer spending 20+ extra hours could have pretty easily bought you an extra dual CPU computer to run the thing on. That's only 2-3 days of work. Java can easily shave off much more than that.
2) Java isn't that slow. Depends on what you're doing and how you're doing it, but it's not crazy to get java to be less than 50% slower than C. It's also not really uncommon for it to be faster. When it is faster it's almost always because better algorithms are used, but that isn't an accident. It's much easier to write good algorithms with a garbage collector sometimes, as you don't have to track down and delete all the stuff you unlinked.
3) The one weakness of java (until VM sharing becomes available) is memory usage, but memory is really cheap now, same basic logic as CPU time, but even more so.
4) Lots of additional optimizations are possible in VM based languages that aren't tried by any modern VM. When they start to come online, expect the performance of the VMs to surpass compiled code. Here are some examples....
a) Escape analysis: all stack frame scoped data goes on the stack. Basically it makes optimal use of the stack, can't really be improved any. This is why C#'s "value" types are so stupid, they shouldn't be able to help (and would probably hurt) a good VM. Anything larger than a pointer should be a reference, the VM can put it on the stack if it's possible to do so.
b) Method virtualization: a good VM should strip down pretty much all of the V-tables and just regenerate what it actually uses. This is why the "virtual" keyword in C# is so stupid, it should have no effect on performance assuming a smart VM. Can also do all sorts of inlining that a normal compiler can't do (someone could link to your library, you can't inline away public functions).
c) "incorrect" optimizations: The VM can create optimal code that is not actually a valid representation of the given code for all inputs. Can then revert the code if an input is given for which it is not valid.
d) Profiling: a VM can (and modern ones do) profile the code and optimize the common cases at the expense of the uncommon ones.
e) Hardware knowledge: a VM can always produce code that is optimal for exactly your hardware, right down to cache sizes, processor model, and memory latency.
Just thought I'd throw these things in. Those who expect VM based languages to always be slower will probably be in for a shock in the future. Remember, the cost of compilation is basically constant whereas the payoff from optimizations is linear in CPU speed. At some point the optimizations will exceed the cost of compilation. It's only a matter of time.
For instance, let's say you have an interface I, and a class X that implements I. If X is the _only_ implementation of I loaded at the moment, then all calls to methods on I can be direct, non-virtual calls because there's only one choice! In fact, HotSpot will even inline the method calls if it decides it will be beneficial.
But then a class B that also implements I is loaded. HotSpot will then de-optimize the inlined and direct calls to methods on I.
There are many more examples, such as loop bounds-checking elimination, and other things HotSpot can do because it sees the state of the running system.
If you've used a slow Java program, it's no doubt the result of a poor design and coding job by the programmer. "I'll just pick up Java for Dummies in 24 Minutes. Now I'm a 1337 j4v4 h4x0r!!" You may also have been using an old, slow JVM. The performance increases between Java 1.2, 1.3, and 1.4 are truly awesome. Also, Sun's Java 1.5 starts up on my machine in less than half the time that 1.4.2 did, and the graphics are OpenGL accelerated now, ... the list goes on and on. For anyone who has used a Java IDE, especially NetBeans/Forte (which I like, except that it's so freakin' slow I fall asleep between operations), you must try IntelliJ IDEA. It is so responsive and just a joy to use. On the systems I've run it on, it is significantly more responsive than Eclipse.
A reasonable explanation. There is a newer technique called photon mapping too. Photons are emitted from the light sources much like this guy is doing. However, each time a photon hits a surface, the impact is stored in a big data structure - the photon may then be reflected depending on surface properties. This "photon map" is view independent. Rendering an image then consists of doing ray tracing from the eye into the scene, but to calculate surface illumination where the ray hits a surface, you use the photon map. This avoids throwing away the 99.9% of photons that don't strike the eye. It can be made physically correct as well. BTW to see how fast regular ray tracing has become check out rtChess.
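A toy sketch of the lookup step, to make the idea concrete. The "photon tracing" here is replaced by scattering photons uniformly over a unit-square floor, and the map is a plain list searched by sorting. Both are simplifying assumptions, and real photon mappers trace bounces and usually use a kd-tree. All names are hypothetical.

```python
import math
import random

def build_photon_map(n_photons, light_power=1.0, seed=0):
    """Store photon hits as (x, y, power). Here the hits are simply scattered
    uniformly over a unit-square floor, each carrying an equal share of the
    light's power, in place of actually tracing photons through a scene."""
    rng = random.Random(seed)
    share = light_power / n_photons
    return [(rng.random(), rng.random(), share) for _ in range(n_photons)]

def estimate_irradiance(photon_map, x, y, k=50):
    """Density estimate at (x, y): total power of the k nearest stored photons
    divided by the area of the smallest disc around (x, y) that holds them."""
    by_distance = sorted(
        photon_map, key=lambda p: (p[0] - x) ** 2 + (p[1] - y) ** 2
    )
    nearest = by_distance[:k]
    r2 = (nearest[-1][0] - x) ** 2 + (nearest[-1][1] - y) ** 2
    return sum(p[2] for p in nearest) / (math.pi * r2)

photon_map = build_photon_map(20000)
# The map is built once; any number of eye rays can query it afterwards.
print(estimate_irradiance(photon_map, 0.5, 0.5))
print(estimate_irradiance(photon_map, 0.3, 0.7))
```

With one unit of power spread over one unit of area, the estimates should hover around 1, with noise that shrinks as k grows and blur that grows with it.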
The old 1994 picture is of a cubic room with mirrors on the near and far walls. The 'bubbles' are reflective spheres. A beam of light comes out of the left wall, hits a prism and forms a spectrum on the right wall. The depth of field is very shallow so only objects exactly on the focal plane are in focus. The black fuzzy blob is the camera aperture, out of focus, being reflected in the far mirror. There is an error in the image. The corners of the room are bright and should not be. This is due to a poorly chosen diffuse scattering model. The current project is an almost identical setup, with an aperture 1/4 as big. I have done about 1 billion photons on my 3 computers, and the new image looks much cleaner. I expect it will take about a trillion photons to make a really smooth image.
Inside every complex program is a simple solution trying to get out.
This thread is already full of very knowledgeable people expounding at great length as to why Java is not slower (and in fact, is often faster than "native code"). Therefore, I will not waste my time writing an in-depth response to those who would argue that 1 + 1 in Java is somehow slower than 1 + 1 in C/C++. This post does that quite well. What that comment does not do, however, is explain why some Java programs do, in fact, feel slower than native programs.
I'll simplify this as much as I can without diverging from the technical truth too much. Most complaints that Java is slow come from two sources. First, you must wait for the virtual machine to load, and depending on the libraries used by the program, that can be costly in terms of IO, which is always very slow. Second, Java's GUI toolkits are fairly heavyweight: they do a lot, and many programs take advantage of much of the functionality they provide. I won't go into the details, but those inclined to find out why should read more about Swing and what the Java2D libraries offer. Because of all they do, many Java programs with GUIs feel a little sluggish. Of course, keep in mind that most software sits idle 99% of the time while the user decides what to do. So otherwise, Java code that is not bound by user response time is very fast.
One quick postscript: because the Java language is object oriented, complex software will do a great deal of memory allocation and garbage collection as objects come in and out of use. That, too, is very expensive. However, there is no reason that you have to use the Java programming language to code for the virtual machine. Case in point: Jasmin. In theory, you could write compilers that generate JVM bytecode from any language (and a former professor of mine is currently in the process of writing a book that explains precisely how to do that).
I will be combining the result in a graphical tool that lets me look at each submission before it gets summed into the result. This will at least prevent some pr0n picture from getting merged in. Also a few bad submissions or duplicates will not throw off the total results.
You can view individual photons with a Spinthariscope - that web page has a good description, and it's $25.
This is a very special case he has - HPC. It's one of the few that actually do benefit from hand-tuning optimizations in some places.
The asm optimization is optimizing algorithms. It usually is the last stage, when you still need to squeeze more speed after all the high-level optimizations. Not very often, but it still happens (see openssl use of asm for instance; heck, even ATL/WTL - that's a Windows C++ template class lib - uses a couple of asm lines).
So no, asm is not dying outside compiler writers
Second, Java's GUI toolkits are fairly heavy weight
This is probably why SWT came about (in part thanks to IBM).
The first application to use SWT, Eclipse, doesn't feel like a java application because it's using native widgets, which gives the GUI a very snappy response.
If the only strong reason you have avoided programming applications in Java is because of their slow GUI response, I suggest looking into SWT. =)
I really hate to pollute this wonderful discussion with actual facts, but this issue has actually been studied and numbers are available:
"An Empirical Comparison of Programming Languages
Lutz Prechelt, An Empirical Comparison of C, C++, Java, Perl, Python, Rexx, and Tcl.
80 implementations of the same set of requirements are compared for several properties, such as runtime, memory consumption, source text length, comment density, program structure, reliability, and the amount of time required for writing them. The results indicate that, for the given programming problem, which regards string manipulation and search in a dictionary, "scripting languages" (Perl, Python, Rexx, Tcl) are more productive than "conventional languages". In terms of run time and memory consumption, they often turn out better than Java and not much worse than C or C++. In general, the differences between languages tend to be smaller than the typical differences due to different programmers within the same language."
By the way, I'm going out on a limb here, but I have a feeling that someone who's been published in SIGGRAPH (THE graphics conference) is aware of the state of rendering algorithms in general, and of the existence of photon mapping in particular. Just a guess.
The quality of an estimator is not measured using just bias or variance. It is a combination of both, i.e. the mean squared error, MSE = bias^2 + variance (the RMS error is its square root), so if you can accept a small amount of bias in exchange for a large reduction in variance you are doing quite well. Add consistency (i.e. the bias of the estimator goes to zero as the sample size increases) to that and you are golden. This is what lies at the heart of Photon Mapping. The photon map based estimate of the scene radiance is a biased but consistent quantity and has a much smaller variance than a pure Monte Carlo path traced version. As for final gather, you can make yet another bias for variance tradeoff using irradiance caches (Greg Ward) and irradiance gradients. This can speed up a typical final gather routine by more than an order of magnitude. As for depth of field, he will have to simulate a finite aperture in any case, so I do not see how this is a disadvantage of photon mapping. The direct lighting from spherical sources is not a big deal; I do not see what complication you are referring to there. So in summary, the goal of an unbiased simulation is nice, but since what you are really after is an accurate image, for the same amount of computational effort photon mapping will give you better results.
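The decomposition is easy to check numerically. The sketch below has nothing to do with rendering: it estimates the mean of uniform(0, 1) with a noisy unbiased estimator and with a made-up biased one that pulls each estimate halfway toward a guess of 0.45. Both estimators and all names are assumptions chosen only to show the trade.

```python
import random
import statistics

def decompose(estimates, truth):
    """Return (bias, variance, mse) for a list of repeated estimates."""
    bias = statistics.fmean(estimates) - truth
    variance = statistics.pvariance(estimates)
    mse = statistics.fmean([(e - truth) ** 2 for e in estimates])
    return bias, variance, mse

rng = random.Random(0)
TRUTH = 0.5  # true mean of uniform(0, 1)

def sample_mean(n=4):
    """Unbiased but noisy: the mean of only n random samples."""
    return statistics.fmean([rng.random() for _ in range(n)])

unbiased = [sample_mean() for _ in range(20000)]
# Hypothetical biased estimator: shrink each estimate halfway toward 0.45.
biased = [0.5 * e + 0.5 * 0.45 for e in unbiased]

for name, estimates in (("unbiased", unbiased), ("biased", biased)):
    bias, variance, mse = decompose(estimates, TRUTH)
    print(f"{name}: bias={bias:+.4f} variance={variance:.4f} mse={mse:.4f}")
```

The biased estimator ends up with the lower mean squared error because its variance drops to a quarter while its squared bias stays small, which is the same kind of trade the comment attributes to photon mapping.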
Given that this was an exercise, I'm somewhat tempted to ask whether you aren't counting the JVM startup time on top of a very short problem instance.
C: 30 seconds, Java: 8 hours. No kidding. These were running on the same laboratory machines, the Java programs using the latest Sun JVM at the time. The exercise involved calculating some statistics from a simulation run for one virtual year. Most of the people doing the Java version had to leave it running overnight. Besides, JVM startup time is a performance issue, you can't just dismiss it out of hand.
That, or whether the exercise involved lots of dynamic creation of objects (in which case your program was by definition not doing the same thing as the Java implementations). The exercise you mention sort of sounds like that.
In the Java version, each event was an object. In C each event was a malloc'd struct. These are analogous in that they are the correct way to solve the problem in that particular language. The fact that object creation and destruction is a lengthy process is an important reason why Java is so slow. It's an object-oriented language, its raison d'etre is to create and destroy objects. If it can't do that fast enough then what's the point in it being able to do anything else with speed?
I know at this point the Java apologists will say "just use a pool of pre-created Objects, that'll speed it up". This misses the point in two ways: (1) Java was supposed to obviate the need to perform manual memory handling. Now I have to not only remember to de-allocate my objects when I am done with them, I have to write a Pool class within which to store my not-currently-in-use objects?! (2) I can do that in C, too, if I really want to, and it'll make the C version faster again.
I am well aware of the forte of Haskell and its different interpreter implementations. To the degree that I wouldn't be surprised if the Haskell implementation did better than your C implementation.
They didn't. I was told they took around 10 minutes to run, but I never actually saw them working. When I produced results from a simulation that ran for 1000 years (an overnight run for my C program), the professor was amazed. But yes, in my experience interpreted Haskell is very fast, which makes Java look even more foolish.
I've done this four times now: written a C equivalent of a Java program and found it at least one, often several orders of magnitude faster. I've yet to be shown a Java program that is significantly faster than its C equivalent.
This does not mean that Java should not be used as a programming language. The language features (especially those in version 1.5) are useful for working in a collaborative environment with mediocre programmers, and where the raw performance of the application is not critical (such as a GUI front-end to a server application). Just don't bullshit about it being faster than C, or even "fast" and expect me to believe it.
I'm willing to be proven wrong, but until someone can actually show me proof that it can be done, I'm not going to believe the hype, hand-waving and hot air.
What's worse about swapping a large* Java program out and back in is when the garbage collector next runs, usually very soon after it gains focus (i.e. when you want to use the program again), it touches near enough every damn page the program is using, checking for objects to cull. This causes the OS to swap the *whole* program back into memory and consequently a whole load of thrashing.
[* is there any other kind? ;-) ]
Ray tracing just means following the path of a ray for simulation. It's not just for graphics. However in graphics it normally refers to just following a) at least one "eye" ray from each pixel in the camera, b) reflective rays, and c) direct illumination rays to the lights.
This gives the simple raytraced scene that you are probably thinking of.
What are missing are diffuse rays: light from other objects that aren't lights and aren't in the direction of a mirror bounce. Gathering all the light is known as the global illumination problem.
There are lots of ways to handle this, including radiosity patches and Monte Carlo ray tracing.
It looks like this guy is doing some type of Monte Carlo ray tracing while properly taking into account wavelengths (i.e. not RGB but lambda) and many bounces instead of just direct or only X levels. He may also be tracing from the light out instead of the camera back.
I'm not sure if he's doing any stratified sampling or bundling of photons but it would help speed things up if he's not.
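For anyone unfamiliar with the term, here is a minimal one-dimensional sketch of stratified (jittered) sampling next to plain random sampling. It integrates x^2 over [0, 1) as a stand-in for whatever quantity a renderer would sample. It is not based on the actual program, and the function names are made up.

```python
import random

def plain_estimate(f, n, rng):
    """Average f at n independent uniform random points in [0, 1)."""
    return sum(f(rng.random()) for _ in range(n)) / n

def stratified_estimate(f, n, rng):
    """Split [0, 1) into n equal strata and take one jittered sample in each,
    so the samples cannot clump together or leave large gaps."""
    return sum(f((i + rng.random()) / n) for i in range(n)) / n

def square(x):
    return x * x  # exact integral over [0, 1) is 1/3

rng = random.Random(1)
print("plain:     ", plain_estimate(square, 100, rng))
print("stratified:", stratified_estimate(square, 100, rng))
```

For a smooth integrand like this one the stratified estimate is typically much closer to 1/3 than the plain one for the same number of samples. How much it would help in a full photon simulation depends on the scene.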
I have received about 900 emails today. I will be posting the program on http://www.cpjava.net in a day or two so everyone can download it. Note to self: don't post email address :-(