River Trail — Intel's Parallel JavaScript
mikejuk writes "Intel has just announced River Trail, an extension of JavaScript that brings parallel programming into the browser. The code looks like JavaScript and it works with HTML5, including Canvas and WebGL, so 2D and 3D graphics are easy. A demo video shows an in-browser simulation going from 3 to 45 fps and using all eight cores of the processor. This is the sort of performance needed if 3D in-browser games are going to be practical. You can download River Trail as a Firefox add-on and start coding now. Who needs native code?"
> the concept of assigning more than one core to a single thread
Computing doesn't work that way. At least not with any meaningful speed increase. That may decrease power usage, but you'll still need "parallel programming shit" to make proper use of parallel processing hardware.
People who want their Windows 8 tablet to have a real world battery life longer than two hours?
So, what - 4 or 5 people?
#DeleteChrome
Since JS is normally single-threaded, I'm guessing that the one-core scenario is spending more than half its time on things other than the simulation. Additional cores can be dedicated entirely to the simulation. Under those circumstances, 15x speedup isn't the least bit surprising.
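To put rough numbers on that guess, here is a minimal sketch. It assumes the non-simulation work is a fixed overhead that eats a constant fraction of one core no matter how many cores exist, which may well not hold in a real browser. The function name and the 0.5 overhead figure are illustrative assumptions, not anything measured from the demo.

```python
def simulation_speedup(cores: int, overhead: float) -> float:
    """Speedup of the simulation alone under a fixed-overhead model.

    Assumption (hypothetical model): `overhead` is the fraction of ONE core
    consumed by non-simulation work, and it stays the same size however
    many cores are available.
    """
    if not 0.0 <= overhead < 1.0:
        raise ValueError("overhead must be in [0, 1)")
    single_core_share = 1.0 - overhead      # core time the simulation gets on 1 core
    multi_core_share = cores - overhead     # core time it gets on `cores` cores
    return multi_core_share / single_core_share


if __name__ == "__main__":
    # Half of one core lost to other work, eight cores available.
    print(simulation_speedup(8, 0.5))
```

Under those assumptions, eight cores with half a core of overhead come out at 15x, matching the 3 to 45 fps in the summary. If the other work grows with the frame rate instead of staying fixed, the model no longer applies.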
> People who want their Windows 8 tablet to have a real world battery life longer than two hours?
> So, what - 4 or 5 people?
I think you overestimate the sales potential for Windows 8 tablets.
Not entirely. One of the features of Sun's cancelled Rock CPU was something they called Thread Scout. The idea was to run one core ahead of another, skipping most computation, to pre-fault memory addresses. This ensured that data was in cache when it was needed. There was also an idea to use multiple cores to extend the superscalar concept, so when you encountered a branch one core took each potential path and you discarded the wrong one. A lot of GPUs used to do this, but no general purpose CPUs (that I'm aware of, although ARM and Itanium do something similar with their predicated instructions).
You're right that you won't get the full benefit of writing proper concurrent code, but you will get some.
I am TheRaven on Soylent News
It's not totally unbelievable, when you consider the fact that it's using WebGL. Doubling the speed at which the CPU prepares data for the GPU to render can more than double the overall throughput.
> Not entirely. One of the features of Sun's cancelled Rock CPU was something they called Thread Scout. The idea was to run one core ahead of another, skipping most computation, to pre-fault memory addresses.
That was done back in the day with the original 68000. Some machines ran two of them in tandem, with one processor slightly ahead of the other. If the leading one hit a bus fault, the second 68000 was used to recover, as the original 68K could not recover normally from a bus fault. Obviously this was done for reliability rather than performance, but it's amazingly similar.
The oh-ten could recover from bus faults, and the 020 had a full-scale (although external) MMU option, so the technique ceased to be used.
That means the animated ads can now suck up all of my CPU, rather than just one core's worth. I can't wait!
That is all.
Instead of getting a $50 graphics card and playing Doom 3 on it, we now need an 8-core CPU to play JavaScript games in the browser? Is that the bright future we can look forward to with ChromeOS and "the browser is the OS"?
http://www.mueller-public.de - My site http://www.anr-institute.com/ - Advanced Natural Research Institute
Why should an application decide the best way to split a load over multiple cpu cores? How does it know what else is going on in the OS to balance this load? Shouldn't the OS handle this behind the scenes?
Now all we need is a "sleep" function.
Visit http://ringbreak.dnd.utwente.nl/~mrjb/growingbettersoftware to download your free copy of the book
No.
If half the work is unparallelizable then the max theoretical speedup is 2x.
This is a simple application of Amdahl's law:
speedup = 1 / ( (1-P) + (P/S) )
where P is the fraction of the workload that is parallelizable and S is the number of cores. With P = 0.5:
speedup = 1 / ( (1-0.5) + (0.5/S) )
As S -> infinity, speedup -> 1 / 0.5 = 2x
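The formula above is easy to check numerically. This is only a sketch: the function name is made up for illustration, and P and S are the same symbols as in the formula.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Amdahl's law: speedup = 1 / ((1 - P) + P / S).

    p: fraction of the workload that is parallelizable (0..1)
    s: number of cores the parallel part is spread across
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


if __name__ == "__main__":
    # With half the work serial, adding cores only creeps toward 2x.
    for cores in (1, 2, 4, 8, 64, 1_000_000):
        print(cores, round(amdahl_speedup(0.5, cores), 4))
```

With P = 0.5 the result stays below 2x however large S gets, which is the limit stated above.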
The likely reason the speedup appears superlinear here is that there are actually two speedups.
1.) Speedup from parallelizing on multiple threads. From looking at the usage graphs, this is probably about 4x.
2.) Speedup from vectorizing to use the processor's AVX extensions. This could be another 4x.
Total speedup: 16x.
A 16x speedup is totally believable for vectorizing and parallelizing a simple scientific simulation like the one shown in the video.
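As a quick sanity check on the arithmetic, the sketch below multiplies the two estimated factors and compares the result with the frame rates quoted in the summary (3 fps to 45 fps). The 4x figures are the guesses from this comment, not measurements, and the variable names are made up for illustration.

```python
# Estimated factors from the comment above (guesses, not measurements).
THREAD_SPEEDUP = 4.0   # from spreading work across threads
VECTOR_SPEEDUP = 4.0   # from AVX vectorization

# Frame rates quoted in the story summary.
BASELINE_FPS = 3.0
OBSERVED_FPS = 45.0

# Independent speedups on the same hot loop multiply.
predicted_speedup = THREAD_SPEEDUP * VECTOR_SPEEDUP
observed_speedup = OBSERVED_FPS / BASELINE_FPS
predicted_fps = BASELINE_FPS * predicted_speedup

if __name__ == "__main__":
    print("predicted speedup:", predicted_speedup)
    print("observed speedup:", observed_speedup)
    print("predicted fps:", predicted_fps)
```

The estimate lands at 16x (48 fps) against an observed 15x (45 fps), so the two-factor explanation is at least in the right ballpark.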
> Javascript is in IE8 on win xp.
Uh.... This is the same IE8 that doesn't have a JIT, right? Unlike every single browser actually shipping now?
Here's a relevant graph: http://ie.microsoft.com/testdrive/benchmarks/sunspider/default.html
It's a bit out of date, since all browsers have gotten faster since then, but it shows IE8 being about 18x slower than any modern browser on this particular benchmark. And this is a benchmark that hammers a lot on the VM (dates, regular expressions, etc), not the language itself.
On code that runs for longer than a few ms and is actually compute-intensive the difference between IE8 and any modern browser is even more pronounced.
Heck, at this point you can compile C code to JavaScript and then run it in some browsers and have it be only about 5x slower than the original C code. That's with (typed) arrays representing the C stack and heap and so forth...
I'd love to see your benchmark code, by the way. Or for you to rerun the benchmark in something that actually tries to run JavaScript quickly, as opposed to IE8.