ATI's Stream Computing on the Way
SQLGuru writes to tell us that ATI has announced plans to release a new graphics product that could provide a shake-up for high performance computing. From the article: "ATI has invited reporters to a Sept. 29 event in San Francisco at which it will reveal 'a new class of processing known as Stream Computing.' The company has refused to divulge much more about the event other than the vague 'stream computing' reference. The Register, however, has learned that a product called FireStream will likely be the star of the show. FireStream product marks ATI's most concerted effort to date in the world of GPGPUs or general purpose graphics processor units. Ignore the acronym hell for a moment because this gear is simple to understand. GPGPU backers just want to take graphics chips from the likes of ATI and Nvidia and tweak them to handle software that normally runs on mainstream server and desktop processors."
So sick of x86. Look at all the cool stuff the graphics card makers are coming up with. Intel needs to buy NVidia to get real innovation done. I'm sure they have cool stuff cooking up, though. Let's get engineers going and let's get innovating!
Not "Steam Computing"...
Perhaps they're following Valve's lead and are introducing 'episodic' computing.
For most uses you don't need fast 3D graphics anyway. You just need the features. Or want them. Intel graphics will be enough to give Linux users their cutesy Xgl desktop with shadows and warping and blah blah blah, and that will be enough to sell a bunch of Intel cards solely because they have open source drivers. In fact, my goal in future servers will be to get Intel integrated graphics so that I can have the open source drivers.
On a desktop I don't care so much about whether drivers are open source or not. On a server, I care very much. I can use another desktop or desktop OS and get the same functionality, but I might not be able to conveniently jump over to another server.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
The point of a thing like this is to ship data in bulk to the VRAM attached to the GPU, have the GPU grind away on that data using the large memory bandwidth available on the adapter, and then pull the results back off the adapter once it's finished. Also note that PCIe is much, much better than any prior PCI/AGP bus for feeding this type of thing.
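A rough way to see when that upload/compute/download pattern pays off is a toy cost model. This is only a sketch: the function names and every number fed into it are made-up assumptions for illustration, not measurements of any real card or bus.

```python
def offload_time(n_bytes, flops, bus_bytes_per_s, gpu_flops_per_s):
    """Seconds to upload the data, compute on the GPU, and download the results.

    Assumes the result is the same size as the input and that transfers
    and compute do not overlap (both are simplifying assumptions).
    """
    upload = n_bytes / bus_bytes_per_s
    compute = flops / gpu_flops_per_s
    download = n_bytes / bus_bytes_per_s
    return upload + compute + download


def cpu_time(flops, cpu_flops_per_s):
    """Seconds to do the same work on the CPU, with no bus transfer."""
    return flops / cpu_flops_per_s


def worth_offloading(n_bytes, flops, bus_bytes_per_s, gpu_flops_per_s, cpu_flops_per_s):
    """True when the round trip through the adapter beats staying on the CPU."""
    gpu_total = offload_time(n_bytes, flops, bus_bytes_per_s, gpu_flops_per_s)
    return gpu_total < cpu_time(flops, cpu_flops_per_s)
```

With hypothetical figures like a 4 GB/s bus, a GPU ten times faster than the CPU, and 4 GB of data, the offload only wins when there is a lot of arithmetic per byte shipped; a light pass over the same data is dominated by the two bus transfers.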
Nah, think about it a bit more. AMD buys ATI; AMD has HyperTransport; ATI has chips capable of running 48 specialised threads *alongside* your normal CPU. Admittedly that's initially over PCIe (which isn't that shabby), but eventually they *have* to put it on HyperTransport, with direct access to RAM, yada yada yada. I can see database servers LOVING this, and scientific visualisation software, 3D renderfarms, etc.
(Full disclosure: I work for a major manufacturer of 3-D accelerators.)
There's lots of good sites that talk about GPGPU. Wikipedia has an okay article on the subject as well, and NVIDIA has a primer (PDF) on the subject. But the summary of this article is a bit overly broad.
GPGPU isn't about moving arbitrary processing to the GPU; rather, it's about moving specific, computationally expensive work to the massively parallel GPU.
Effectively, the core idea of GPGPU solutions is that you compute a whole grid of solutions, 256x256 or some other granularity, entirely in one pass.
NVIDIA has several examples on their website, specifically the GPGPU Disease and GPGPU Fluid samples. The Mandelbrot computation they have there could also be considered an example. (More samples here).
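To illustrate the "whole grid in one pass" idea with the Mandelbrot case, here is a minimal CPU-side sketch in plain Python. It is not NVIDIA's sample code; `mandelbrot_kernel`, `run_pass`, and the coordinate ranges are names and values assumed for illustration. The point is that each cell's result depends only on its own coordinates, so a GPU could evaluate all of them at once, whereas this sketch just loops.

```python
def mandelbrot_kernel(cx, cy, max_iter=50):
    """Per-cell kernel: iteration count before escape, or max_iter if none.

    It reads nothing but its own coordinates, which is what makes the
    computation trivially parallel.
    """
    zx = zy = 0.0
    for i in range(max_iter):
        zx, zy = zx * zx - zy * zy + cx, 2.0 * zx * zy + cy
        if zx * zx + zy * zy > 4.0:
            return i
    return max_iter


def run_pass(width, height, kernel):
    """Apply `kernel` to every cell of a width x height grid (both >= 2).

    A GPU would run these calls in parallel; here they run one at a time.
    The mapped region [-2, 1] x [-1.5, 1.5] is an arbitrary choice.
    """
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            cx = -2.0 + 3.0 * x / (width - 1)
            cy = -1.5 + 3.0 * y / (height - 1)
            row.append(kernel(cx, cy))
        out.append(row)
    return out


grid = run_pass(16, 16, mandelbrot_kernel)
```

Swapping in a different per-cell function (a fluid or reaction-diffusion step, say) leaves `run_pass` unchanged, which is roughly the shape of the pattern described above.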
GPGPU has already been utilized to perform very fast (comparable to the CPU) FFTs. In an article in GPU Gems 2 (a very good book if you're interested in doing GPGPU work), they indicate that a 1.8x speedup can be had over performing FFTs on the CPU. I've heard that there are now significantly faster implementations as well.
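For a sense of why FFTs suit this kind of hardware, here is a textbook recursive radix-2 FFT in plain Python. It is not the GPU Gems 2 implementation, just a sketch that assumes the input length is a power of two. Within each level, every butterfly (the loop over `k`) is independent of the others, and that independence is what a parallel implementation exploits.

```python
import cmath


def fft(x):
    """Radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of the even-indexed samples
    odd = fft(x[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Each butterfly touches only even[k] and odd[k], so all n/2
        # butterflies at this level could run at the same time.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```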
I currently have no clever signature witticism to add here.
On early PCs, the VGA interface gave the CPU a direct window into the video memory. Your CPU was your GPU as well - the only thing the graphics card did was convert the raster of bytes in a certain location to a signal recognizable by the monitor. As such, the hardware wasn't optimized for the kinds of operations that would become typical in the games that followed. So video card manufacturers began a mitigation strategy which involved moving the computationally complex parts of rendering off to the video card, where the onboard processor could render much more quickly and more efficiently than the CPU itself. The drawback of this approach was that to take full advantage of your video hardware you had to run a certain buggy, unstable, and rather insecure operating system. Typically, the drivers were written only for Windows. Reinstalling Windows became a semi-annual ritual for serious gamers.
But if ATI is successful in standardizing the GPGPU architecture, we may be able to take advantage of the video hardware on platforms other than Windows. While Linux has typically suffered a dearth of FPS games because of the lack of good hardware rendering support in the past, this has the potential to make Linux the next serious gamer's platform.
Which is a good thing, IMHO.
The society for a thought-free internet welcomes you.
Pond Computing, Lake Computing or Ocean Computing?
They lack gravitational potential energy. Yeah, you can try to play around with extracting energy from the temperature gradients of a lake or ocean (ponds don't have any worth worrying about), but it's just easier to stick a turbine in a stream to make the computer go. And unlike my heavy-piston-on-a-rope-floating-in-a-leaky-sand-filled-cylinder engines, the Sun carries the water back up to the upper reservoir for you.
KFG
It sounds to me like it's not entirely general purpose, just a recognition of the fact that the hardware optimizations graphics has benefited from can easily be put to work in some other specific applications.
So there are, for example, some specific common database operations that could be significantly more efficient with some optimized hardware. It's just that there's not necessarily a big enough market to design, test, produce, and sell cards designed just for that and make a profit. So instead, you just sort of piggyback all of that on top of all the existing graphics card technology, and you get most of the benefit for a fraction of the cost.
Basically, no one is going to start a company that produces "database" cards and stay in business. But if ATI can squeeze that functionality into their next generation of graphics cards, they'll probably sell a few more units of something they were going to produce anyways, and the database admins of the world might be a little happier.
One time I threw a brick at a duck.
Oh really? Then perhaps you'd care to clue the rest of us in? I see very little impact from the x86's VLE instruction set. Only if you make assumptions about the underlying core based on the instruction set (which would not be a wise thing to do) could I see the VLE as an issue.
Javascript + Nintendo DSi = DSiCade
Stream processing is not new. There have been academic projects working on massively parallel systems for decades. One particular project I know of is UCSC's Kestrel processor, a 512-way 8-bit stream processor. In the late 90s this thing blew high-end desktops out of the water for linear processing tasks like image convolution, and at a fraction of the power.
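Image convolution is a handy example of why such tasks stream well: every output pixel is computed the same way from a small window of input pixels, with no dependence on other outputs. A minimal plain-Python sketch follows. It has nothing to do with how Kestrel was actually programmed, and `convolve2d` with its "valid" output size is an assumption made for illustration.

```python
def convolve2d(image, kernel):
    """2D convolution of a list-of-lists image with a small kernel.

    Only positions where the kernel fits entirely inside the image are
    produced, so the output is smaller than the input.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            # Each (y, x) is independent of every other output pixel, so a
            # stream processor could hand each one to a separate lane.
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    # Flipped kernel indices make this a true convolution
                    # rather than a cross-correlation.
                    acc += image[y + j][x + i] * kernel[kh - 1 - j][kw - 1 - i]
            out[y][x] = acc
    return out
```

For example, a 3x3 kernel of all 1/9 values acts as a box blur, and a kernel that is zero except for a 1 in the centre returns the image interior unchanged.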
The HyperTransport protocol only calls for 8GB/s across the bus (I can't recall if that's one direction, or if that's the bidirectional speed). Which means that with SLI (4GB/s bidirectional on each PCIe x16 slot) you're already using all the HyperTransport bandwidth. So I'm wondering exactly how much more power people expect to get from these cards. They have half the bandwidth of your CPU, so if your NB is your bottleneck, it's gonna throttle performance even more. The nice thing about AMD CPUs over Intel (flame wars, START!) is that the memory controller is integrated onto the CPU die, which means that the AMD CPU doesn't have to use the FSB (NB) to get to RAM, which means you could use the entire 8GB/s of HT for these purposes. So, if I want to spend $200 on a new card, how much extra performance would I get? Come on ATI, I'm talking a straight comparison.
Example: a $200 card will get me the equivalent of a dual-core 2.5GHz 64-bit CPU.
Will the CPU be able to access the superfast DDR3/4 of the video card? Or is that reserved for the video card's calculations, whether it be for gaming or processing?
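The bandwidth arithmetic above can be written out as a quick sketch. The 8GB/s and 4GB/s figures are simply the ones quoted in this post, taken as given rather than checked against any spec, and the helper names are made up for illustration.

```python
def bus_utilization(link_gb_per_s, consumers_gb_per_s):
    """Fraction of a link's bandwidth demanded by the listed consumers."""
    return sum(consumers_gb_per_s) / link_gb_per_s


def transfer_seconds(gigabytes, link_gb_per_s):
    """Seconds to move a buffer across a link at its full quoted rate."""
    return gigabytes / link_gb_per_s


HT_GB_PER_S = 8.0        # HyperTransport figure quoted in the post (assumed)
PCIE_X16_GB_PER_S = 4.0  # per-slot figure quoted in the post (assumed)

# Two cards in SLI, each able to pull the full quoted slot bandwidth.
sli = bus_utilization(HT_GB_PER_S, [PCIE_X16_GB_PER_S, PCIE_X16_GB_PER_S])

# One card on its own.
single = bus_utilization(HT_GB_PER_S, [PCIE_X16_GB_PER_S])
```

On those quoted figures, two cards account for the whole link and one card for half of it, which is the "half the bandwidth of your CPU" point.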
For the love of all things good, please make this thing quiet!