Using GPUs For General-Purpose Computing
Paul Tinsley writes "After seeing the press releases from both Nvidia and ATI announcing their next generation video card offerings, it got me thinking about what else could be done with that raw processing power. These new cards weigh in with transistor counts of 220 and 160 million (respectively) with the P4 EE core at a count of 29 million. What could my video card be doing for me while I am not playing the latest 3D games? A quick search brought me to some preliminary work done at the University of Washington with a GeForce4 Ti 4600 pitted against a 1.5GHz P4. My favorite excerpt from the paper:
'For a 1500x1500 matrix, the GPU outperforms the CPU by a factor of 3.2.' A PDF of the paper is available here."
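For a feel of the workload behind that excerpt, here is a minimal sketch in Python. It is not the paper's code, and the function names are made up for illustration. The point is that every output cell of a matrix product is an independent dot product, which is the shape of work that per-pixel hardware is built for.

```python
def output_element(a, b, row, col):
    """Dot product for one output cell; it depends on no other output cell."""
    return sum(a[row][k] * b[k][col] for k in range(len(b)))


def matmul(a, b):
    """Plain matrix product built from independent per-cell computations."""
    rows, cols = len(a), len(b[0])
    return [[output_element(a, b, r, c) for c in range(cols)]
            for r in range(rows)]


def multiply_adds(n):
    """Multiply-add count for an n x n by n x n product (naive algorithm)."""
    return n ** 3


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul(a, b))         # [[19, 22], [43, 50]]
print(multiply_adds(1500))  # 3375000000
```

So the 1500x1500 case is about 3.4 billion multiply-adds with the naive algorithm, and any of the 2.25 million output cells can be computed in any order.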
Now I finally have a use for the 20 Voodoo 2 cards I have in a box in the basement. Now I can have my very own supercomputer. I just need some six pci slot motherboards.... Instant cluster!
Humor from a Genetically Molested Mind
Intel's been telling me for years that I need faster hardware from THEM to get the job done...
You mean........ they were lying?!?!?
CRAP!
/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i
http://developers.slashdot.org/article.pl?sid=03/12/21/169200&mode=thread&tid=152&tid=185
Here's a HTML version of the PDF, thanks to Google.
At my workplace, I'm looking into using GPUs to do video analysis. Things like cut-scene detection, generating multi-resolution versions of a video frame, applying video effects and other proprietary technologies that were previously done on the CPU. The combination of pixel shaders and floating-point buffers really makes the GPU a Super-SIMD machine if you know how to exploit it.
www.rexguo.com - Technologist + Designer
GPUs are very fast ... at performing vector and matrix calculations. This is the whole point. If general computing CPUs were capable of doing vector or matrix calcs very efficiently, we would probably not have GPUs.
The Pentium 4 EE actually has 178 million transistors, which puts it in between ATI's and NVIDIA's latest.
In all of this, keep in mind that there's computing and there's computing...the kind of computing power in a GPU is excellent for doing the same numeric computation to every element of a large vector or matrix, not so much for branchy decisiony type things like walking a binary tree. You wouldn't want to run a database on something structured like a GPU (or an old vector-processing Cray), but something like a simulation of weather or molecular modeling could be perfect for it.
The similarities of a GPU to a vector processing system bring up an interesting possibility...could Fortran see a renaissance for writing shader programs?
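The two kinds of computing contrasted above can be sketched side by side. This is plain Python for illustration only, with made-up function names; nothing here runs on a GPU. The first function does the same arithmetic on every element, while the second takes a data-dependent branch at every step.

```python
def scale_and_offset(values, gain, bias):
    """Data-parallel: identical arithmetic per element, no branching on data."""
    return [gain * v + bias for v in values]


def tree_contains(node, key):
    """Branchy: each step depends on a comparison against the previous node.

    A node is a tuple (value, left, right), or None for an empty subtree.
    """
    while node is not None:
        value, left, right = node
        if key == value:
            return True
        node = left if key < value else right
    return False


tree = (8, (3, None, None), (10, None, None))
print(scale_and_offset([1.0, 2.0, 3.0], 2.0, 1.0))  # [3.0, 5.0, 7.0]
print(tree_contains(tree, 10))                      # True
print(tree_contains(tree, 7))                       # False
```

The elements of the first loop could be handed to as many arithmetic units as you have; the tree walk cannot start step two until step one has decided which way to go.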
General-purpose computation using graphics hardware has been a significant topic of study for the last few years. Pointers to a lot of papers and discussion on the subject are available at: www.gpgpu.org
No, it's like using your pop-up camper for storage space when you're using it on holidays.
Two words: virtual pr0n
Show me on the doll where his noodly appendage touched you.
Does anybody know of pointers to papers/research pertaining to using GPUs to perform digital signal processing for, say, real-time audio? Replies would be much appreciated.
Here is a link at Adobe where you can turn any PDF into HTML.
There is a course that has been offered at Caltech since last summer on using GPUs for numerical work. The course page is here.
:wq
"Utilize the sheer computing power of your video card!"
New market blitz, hmmmm.
SETI ports their code, and within five days their average completed work units increase 1000 fold. 13 hours later, they have evidence of intelligent life at 30000 locations within one degree.
Microsoft gets the hint, and comes out with a brilliant plan to utilize GPUs to speed up their OS and add bells and whistles to their UI.
And, once again, Apple and Quartz Extreme is ignored.
Before you get excited, just remember how asymmetric the AGP bus is. Those GPUs will be of much better use when we get them as 64-bit PCI cards.
The whole point of graphic cards is that they have a dedicated purpose. Using the cards for anything that is general purpose is like using a motorcycle to tow a pop-up camper.
What's relevant is that to the processor on a graphics card, its dedicated purpose is simply a bunch of logic. There's no dedicated "this must be used for pixels only, all else is waste" logic inherent in the system. there are MANY purposes for which the same/similar logic that applies in generating 3D imagery can be used, and that seems the purpose of this paper. Run THOSE type operations on the GPU. Some things they won't be able to do well no doubt - but those they can, they can do extremely well.
What's interesting about new video cards is their memory capacity, 128 or 256 MB, and that this memory is accessible on some new cards at 900 MHz over a 256-bit data path (which is a lot faster than a CPU with DDR400 installed).
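The comparison can be put into rough numbers. The sketch below assumes that "900 MHz" means 900 million transfers per second on the 256-bit path, and that the DDR400 system is a single 64-bit channel at 400 million transfers per second; both are assumptions, and these are theoretical peaks, not measured throughput.

```python
def peak_bandwidth_gb_s(transfers_per_second, bus_width_bits):
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# Assumed figures, see the note above.
gpu_gb_s = peak_bandwidth_gb_s(900e6, 256)  # about 28.8 GB/s
ddr_gb_s = peak_bandwidth_gb_s(400e6, 64)   # about 3.2 GB/s

print(round(gpu_gb_s, 1), round(ddr_gb_s, 1), round(gpu_gb_s / ddr_gb_s, 1))
```

Under those assumptions the card's local memory has roughly nine times the peak bandwidth of single-channel DDR400, or about four and a half times that of a dual-channel setup.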
All that processing power, and the latest games still run at about 22 frames per second, if that.
The CPU can do six billion instructions a second, the GPU can do 18 billion, and every last cycle is being used to stuff a 40MB texture into memory faster. What a waste. Yeah, the walls are even more green and slimy. Whoop-de-fucking-do.
Wouldn't it be great if all that processing power could be used for something other than yet-another-graphics-demo?
Like, maybe some new and innovative gameplay?
Business isn't willing to pay for products, innovation and careers, so we get brands, mortgage commercials and layoffs.
At my work we do audio stuff. It would be really neat if I could do some of the more complicated audio analysis (FFT etc.) that requires lots of vector math using the video card's GPU. There is probably even some way you could sync the timing for multimedia stuff.
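To show why an FFT counts as "lots of vector math", here is a textbook recursive radix-2 FFT in plain Python. It is a sketch for illustration, not something a shader of this era could run as written. Within each stage, every butterfly (one iteration of the loop) is independent of the others, which is the part that maps onto parallel hardware.

```python
import cmath
import math


def fft(samples):
    """Radix-2 decimation-in-time FFT; length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Each butterfly reads two inputs and writes two outputs,
        # independently of every other k in this stage.
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


# One cycle of a cosine over 8 samples: the energy lands in bins 1 and 7.
tone = [math.cos(2 * math.pi * t / 8) for t in range(8)]
magnitudes = [round(abs(x), 6) for x in fft(tone)]
print(magnitudes)
```

For a length-n signal there are log2(n) stages of n/2 butterflies each, so an audio block of 1024 samples is ten passes of 512 identical, independent operations.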
I know nothing about CPU design though
to all our compatibility woes. I keep hearing about how much faster G5s and Alphas are than x86s, but it doesn't really matter if it won't run the apps I want. Now that processors are so cheap, why not just throw an x86 in for compatibility and then start over with a better design? Kinda like what the PS2 does so it can play PS1 games (I think).
Hi! I make Firefox Plug-ins. Check 'em out @ https://addons.mozilla.org/en-US/firefox/addon/youtube-mp3-podcaster/
Creating a way to use the specialized GPUs for vector processing that is not graphics related is ingenious. Like a lot of great ideas, it is sooo obvious AFTER you see someone else do it.
Don't miss the point that this is not intended for general purpose computing. Don't port OoO to the graphics chip.
Where it is huge is in signal processing. FPGAs have begun replacing even the G4s in this area recently because of the huge gains in speed vs. power consumption an FPGA affords. However, FPGAs are not bought and used as is, and end up costing a significant amount (of development time/money) to become useful. Being able to use these commodity GPUs for vector processing creates a very desirable price/processing power/power consumption option. If I were nVIDIA or ATI, I would be shoveling these guys money to continue their work.
I am living proof of the Peter Principle
If you have a matrix solver, there is no telling what you can do. And as I remember, these papers show that the GPU is faster than the CPU at the same matrix calculations.
# Linear Algebra Operators for GPU Implementation of Numerical Algorithms
Jens Krüger, Rüdiger Westermann
# Sparse Matrix Solvers on the GPU: Conjugate Gradients and Multigrid
Jeff Bolz, Ian Farmer, Eitan Grinspun, Peter Schröder
# Nonlinear Optimization Framework for Image-Based Modeling on Programmable Graphics Hardware
Karl E. Hillesland, Sergey Molinov, Radek Grzeszczuk
http://www.gpgpu.org/ is a great resource for general purpose graphics processor usage.
some one else posted this...
www.gpgpu.org
Website on this topic (Score:0)
by Anonymous Coward on Sunday May 09, @01:57AM (#9098550)
General-purpose computation using graphics hardware has been a significant topic of study for the last few years. Pointers to a lot of papers and discussion on the subject are available at: www.gpgpu.org
Apple's Newton had no CPU, only a GPU that was more than adequate.
Ideas like these are good in general. I'd like to see the industry move away from the CPU-as-chief status quo. Amigas were years ahead of their time in large part because the emphasis wasn't as much on central processing. The CPU did only what it was supposed to do -- hand out instructions to the gfx and audio subsystems.
Hardly using a "motorcycle to tow a pop-up camper." If anything, the conventional wisdom is, "when all you have is a hammer, everything looks like a nail."
BrookGPU
from the BrookGPU website...
As the programmability and performance of modern GPUs continues to increase, many researchers are looking to graphics hardware to solve problems previously performed on general purpose CPUs. In many cases, performing general purpose computation on graphics hardware can provide a significant advantage over implementations on traditional CPUs. However, if GPUs are to become a powerful processing resource, it is important to establish the correct abstraction of the hardware; this will encourage efficient application design as well as an optimizable interface for hardware designers.
From what I understand, this project is aimed at making an abstraction layer for GPU hardware, so that writing code to run on it is easier and standardised.
Many of the problems stated in using a GPU for non-graphics tasks would be implicitly solved if the GPU and CPU shared memory. While this would slightly slow down the GPU's memory access, in 3 years, I don't think that would be an issue. Especially compared to the benefits of having only one memory pool.
...Several indies and companies figure out how to use powerful GPUs in an efficient manner that would benefit everyone who uses computers on a daily basis, improving the usefulness of the computer and making it the best thing in the world again. Then some greedy bastard comes along flashing a patent granted by the U.S. Patent Office, and we are all screwed...
;)
Ohh well the idea was good while it lasted.
This space is not for rent.
The whole point of graphic cards is that they have a dedicated purpose. Using the cards for anything that is general purpose is like using a motorcycle to tow a pop-up camper.
Or using the Intel 8086 drive controller as a general purpose CPU?
a beowulf cluster of them.
seriously, we have a 16 node beowulf cluster and each node has an unnecessarily good graphics card in them. a lot of the calculations are matrix-based e.g. several variables each 1xthousands (1D) or hundredsxhundreds (2D).
how feasible and worthwhile do you think it would be to tap into the extra processing power?
...will someone finally port john the ripper to a new video card's graphical pipeline? :)
Anybody can see that this is all coming together someday. What is needed is a way to change the circuitry to approach whatever n-bit problem you need solved. Graphics is around 80h bit. Sound might be sixteen bit.
The future should be more elegant and flexible. Drop your precision and instantly gain speed. We'll wonder why we dealt with graphics drivers and other such complications.
-I am an elective eunuch.
There is however one thing to keep in mind. Presently our GPU's may have the headroom to play with, but with Apple's Quartz, and Microsoft's Longhorn, let alone what's coming with X. That headroom may disappear, and our video cards will have to go back to being video cards.
I'm curious how GPUs stack up against the Altivec engine in G4/G5s.
I thought this looked familiar:
http://developers.slashdot.org/developers/03/12/21/169200.shtml?tid=152&tid=185
At least, I would imagine most of the comments would be the same or similar....
Apple, innovative as always, is already making headway. Not in the way that paper describes, per se, but in other ways.
They, of course, designed the first OS that takes advantage of the user's GPU in situations other than CAD and games; in situations for general purpose computing. The OS uses the GPU to render the UI; obvious sounding at first glance, but revolutionary in practice.
Small steps though they may be, Apple is, as seemingly always is the case, ahead of the game.
Using GPUs For General-Purpose Computing
I'm glad that finally they started to use the General-Purpose Unit. What took them so long?
Sincerely,
Pan Tarhei Hosé, PhD.
"Homo sum et cogito ergo odi profanum vulgus et libido."
In Soviet Russia CPU outperforms GPU by factor 3.2x.
;-)
Remember the co-processors? Well, actually I don't (I'm a tad too young). But I know about them.
Maybe it's time to start making co-processing add-on cards for advanced operations such as matrix mults and other operations that can be done in parallel on a low level. Add to that a couple of hundred megs of RAM and you have a neat little helper when raytracing etc. You could easily emulate the cards if you didn't have them (or needed them). The branchy nature of the program itself would not affect the performance of the co-processor since it should only be used for calculations.
I for one would like to see this.
Dude, you obviously have never tried to sleep in a motorcycle.
KFG
Do any of the video chip manufacturers make free and complete documentation available for their GPUs? Everything that I have read in the past has said that they are encumbered with NDAs and claims of trade secrets. I'd prefer not to waste my time dealing with companies that treat their customers as potential enemies.
Mea navis aericumbens anguillis abundat
Some dude wrote Frogger almost entirely in pixel shaders. http://www.beyond3d.com/articles/shadercomp/results/ (2nd from the bottom).
Forget thrust, drag, lift and weight. Airplanes fly because of money.
... but hopefully the 'overrated' mods won't act as double negatives....
With all the talk about how the GPU's are so great at Matrix calculations, the question should not be "What can my GPU do when it's otherwise 'Idle'?", but "What is the Matrix?"
-Rusty
You never know...
Perhaps offloading the CPU to the GPU is the wrong way to look at things? With the apparently imminent arrival of commodity (low power) multi-CPU chips, maybe we should be considering what we need to add to perform graphics more efficiently (ala MMX et al)?
While it's true that general purpose hardware will never perform as well as or as efficiently as a design specifically targeted to the task (or at least it better not), it is equally true that eventually general purpose/commodity hardware will achieve a price-performance point where it is more than "good enough" for the majority.
thought you said you were out of here? Liar...
sigh.. no woman at all
I wonder what SETI could do with the extra cycles, running in parallel with the CPU. Is it possible to get 2x or 3x the data crunching for SETI clients?
I think most dinks would be decimated by a .22 pistol blast. A .22 is nothing to scoff at really.
From a design standpoint, I can imagine a GPU that donates its power to the CPU would be a nightmare. It violates the fundamental tenet that everything should do one thing and do it well. OTOH, that tenet focuses on simplicity and maintainability over performance. Is such a tradeoff worth it?
There's some good stuff in there.
However, it seems a few organisations have actually beaten us to it.
Apple, for example, uses the 3D aspect of the GPU to accelerate its 2D compositing system with Quartz Extreme. Microsoft, as usual, announced the feature after Apple shipped it, and with any luck Windows users might have it by 2007.
-- james
Long live the Transputer. Or I'll settle for just a lowly DSP, hiding out in a sound-card, or hard drive.
"Small steps though they may be, Apple is, as seemingly always is the case, ahead of the game."
The guys at Next would agree with you.
It is when you are holding a .45 revolver...
Please flee in terror in an orderly manner.
There have been studies of the 3dNOW! capabilities of AMD processors in just such a capacity.
I am a novice in a lot of these discussions so I don't post much. Let me see if I understand this:
The graphics card has a lot of unused computing power, nearly equal to the main processor chip in the computer if not more, that is not being used when there is no game or video being played, right?
Is there no way to tap into this power?
Perhaps it could be used for the main display on the computer (I think you guys call it GUI?)?
What else could it be used for?
Could Linux be modified to make use of this power?
Just a know nothing, nobody with questions.
J
Now I finally understand that acronym: General purpose unit!
Now I can make a PDF from the HTML version of the PDF document that grandparent hunted down on Google!
Why do we want to do this again?
Information doesn't want to be anthropomorphized anymore.
These applications are not likely to generate or process data at such a rate that the slow AGP read speed will matter that much, if at all.
The Internet's nature is peer to peer - 20050301_cs_profs.pdf
I have a dedicated video processing card that uses a SPARC core as part of one of its chips. The rest is basically a DSP, with a video mixer/switcher on board. So basically video offloading has been around for years, but only recently affordable in the prosumer space. No need to hack a GPU to get results.
That would often waste resources. If you needed a 3x3 multiply and the hardware only supported 4x4, it ends up doing roughly twice the work needed.
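The "roughly twice" figure can be checked with a quick count. The sketch below (plain Python, helper names made up for illustration) pads a 3x3 problem into a 4x4 one, confirms the useful answer is unchanged, and compares the number of scalar multiplies for the naive algorithm.

```python
def matmul(a, b):
    """Naive product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def pad_to_4x4(m):
    """Embed a 3x3 matrix in the top-left corner of a zero-filled 4x4."""
    return [row + [0] for row in m] + [[0, 0, 0, 0]]


def scalar_multiplies(n):
    """Scalar multiplies in a naive n x n by n x n product."""
    return n ** 3


a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]

padded = matmul(pad_to_4x4(a), pad_to_4x4(b))
top_left = [row[:3] for row in padded[:3]]

print(top_left == matmul(a, b))                     # True
print(scalar_multiplies(3), scalar_multiplies(4))   # 27 64
print(round(scalar_multiplies(4) / scalar_multiplies(3), 2))  # 2.37
```

So the padded version does 64 multiplies where 27 would do, about 2.4 times the work, assuming the hardware really performs the full 4x4 product and cannot skip the zero row and column.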
Well they already make DSP cards for audio processing. Simply do a google(TM) search for "DSP card" and you will get several vendors.
I can't imagine it would take a whole lot to hack them for just their processing power outside of audio applications.
hi!
with all this fft & wavelet talk, does anyone know of a gpu version of oggenc?
When I say oh shut the fuck up.
Sorry for the flames, but seriously, I get so damn sick of all the "all new games suck" whiners. Look, there are legit reasons to want new technology. It is nice to have better graphics, more realistic sound, etc. It is NICE to have game that looks and sounds more like reality. Yes, that doesn't make the game great, but that doesn't mean it's worthless.
What's more, don't pretend like all modern games suck while old games ruled. That's a bunch of bullshit. Sure, there are plenty of modern games that suck, but guess what? There are tons of old games that suck too. Thing is, you just tend to forget about them. You remember the greats that you enjoyed or heard about, the ones that helped shape gaming today. You forget all the utter shit that was released, just as is released today.
So get off it. If you don't like nice graphics, fine. Stick with old games, no one is forcing you to upgrade. But don't pretend like there is no reason to want better graphics in games.
I did a paper on the topic of general-purpose GPU programming for my parallel computing course just this last semester here, interestingly enough. I believe our research indicated that even a single PCI card was so badly throttled by the bus throughput that it was basically useless. AGP does a lot better taking data in, but it's still pretty costly sending data back to the CPU. I have a feeling your proposed setup will be a whole lot more feasible if/when PCI Express becomes mainstream.
Remember the story about PS2's being used in Iraqi WMDs? No doubt the next "outlaw state" will be accused of using GeForce Ti4600's to manage fast breeder reactors.
When I am king, you will be first against the wall.
1) Patent the idea of using spare GPU cycles to do non-graphic related computational work. 2) ????? 3) Licensing the idea to ATi and nVidia 4) Profit!
What I remember about co-processing cards and "intelligent peripheral cards" (like raid controllers or network cards with an onboard processor) is this:
There is a certain overhead because a communications protocol is to be established between the main processor and the co-processor. For simple tasks the main processor often stops and waits for the co-processor to complete the task and retrieves the results. For more complicated tasks, the main processor continues but later an interrupt occurs that the main processor must service.
You must be very careful or the extra overhead of this communication makes the execution of the task slower than without the co-processor. This is certainly going to happen at some time in the future, when you increase central processor power all the time but keep using the same co-processor.
For example, your matrix co-processor needs to be fed the matrix data, start working, and tell it is finished. Your performance would not only be limited by the processor speed, but also by the bus transfer rate, and by the impact those fast bus transfers have on the CPU-memory bandwidth available and the on-CPU cache validity.
When you are unlucky, the next CPU you buy is faster in performing the task itself.
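The trade-off described above fits in a few lines. This is a back-of-the-envelope model, not a measurement: every number below is hypothetical, and the function and parameter names are invented for the sketch. Offloading only wins when upload time plus device time plus readback time plus fixed overhead is less than the CPU doing the job itself.

```python
def offload_seconds(bytes_in, bytes_out, upload_bytes_per_s,
                    readback_bytes_per_s, device_seconds,
                    fixed_overhead_seconds=0.0):
    """Total wall time for shipping work to a co-processor and back."""
    return (fixed_overhead_seconds
            + bytes_in / upload_bytes_per_s
            + device_seconds
            + bytes_out / readback_bytes_per_s)


def offload_pays_off(cpu_seconds, **offload_args):
    """True when the round trip beats doing the work on the CPU."""
    return offload_seconds(**offload_args) < cpu_seconds


# Hypothetical job: two 1500x1500 float32 matrices in, one out.
job = dict(
    bytes_in=2 * 1500 * 1500 * 4,
    bytes_out=1500 * 1500 * 4,
    upload_bytes_per_s=1e9,      # assumed fast path to the card
    readback_bytes_per_s=100e6,  # assumed slow path back
    device_seconds=0.5,          # assumed co-processor compute time
)

print(round(offload_seconds(**job), 3))             # 0.608
print(offload_pays_off(cpu_seconds=1.6, **job))     # True
print(offload_pays_off(cpu_seconds=0.5, **job))     # False
```

The last line is the unlucky case: keep the same card, buy a CPU that finishes in 0.5 seconds, and the co-processor is now a net loss.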
With dual-core CPUs going to be the norm, why not a dual-core GPU for even faster gfx cards? With everyone wanting 16x antialiasing at 1600x1200 at over 100fps, it's gonna take some very powerful GPUs (or some dual cores).
Even with the ATI 800XT, 1600x1200 can dip below 30FPS with AA/AF on higher settings. Still a ways to go for that full virtual reality look.
I've been thinking about using the GPU for audio DSP work for some time, even got to a point where I could transform some signal by "rendering" it into a texture (in a simple way, I could mix two sounds using the alpha as factor).
The problem is that these cards are made to be "write only" and that basically fetching back anything from them is *very* slow, which makes them totally useless for the purpose, since you *know* the results are there, but you can't fetch them in a useful/fast manner.
I wonder if it's deliberate, to sell the "pro" cards they use for the rendering farms
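The mixing trick mentioned above (two sounds blended with alpha as the factor) is the standard alpha-blend formula applied per sample instead of per pixel. Here is a CPU-side sketch in plain Python; the function name is made up, and it only shows the arithmetic that the blending hardware would be doing.

```python
def alpha_mix(signal_a, signal_b, alpha):
    """Per-sample blend: alpha * a + (1 - alpha) * b."""
    if len(signal_a) != len(signal_b):
        raise ValueError("signals must have the same length")
    return [alpha * a + (1.0 - alpha) * b
            for a, b in zip(signal_a, signal_b)]


print(alpha_mix([1.0, 0.0], [0.0, 1.0], 0.25))  # [0.25, 0.75]
```

Each output sample depends on exactly one sample from each input, which is why it maps so neatly onto writing one texel per sample. Getting the result back off the card is the separate problem described above.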
Then there is a quantum change in CPU design, and CPUs catch up with GPUs (at least on paper). Then someone finds out something new and cool they can do by pushing these CPUs to the limit, but only just. Then, the scientists and engineers decide it's useful and someone makes expensive dedicated hardware. Then the gamers find out about it and buy less expensive but $1000 hardware. Then there is a period of frenzied competition and it comes down to $100.
Then, someone goes and thinks of something new and cool to do with this cheap high-powered hardware....
Stick Men
Could this mean that we could evolve a "back-door" way to dump the disgusting x86 architecture? Think about it - we devise a universal OS way in both Linux/Windows of allocating tasks/threads to "external" RISC processors. At some stage, these can be the "main" processors, able to run the host/boot-up/old code under emulation. Then, dump the 386!
Think about it..
"You lied to me! There is a Swansea!"
QE is cool, but it doesn't do anything similar at all to what they're talking about here. FFTs on an NV30 are only incidentally related to texture mapping window contents. Check out gpgpu.org or BrookGPU. In a sense, the idea is to treat modern graphics hardware as the next step beyond SIMD instruction sets. Incidentally, e17 exploited (hardware) GL rendering of 2D graphics via evas a bit before Apple put that into OS X.
This concept was being used back in 1988. The Commodore 64 (1 MHz 6510, a 6502-like microprocessor) had a peripheral 5.25" disk drive called the 1541, which itself had a 1 MHz 6502 CPU in it, connected via a serial link.
It became common practice to introduce fast loaders: these were partially resident in the C64, and also in the 1541: effectively replacing the 1541's limited firmware.
However, demo programmers figured out how to utilise the 1541: one particular demo involved uploading a program to the 1541 at the start, then upon every screen rewrite, uploading vectors to the 1541, on which the 1541 would perform calculations in parallel with the C64. At the end of the screen, the C64 would fetch the results from the 1541 and incorporate them into the next screen frame.
Equally, GPU provides similar capability if so used.
You're absolutely correct that these "game snobs" are looking at the past through rose-colored graphics, forgetting all of the stinkers of yesteryear. However, it's not just games where this applies. How many times have you heard people complain about how bad movies are now, or music, or books? It's exactly the same phenomenon. When your grandfather tells you how much better things were "back in the day", it's for exactly the same reason. He's looking back at all the good things, while ignoring all of the bad.
Face it, everything mostly sucks. It always has, and it always will. There will always be some gems that really stand out, and those will be what are remembered when people fondly look back on "the old days". Get over it.
Please mod parent up: +5, Funny!
Sincerely,
Pan Tarhei Hosé, PhD.
"Homo sum et cogito ergo odi profanum vulgus et libido."
They say the GPU outperforms the CPU, but they do not test it fairly: They did not use SIMD instruction sets such as SSE for the CPU code, but coded it in "straight C++". If they had, the CPU code would probably have run faster than the GPU version (at least for the matrix multiplication.)
"Microsoft, as usual, announced the feature after Apple shipped it"
God I'm tired of hearing that phrase over and over again when 95% of the time it's just because Apple can control the hardware and it would be a total disaster if MS included a technology as fast as they do...
My Sig: SEGV
Math co-processor boards would be great, but still quite fixed-function.
It would be much more efficient to implement a co-processor with an FPGA. First you program the FPGA with the functions to execute, then you feed the data to it. When the calculation is completed you just reprogram it to become whatever you want.
This way you would not have a math-only board, but a board that could perform many, many functions. You just need to write algorithms to exploit them.
Are there interrupts? Available to userland?
The 'acceleration' layer (DirectX, Xv?) is not even available to the programmers. The programmer requests from DirectX or SDL to draw a polygon. Then DirectX or SDL invoke acceleration features of the card. But we do not have direct access to those features. They are not even documented.
Will the kernel provide those facilities?
Because it would be stupid to go through SDL to perform FFT with the video card's capabilities.
I think the real reason Apple comes out with newer and better technology is because they have to fight for their user base. After all, if Apple's products were the same as Microsoft's, who would care?
Microsoft can afford to be lazy with their products, they make money either way. I don't think that will last forever though. Sometimes they do try hard, NT for example, but then they pile a bunch of poorly designed stuff on top of it and that ruins it. If you can, check out OS X's directory structure, it's beautiful. Now compare that to Windows' cryptic system...
"Microsoft, as usual, announced the feature after Apple shipped it"
"God I'm tired of hearing that phrase over and over again when 95% of the time it's just because Apple can control the hardware and it would be a total disaster if MS included a technology as fast as they do..."
I didn't say why it happens, I just said that it happens (ie MS announces the product after Apple). The original comment stands.
Wouldn't it be a nice experiment to use multiple next-generation PCI-X "graphic" cards on one system as the ultimate set of matrix co-processors? You could create your mini-cluster inside one box! Anyone seen or tried this yet?
However I do know that a lot of people had been wondering about this for a while, could it be done, and was it worth attempting, so now we know. Maybe we shall soon see PCI cards containing an array of GPUs, I imagine the cooling arrangements will be quite interesting!
There are other things which are faster than a typical CPU, are not some of the processors in games machines 128-bit? Again, you could in theory put some of these together as a co-processor of some sort.
This was a good piece of work technically, but it says something about society that the fastest mass-produced processors, whether for GPUs or games consoles, exist because people want a higher frame rate in Quake. I can't think of any professional application that needs really fast graphics output, but many that could use faster processing. So why can't Intel and AMD stop putting everything in the one CPU (multiple CPUs with one memory are not really much better), and make co-processors again, which will do fast matrix operations on very large arrays, etc, for those who need them? The ultimate horror of the one CPU philosophy was the winmodem and winprinter, both ridiculous. Silicon is in fact quite cheap, as Nvidia have proved, people's time while they wait for long calculations to finish is not.
Maybe we are going to see an architectural change coming, I expect it will be supported by FOSS long before Longhorn, just like the AMD64.
Remember NVIDIA's Gelato, which was released a few weeks ago?
Gelato uses the GPU as a floating point processor, in addition to the CPU. I would still love to see a movie rendered in realtime with OpenGL, though.
What's really needed is to couple the GPU and CPU in such a way that the GPU actually runs a very low level O/S, like an L4Ka style kernel (http://l4ka.org/), and becomes "just another" MP resource.
Then, on top of this low level, actually runs the UI graphics driver and so on. Other tasks can also run, but ultimately the priority is given to the UI driver.
Then, the O/S on the CPU needs to be able to know generally how to distribute tasks across to the GPU. Fairly standard for a tightly coupled MP that has shared bus memory.
Why do I say this? Because the result is
(a) if you're using an especially high performance application, the GUI runs full throttle dedicated to rendering/etc and acts as per normal;
(b) if you're not, e.g. when running Office, engineering or other compute-intensive tasks (such as recoding video without displaying it), then the GPU is just another multiprocessor resource to soak up cycles.
Then, CPU/GPU is just a seamless computing resource. The fantastic benefit of this is that if the O/S is designed properly, then it could allow simply buying/plugging in additional PCI (well, PCI probably not good because of low speed, perhaps AGP?) cards that are simply "additional processors" - then you get a relatively cheaper way of putting more MP into your machine.
I have programmed in assembly for 20 years, and have also done some assembly programming of DSPs and some chip design. What strikes me with this background is the strangely stagnant state of CPU architecture development when you compare it with other related fields, where GPUs are the latest addition.
Just look at the newsgroup news:comp.arch which allegedly is dedicated to computer architecture but in reality is about computer archaeology, discussing computers of yesteryear.
For instance, most DSPs include data pumps and zero overhead loops that would be useful in general purpose CPUs. You also have MACs (admittedly more specialised), max/min functions (often useful) and dataflow features (would definitely be nice).
Newer architectures feature (sea of) sub processors, multiple busses and pathways. Yet CPUs steadfastly stand still, MMX, SSE1/2/3 notwithstanding.
Many DSPs on sound cards are reprogrammable, why distributed computing projects like SETI and the like have not taken advantage of these is beyond me. SETI does not even use multithreading features of recent x86 CPUs.
Now the thrust seems to be in GPUs, and from what I can see they are taking on more and more features, reconfigurability and reprogrammability. All this is very exciting and relevant if the next generation of personal VR systems is to become a reality. VRML was way too slow to be of practical use and still did not have body or head tracking, or even eye tracking. All this requires a lot of 3D transformation to work seamlessly. Perhaps within a few years we could have semi-VR like the one shown in Minority Report.
I agree with many other posts here that massive computational power is wasted on fancier textures in games of yesteryear. The killer application for more power is, I believe, in personal (mobile) virtual reality.
Remember the co-processors? Well, actually I don't (I'm a tad to young). But I know about them.
Dig deeper. 8087 FPU's were nice, though they ran hot enough to cook on, but the idea had existed for 15 or more years before they appeared. Try looking into the old DEC PDP-11 archives. There you'll find DEC's own "CIS" or "commercial instruction set", which was a set of boards (later a add on chip) that added string, character and BCD math instructions. DEC also had a FPU card set that implemented a 64-bit FPU out of AMD 2901 bit slice processors. Many low-budget not-quite-supercomputers were really add-on hardware boxes to a general purpose computers. Basicly add-on stunt boxes.
Dam... I'm too young to feel this old! Most of this stuff was in play when I was in grade school.
Temkin
But I learned not all PDFs open in Windows, like this -- click on the link "Clique aqui para obter o arquivo", which means "Click here to get the file".
That opens normally in Konqueror, but didn't open at work in IE, under Windows 2000 through a proxy, either with Adobe 5 or 6 (well, it opened once with W2K+IE+Adobe 6, on a Xeon 2.8GHz -- probably due to some OS or browser configuration).
It is still possible to save to disk on W2K and read with Adobe 5 from the disk file, though. But it shows something is making PDFs not so universally readable as one might previously think.
Could this be integrated with a compiler, so that the compiler could elect to use the GPU? That would be really cool:
gcc --with-gpu somebigprog.c
People in /. can't even understand your joke.
I knew most don't have a life, but now it seems many can't remember what having a life looks like.
Citing sources is of paramount importance in scientific discussions. This is not "I think I saw somewhere" stuff...
This guy is obviously a pro, or a clever dude, if he's still a kid. And he's got a 4. Deserved, IMHO.
fuck Adobe and fuck their shitty bloated PDF, on Windows i can choose Adobe Acrobat Reader or command line xpdf crap and thats it ! so much for open formats egh ?
thanks but id rather not read a PDF than use Adobes shite reader and its not 1980 so im not using command line, send me a jpg or text file or if you want web people to read it (this is the internet) HTML
According to http://www.cs.swan.ac.uk/~csneal/HPM/alpha.html...
G4 - 33 million
G5 - 52 million
And from IBM...
G3 (750cx) - 22 million
I swear by MacOS X. Although I used to swear *at* MacOS 9...
I'd imagine it goes in cycles as communication technologies change - for 1985-1995, the multi-custom-chip amiga design ran rings around the PC, then the PC caught up by brute force. Now may be the time for PCs to go multi-custom-chip for a while.
Dual heads = SETI for TeamSlashdot *and* TeamARS!
I know that was supposed to be a joke but please stop spreading lies. It makes you look unintelligent, and that joke hasn't been funny in years anyway.
Time makes more converts than reason
I've already rendered movies in OpenGL...
mplayer -vo gl dvd://1
:P
--Quentin
Just look at the matrix multiplication case. Look at the graph and see that 1000x1000 takes 30 seconds on CPU and 7 seconds on GPU. Let's translate it to millions of operations per second: CPU -> 33 Mop/s, GPU -> 142 Mop/s. Matrix multiplication has cubic complexity, so for the CPU: 1000 * 1000 * 1000 / 30 seconds / 1000000 = 33 Mop/s
Now think a while: 33 million operations per second on a 1.5 GHz Pentium 4 with SSE (I assume there is no SSE2). Pentium 4 has a fused multiply-add unit which makes it do two ops per clock. So we get 3 billion ops per second peak performance! What they claim is that the CPU is 100 times slower than that for matrix multiply. That is unlikely. You can get 2/3 of peak on Pentium 4. Just look at the ATLAS or FLAME projects. If you use one of these projects you can multiply 1000x1000 matrices in half a second: 14 times faster than the quoted GPU.
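The numbers in this post can be checked with a few lines of Python. This is only a sketch of the poster's own arithmetic: the 30 s and 7 s timings are the ones read off the graph, the n^3 operation count and the "two ops per clock at 1.5 GHz" peak are the post's assumptions, and the function name is made up for illustration.

```python
def mops(n, seconds):
    """Millions of operations per second for an n x n matrix multiply,
    counting n**3 operations as the post does."""
    return n ** 3 / seconds / 1e6

cpu = mops(1000, 30.0)    # CPU time read off the graph (approximate)
gpu = mops(1000, 7.0)     # GPU time read off the graph (approximate)
peak = 1.5e9 * 2 / 1e6    # assumed peak: 1.5 GHz, two ops per clock, in Mop/s
tuned_speedup = 7.0 / 0.5 # tuned CPU library at half a second vs. the GPU

print(f"CPU {int(cpu)} Mop/s, GPU {int(gpu)} Mop/s")
print(f"measured CPU rate is {cpu / peak:.1%} of the assumed peak")
print(f"tuned CPU vs. quoted GPU: {tuned_speedup:.0f}x")
```

The measured CPU rate comes out at roughly one percent of the assumed peak, which is the gap the post is complaining about.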
Another thing is the floating point arithmetic. GPU uses 32-bit numbers (at most). This is too small for most scientific codes. CPU can do 64-bits. Also, if you use 32-bits on CPU it will be 4 times as fast as 64-bit (SSE extension). So in 32-bit mode, Pentium 4 is 28 times faster than the quoted GPU.
Finally, the length of the program. The reason matrix multiply was chosen is because it can be encoded in very short code - three simple loops. This fits well with the 128-instruction vertex code length. You don't have to keep reloading the code. More challenging codes will exceed the allowed vertex code length. The three-loop matrix multiply implementation stresses memory bandwidth. And the CPU has MB/s and the GPU has GB/s. No wonder the GPU wins. But I can guess that without making any tests.
Years ago AT&T made something called the pixel machine which was a bunch of DSPs in a box and in effect a cluster of GPUs. There are rumors that the spooks used them for DES cracking.
Touche. However, with the upcoming advances in bus speeds (read: PCI Express) and the available bandwidth to the PCI bus, we won't have to worry about latency when using a coprocessor type piece of hardware. There is room to grow with this new bus to almost outlandish amounts of bandwidth. Not a problem we'll run into any time soon.
Listen to my experimental-industrial-techno!
Here's a paper from Columbia University on using GPUs to accelerate cryptographic calculations.
Could those cycles be donated to the main CPU in some sort of SMP scheme?
Berto
pci express will certainly help speed up transfers in that kind of system too
Nvidia has already announced Gelato, which uses the GPU to render normally CPU-intensive frames for video production. The link is here, and the film industry apparently already uses this. I think that it requires the increased 2-way bus bandwidth that PCI-Express offers to be of any use, but it's interesting nonetheless. I suspect that with PCI-Express MoBos becoming more prevalent there will be a new market for arming PCs with non-function-specific (e.g. not dedicated to graphics) co-processors that can assist with processing-intensive tasks.
This is known as the cycle of reincarnation.
1. Is anyone except Apple trying to leverage the GPU for non-3D tasks? Apple has been doing Quartz Extreme for a while but I have not heard if anyone else is doing it.
2. Has anyone tried something similar to what Quartz Extreme does but for non-graphical tasks?
3. How come GPU makers are not trying to make a CPU by themselves?
Pedro
----
The Insomniac Coder
Having done similar work for my final year project this year, I have some experience attempting general purpose computation on a GPU. The results that I received when comparing the CPU with the GPU were very different, with many of the applications coming in at 7-15 times slower on the GPU. Further, I discovered some problems which I mention below:
! Matrix results
As mentioned earlier in the report, the graphics pipeline does not support a branch instruction. So with a limited number of assembly instructions that can be executed in each stage of the pipeline (either 128 or 256 in current cards), how is it possible for them to perform a 1500x1500 matrix multiplication? To calculate a single result, 1500 multiplications would need to take place, and even if they are really clever about how they encode the data into textures to optimise access, they would need two texture accesses for every 4 multiplications. By my calculations that is 1875 instructions, where you can only do 128 or 256.
My tests found that, using the Cg compiler provided by NVidia, a matrix of size 26x26 could be multiplied before the unrolling of the for loop exceeded the 256 limitation.
One aspect that my evaluation did not get to examine was the possibility of reading partial results back from the framebuffer to the texture memory along with loading a slightly modified program to generate the next partial result. They don't mention if they used this strategy so I assume that they didn't.
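The partial-result strategy described above can be sketched on the CPU in plain Python. This is not the paper's method, which is unknown; it only illustrates the idea, with `max_terms` standing in (as an assumption) for however many multiply-adds fit in one pass, and the `result` list standing in for the framebuffer that gets read back as a texture.

```python
def matmul_multipass(a, b, max_terms):
    """Multiply square matrices a and b (lists of lists) in several passes.

    Each pass adds at most max_terms products to every output element,
    mimicking a program limited to a fixed instruction count. The running
    result plays the role of the framebuffer fed back in as a texture.
    """
    n = len(a)
    result = [[0.0] * n for _ in range(n)]      # "framebuffer" contents
    passes = 0
    for start in range(0, n, max_terms):        # one pass per slice of k
        stop = min(start + max_terms, n)
        for i in range(n):
            for j in range(n):
                partial = result[i][j]          # partial result so far
                for k in range(start, stop):
                    partial += a[i][k] * b[k][j]
                result[i][j] = partial
        passes += 1
    return result, passes

a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
product, passes = matmul_multipass(a, a, max_terms=2)
print(product, passes)
```

The cost of this approach on real hardware would be the extra readback and program reload per pass, which is exactly the part the post says was not measured.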
! Inclusion of a branch instruction
Even if a branch instruction were to be included in the vertex and fragment stages of the pipeline, it would cause serious timing issues. As a student of Computer Science, I have been taught that the pipeline operates at the speed of the slowest stage, and from designing simple pipelined ALUs, I see the logic behind it. However, if a branch instruction is included then the fragment processing stage could become the slowest, as the pipeline stalls waiting for the fragment processor to output its information into the framebuffer. I believe it is for this reason that the GPU designers specifically did not include a branch instruction.
! Accuracy
My work also found a serious accuracy issue with attempting computation on the GPU. Firstly, the GPU hardware represents all numbers in the pipeline as floating point values. As many of you can probably guess, this brings up the ever present problem of 'floating point error'. The interface between GPU and CPU traditionally uses 8-bit values. Once they are imported into the 32-bit floating point pipeline, the representation has them falling between 0 and 1, meaning that these numbers must be scaled up to their intended representations (integers between 0 and 255, for example) before computation can begin. Combine these two necessary operations and what I saw was a serious accuracy issue where five of my nine results (in the 3x3 matrix) were one integer value out.
While I don't claim to be an expert on these matters, I do think there is the possibility of using commodity graphics cards for general purpose computation. However, using hardware that is not designed for this purpose imposes some serious constraints, in my opinion. Anyone who cares to look at my work can find it here
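One way an off-by-one like this can arise is sketched below. It emulates 32-bit floats on the CPU with `struct`, which is an assumption and not how any particular card behaves, and it assumes the final conversion back to an integer is done either by truncation or by rounding; the helper names are made up.

```python
import struct

def to_float32(x):
    """Round a Python float to the nearest 32-bit float (emulation only)."""
    return struct.unpack("f", struct.pack("f", x))[0]

def round_trip(v, restore):
    """Map an 8-bit value into [0, 1], scale back by 255, restore an int."""
    normalized = to_float32(v / 255.0)        # value as seen in the pipeline
    scaled = to_float32(normalized * 255.0)   # scaled back up for computation
    return restore(scaled)

truncated = [round_trip(v, int) for v in range(256)]
rounded = [round_trip(v, round) for v in range(256)]

print("changed by truncation:", [v for v in range(256) if truncated[v] != v])
print("changed by rounding:  ", [v for v in range(256) if rounded[v] != v])
```

Any value whose scaled result lands just below the intended integer drops by one under truncation, while rounding to nearest recovers every input in this emulation.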
well if you want to use DSP chips for processing power, just look here.
DSP PCI card 16 GFLOPS, example - 1 million point Arctan2 2.63 msec. i wonder if they have linux drivers?
...vividly encapsulates that post-Watergate/pre-punk/coked-up moment when you could trust no one, least of all yourself.
As GFX cards get closer and closer to CPUs, Intel are going to fight. Expect the next generation of Pentium to make GPUs obsolete.
And where have you been for the past five days? The paleoanthropology thing again? Inquiring stalkers want to know! ;)
"To confine our attention to terrestrial matters would be to limit the human spirit." -Stephen Hawking
Wouldn't the PCI/AGP architecture be a major bottleneck if the GPU was used as a general processor?
True. But even so, I don't think MS could've pulled it off before Longhorn, especially since it seems like a much more ambitious change than Quartz Extreme was.
:-)
That said, there are a lot of other examples of people accusing Microsoft of "copying" completely obvious improvements from Apple. I thought we liked the concept of sharing ideas here at slashdot?
P.S. My desktop is an x86, but I also have a PowerBook, and yeah, OSX is good.
My Sig: SEGV
I mean, cmon, if you're going to rehash old ideas, at least do your work thoroughly. There was another comment that mentioned the theoretical peak bandwidth of the P4 being quite a bit larger. That's issue one. Issue two is: the guy didn't even bother trying to recode in SSE2. Writing in C++ and compiling with optimizations doesn't get you there. You can get something like 2.5x when doing a matrix multiply in SSE land (depending on the version of the hardware you're using). Third, the P4EE is irrelevant because he did his tests on a Willamette! Note, too, that his 1500x1500 matrix is way too large for the CPU cache, so the whole process is memory bound, which shows you nothing about the computational power of the CPU. The GPU is always going to have the memory traffic, but it has a higher speed memory interface to its internal texture memory, which implies that the GPU wins if you're just doing memory reads (though this is somewhat of a questionable argument because the GPU still has to contend with the AGP bus). Next, the guy goes on to try to analyze the bottlenecks in GPU design when he apparently didn't even try to figure out why his own damn code ran so slow. Very weak, I say.
GPUs are great and all, but they're not the be-all end-all of computation. You can do simple mathematical operations (as long as you don't care about high precision math), but you don't get things you need for crypto or video encoding, like bit operations and good branching (though they are moving toward better dynamic branching). So yah, you can do math on a graphics card...whoopdie doo.
Good luck finding a company that will actually code for a GPU. Unless the instruction sets and CPU interfaces stabilize, nobody will want to code for it. Standardized platforms. There is code on Wintel platforms that was coded 8 years ago and still runs. Good luck finding that with a graphics app.
Slashdot: News for over 90% of the mass general public.
There I am, console mode at the command line trying to compile the latest kde release.
...but then without this division there wouldn't be sound and graphics card companies.
I might have a dedicated card:
- for DSP audio
- a dedicated AGP card for graphics
- another PCI graphics card for xinerama
- a PCI VPN card
And yet only the CPU is being used (mostly). This is nuts
Gimmie gimmie my GPU (glx) accelerated GCC!
A blog I run for the wealth
All of the material at that site credits one Zoë Wood as a co-authoress.
images.google.com served up this:
and this: Yowza!!! Somebody needs to introduce me to that chick.
http://www.multires.caltech.edu/pubs/GPUSim.pdf
conjugate gradient and multigrid solver on the gpu
http://wwwcg.in.tum.de/Research/data/Publications/sig03.pdf
linear algebra operators on the gpu
GPUs pass input and output from GPU memory at 4-12 bytes per flop. This is much faster than CPUs, which are limited by bus speeds that are likely to deliver a number only every several operations. So CPU benchmarks are bogus, using algorithms that use internal memory over and over again.
It's not always easy to reformulate algorithms to fit streaming memory and other limitations of GPUs. This issue has come up in earlier generations of custom computers. So, there are things like cyclic matrices that map multi-dimensional matrix operations into 1-D streams, and so on.
The 2003 SIGGRAPH had a session on this topic showing you could implement a wide variety of algorithms outside of graphics.
Some day you may be able to Fold proteins with your GPU.
Actually, there is a decent trick that can be employed to increase performance when you are faced with computing "something branchy" such as a binary tree on hardware intended for massive matrix manipulation.
Collapse *all* the possible branches into a connectivity matrix and reformulate the algorithm into a parallel one rather than a sequential one. This is pretty much the exact opposite of what Hillis did back in the early 1980s with his *LISP implementations of serial algorithms on massively parallel Connection Machine hardware.
It isn't mindblowing, it just requires a little creativity and mental dexterity. I've used similar tricks myself.
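As a rough illustration of the connectivity-matrix idea, the sketch below walks a binary tree level by level using nothing but matrix-vector products, so every node is handled by the same multiply-accumulate with no per-node branching. The heap-style tree layout and all function names are assumptions chosen for the example, not anything from the post.

```python
def children_matrix(n):
    """Connectivity matrix of a binary tree stored heap-style:
    node i has children 2*i + 1 and 2*i + 2 when those indices exist."""
    m = [[0] * n for _ in range(n)]
    for parent in range(n):
        for child in (2 * parent + 1, 2 * parent + 2):
            if child < n:
                m[child][parent] = 1    # column = from, row = to
    return m

def step(m, frontier):
    """One matrix-vector product: all active nodes advance to their children."""
    size = len(frontier)
    return [sum(row[j] * frontier[j] for j in range(size)) for row in m]

def levels(n):
    """Return the node indices on each level, found by repeated products."""
    m = children_matrix(n)
    frontier = [1] + [0] * (n - 1)      # start with only the root active
    out = []
    while any(frontier):
        out.append([i for i, x in enumerate(frontier) if x])
        frontier = step(m, frontier)
    return out

print(levels(7))
```

The trade is the usual one: a lot of multiplications by zero in exchange for a regular, data-parallel shape that matrix hardware handles well.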
If you want to see it best demonstrated, try an old-new game comparison of great games. One I like is Doom and Unreal Tournament 2004. The bots in UT 2004, while not as cunning as a human, are pretty damn good. They work together, they understand how to retreat, how to dodge, etc. They actually feel like fairly realistic opponents at lower skill levels (at higher levels their speed and accuracy outstrips their tactics). The enemies in Doom, on the other hand, are STUPID. They have basically only one strategy: follow player, attack them. It's easy to kite them around and draw them into traps. Likewise dodging their shots is simple since they always shoot at where you are now, not where you might dodge to.
Doom is another excellent example since it has been updated with modern graphics. The Doomsday/jDoom project (www.doomsdayhq.com) has rewritten the Doom engine to use OpenGL. Other people have contributed 3d models for enemies, high resolution textures, and quality music (me). The result is something far and away better looking and sounding than the original Doom. It has made Doom better, not worse. You thought the music was spooky when your little FM card was squawking it out? Try it now.
Also the graphics cards DO lead to better AI. AI is processed by your CPU and to do good AI, you need a lot of CPU time. That is one of two chronic problems with game AI. You just don't have enough CPU time to spend on it without screwing the game engine (the other is that it is just generally hard to write realistic AI algorithms). Well, modern graphics cards are taking most of the graphics load off the CPU, so there is more time available for other tasks, like AI.
If you can, check out OS X's directory structure, it's beautiful.
Utter crap, fanboy. OS X's directory structure is a basic UNIX system hidden by the file manager, with applications thrown on to '/'.
Quote from that topic: "Reminds me of the good old days when you used the processors in the C64 tapedrive to compute stuff. Wouldn't want to waste those precious cycles."
I wonder if that is where they kept their porn back then also.
Table-ized A.I.
This also could help bring more advanced methods of 3d animation to smaller, cheaper computers/studios.
Incidentally, e17 exploited (hardware) GL rendering of 2D graphics via evas a bit before Apple put that into OS X.
How does that compare? e17 still hasn't been released. Quoth enlightenment.org : Sat May 1 - benr - Enlightenment DR16.7-pre1 Released What you're saying is that e17 was announced before Quartz Extreme was released.
Why do you jackasses have to tack on lies at the end of your otherwise informative posts?
Just because it isn't stable doesn't mean it doesn't exist. Tell all the people running e17 apps like entrance that they really don't have hardware rendering.
This raises the question: would it be possible to get a fleet of systems with multiple AGP ports, so as to utilize multiple GPUs in each machine for scientific and/or industrial purposes (for instance, in the film industry)? Or does the AGP spec not allow for a second AGP slot?
I wonder how quickly someone could break top-notch encryption with 10 servers x 5 GPUs each.
~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
Since I get the sense that some people would consider me one of the snobs to which these messages refer, I'll respond.
*YES*, there were plenty of awful games back in the old days, too. The Atari 2600 died as a system, not because of advancing technology, but because of the flood of awful games for it. Nintendo started their infamous licencing plan, not just as a way to control the market for their system and thus make a pile of cash, but also to avoid the situation where hundreds of third-party companies could make horrendous games for their system and damage its reputation. Even so, there was still Color Dreams and the other unlicenced 3rd party developers, and most (but not all - check out Krazy Kreatures) of their games were not worth the silicon they were stamped on.
Old arcade games, likewise, had their share of awful titles. And some older games have game play that doesn't hold up to today's standards. (I actually consider Donkey Kong to be one of these.)
But it's still a fact that many recent games don't offer much more over the games upon which they were based than upgraded graphics. I, and I suspect most of the people who are being slammed here, argue that improved graphics and sound are nice, but not enough in themselves.
Furthermore, think about the number of old games that are still remembered today. There are actually a *lot* of them, most released in a surprisingly short period of time. Space Invaders, the first extremely popular game, was released in 1978. Pac-Man was released in 1981. The Crash was late 83/early 84. That's just six years, and arcades got Robotron, Defender & Stargate, Joust, Asteroids & Asteroids Deluxe, Tempest, Donkey Kong & Jr, Frogger, Sinistar, Battlezone, Pac-Man & Ms. & Super, Centipede & Millipede, Pole Position, Tron, Q*Bert, Star Wars, and a good number of games I've neglected. Look at arcades today, hell I'll even let you include consoles and PCs, and try to argue there's a similar amount of innovation.
Further, all of these games are substantively different from each other. Even most obvious sequels here, the Pac-Man trilogy, had enough differences that a player good at one might not be at one of the others: Ms. Pac-Man was resistant to patterns, and Super Pac-Man had the keys/doors play mechanic that allowed the player's actions to change the maze.
Now there *were* blatant-rip-offs in those days. There were many Pac-Man clones produced, most inferior to the original game and some that were basically the original game with hacked graphics. Many of these were bootlegs. And there were games that were released with minor graphics and gameplay changes, like Super Zaxxon.
But in general, it was a lot easier then to make a game with a unique play mechanic, try to sell it and succeed than it is now. That's why old arcade and console games get released in 24-packs for $20 these days and sell pretty well. Nostalgia is certainly a factor, but it helps that the games stand up.
The peak throughput of AltiVec is about twice that of SSE. P3/4 and K7/8 can all do 2 single precision adds and 2 muls per cycle, while G4/5 can do 4 fused multiply-adds.
Actual usable performance is also way better since you need fewer instructions for the same thing, since they don't need to overwrite input registers and you have more registers around as well.
(then the real problem is that there's little properly vectorized code around..)
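The "about twice" figure follows directly from the per-cycle counts given in the post, if each fused multiply-add is counted as two floating-point operations (a common convention, assumed here). A quick check, with variable names chosen for illustration:

```python
# Per-cycle single-precision operation counts as stated in the post.
sse_flops_per_cycle = 2 + 2        # 2 adds + 2 multiplies
altivec_flops_per_cycle = 4 * 2    # 4 fused multiply-adds, counted as 2 flops each

ratio = altivec_flops_per_cycle / sse_flops_per_cycle
print(f"AltiVec peak is {ratio:.0f}x SSE peak per cycle")
```

This compares per-cycle peaks only; clock speed differences between the chips are left out, as they are in the post.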
perhaps he was referring to the .app directories that contain all resources, non shared libs, executables and various resources necessary to an application, opposed to sputtering crap all over the c:/ filesystem... utter crap, M$ fanboy...
I wonder who is the instigator behind all the stupid things I do - Altan
Doing eyecandy for Longhorn
Good call. Updated.
Files aren't supposed to go all together; they're supposed to be divided by type: /bin, /etc, /lib, etc.!
- UNIX fanboy
(yes, that was a joke; actually, I'm looking forward to database-based file systems - but not proprietary ones)
"[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz
"Windows users might have it by 2007"
Longhorn will likely ship in 2006.
Moreover, the DCE goes far beyond what Quartz extreme does. QE is actually quite primitive - very few operations are GPU accelerated. That's why window resizing is still horribly slow.
The problem is Moore's law. There is always going to be some special purpose DSP-GPU-coprocessor thingy that is faster than the CPU, but if you wait long enough, the CPU is going to be fast enough, and the CPU is going to have a much bigger customer base for your software product than requiring people to go out and buy the DSP-GPU-coprocessor du jour.
All we need now is a Linux distro that will run in a Windows window and have its own processor (GPU).
Boy, you really have no idea what the heck you are talking about, do you? Of course the basic UNIX stuff is there, /bin, /sbin, /usr/local, all that stuff.
Those directories have very few files in them; you will also notice a lack of init.d startup scripts. Most of the system is contained in /System.
For example, rather than /etc/init.d, it has startup services in /System/Library/StartupItems. For example there is an apache folder, in that are the scripts necessary to start Apache along with a file which describes Apache's dependencies. Also, these startup items are multi lingual. You can boot into any language you want. All of this in one folder. That's f*cking elegance, yet it is only a very small example.
Check it out, you will see.
And, of course, we all know that Apple will make absolutely no improvements from now until 2006 when Longhorn ships.
Personally, I don't think Longhorn will ever ship because each time Apple releases the next version of OS X, Microsoft has to scrap all their Longhorn work and start over again to catch up.
You can tell a great deal about the character of a man by observing those who hate him.
There's some speculation that a fairly popular audio DSP card uses a Chromatic Research video chip. This card supposedly very accurately emulates some very prized vintage audio processing equipment like the 1176 and the LA2A compressors and limiters. http://www.chrismilne.com/uadforums/topic.asp?TOPIC_ID=579
I tried the UAD-1 before, and I found the biggest problem was the extra latency that it ADDED to my signal chain. Pulled it out and took it back to the store.
sleep
And the DCE will beat the shit out of Quartz Extreme. Yes, in late 2006 Longhorn's DCE will do much, much more than what Quartz extreme does *now*.
Apple has shown an outstanding ability to rock the OS world every year since OS X was released. And you never know how their next OS will amaze you until it's presented a very few months before it's released (usually at the WWDC). Who knows where OS X will be by 2007^H6 ? (That is, outside Lord Jobs's closest circle.)
Oh, and apparently the acronym DCE (Desktop Composition Engine) was changed to DWM (Desktop Windows Manager).
This is completely off topic. But, way way way back when, my buddy's 60s mustang fastback crapped out on him. The only thing we had to get it back to the barracks was his kawasaki 400. Well, he ties em together with some rope and gives me some "pointers" on driving the towed vehicle. Since I had just learned to drive (in a military jeep), experience was not my strong suit. At any rate, he manages to get us going and tows me through the fog of burnt clutch smell he created and things are going fine until we start going down a slight hill. I'm catching up to him fast and panic, I tap the brakes, the rope snaps, he brakes, I let go, I bump his bike, his tire squeals, he guns it and I coast to a stop at the bottom of the hill. After I got out he started giving me the business about how I suck as a driver yada yada yada when an MP (military police) jeep pulls up. The cop gets out and asks us why we are out here at 4 in the morning and Rocky feeds him some shit when the cop sees the broken rope. He asks, "you guys trying to tow this car with that motorcycle", we laugh, and reply in unison "of course not". He warns us about the associated dangers and leaves us to deal with the dead car. At that point, we decided to just push the car back. I got to bed around 6am that morning.
I'm not making this up....
You don't think that the world's largest software company will ever release a new version of their most profitable product?
"And, of course, we all know that Apple will make absolutely no improvements from now until 2006 when Longhorn ships."
Of course they will. But from what I've seen of Longhorn, it is a fundamental step forward in the way people interact with their computers. There have been very few fundamental UI changes to OS X since its release. Refinements, polish, new features - yes. OS X is a very nice OS, and it will be more so by the time that Longhorn is released. But in 2006, OS X will still be OS X. Longhorn is something totally different from XP. It is a different kind of operating system. That's what people don't get.
YHBT. YHL. HAND.
As for organizations beating slashdot to the punch on this one, that's true... but it's good to see this getting even more exposure. :)
GPGPU (General-Purpose computation on GPUs) was a hot topic at various conferences in 2003; a number of papers were published on the subject. At SIGGRAPH 2004 there will be a full-day course on GPGPU given by eight of the experts in the field (including myself).
Mark Harris of NVIDIA maintains a website dedicated to GPGPU topics, including discussion forums and news postings. Well worth a browse if you're interested in GPGPU topics.
I look forward to seeing some of you at SIGGRAPH! :)
--Cliff
Whether he or she was trolling is beside the point. I'm just putting out facts for the benefit of others.
No, that's not what I'm saying at all. I was only pointing out that the notion wasn't an insight discovered by geniuses at Apple that non-promethean sheep can merely copy. The original poster erroneously conflated Apple's GUI architecture with the topic at hand and made a point of noting that they "beat us to it". If you go here you'll find a Slashdot article from January 21, 2001 entitled, "Rasterman's New Toy: EVAS". Evas was (still is?) his canvas library that included a (hardware accelerated) OpenGL backend. I was building it from cvs at the time, clicking "Hardware" on the demo program, and watching the FPS go nuts. E17 may never see the light of day, but the first evas (IIRC, it was dropped and rewritten) was a functioning, solid proof of concept that received a lot of attention. Jaguar was released a little more than a year and a half later on August 24, 2002.
What do you expect, he's a phd, you can't expect him to be based in the real world...
You people remind me of some of my students who constantly make me wonder whether one indeed needs a philosophiae doctor degree to have any sense of humour whatsoever... Apparently the sophisticated art of satira mustn't be "based in the real world" (sic), must it? In any event, I find this "real world" antiintellectualism certainly amusing, especially when I get moderated down on Slashdot because of my doctorate or Mensa membership. Annoyingly infantile, yet amusing.
Sincerely,
Pan Tarhei Hosé, PhD.
"Homo sum et cogito ergo odi profanum vulgus et libido."
So a special purpose processor is better at doing some things than a general purpose processor. How new is that? That Cray X-1 upstairs is very good at doing vector operations, however it sucks at doing anything scalar (like compiling, perhaps?) This is why Cray gives you a general purpose workstation to go with the X-1 as a compilation workstation (among others).
You don't think that the world's largest software company will ever release a new version of their most profitable product?
You may want to reboot your ironic humor sensor.
You can tell a great deal about the character of a man by observing those who hate him.
http://www.cs.unc.edu/GP2
The participants include researchers and developers from computer architecture, software systems, high-performance and scientific computing in addition to computer graphics. The "Call for Posters" will be coming out soon (with deadline of June 01). Check the WWW site for more details.
I used to do something similar back in grad school with our laser printers. The CPU inside the laser printers (some version of a Motorola 68000) was more powerful than the machines that we had on our desks, and they had a fair amount of memory as well. So, for some computations, I'd write little PostScript programs, have the laser printer churn on it overnight (of course, no one could print in the meantime), and then print out the answer when it was done.
But....can you run linux on it?!
No, really, can you?
AB HOC POSSUM VIDERE DOMUM TUUM
General-Purpose Computation Using Graphics Hardware. Anyone interested in this topic should check that site out.
And thanks to other readers for the follow-up replies.. I'm looking at gpgpu.org with interest.
stop crying that no one likes you.
I was not crying, not at all. Quite to the contrary, in fact--I was laughing. That is what I usually do when I find something amusing. And that is why I have written that I found it amusing. I thought it was self-explanatory.
I wouldn't call it sarcasm but rather satira. (And, in fact, I have called it satira.) To be honest, I fail to understand how could I have achieved said satira without having been wrong... Do you really think that "Using GPUs For General-Purpose Computing--I'm glad that finally they started to use the Graphical Processing Unit" would have been moderated as Score:5, 100% Funny?
Sincerely,
Pan Tarhei Hosé, PhD.
"Homo sum et cogito ergo odi profanum vulgus et libido."
Rasterman actually did the same thing before. I think it was 6 years ago when Evas used OpenGL for its compositing of pixmaps to be used for Enlightenment 0.17. Unfortunately, E0.17 still isn't there yet.
It's a bit more plebeian computer nerdy than that. There's a new mod out for Grand Prix Legends. I've been gaming and catching up on the related forums.
That and working on some ideas for a bicycle towed popup camper, spurred by acquiring a pair of wheels really cheap at a garage sale, which is why that particular post caught my eye and engendered my particular response.
Towing a popup camper with a motorcycle actually makes a lot more sense than towing one with a minivan, and you can buy commercial products.
For bicycles I'm reduced to DIY. It poses some interesting engineering problems.
KFG
"To confine our attention to terrestrial matters would be to limit the human spirit." -Stephen Hawking