Researcher Shows How GPUs Make Terrific Network Monitors
alphadogg writes "A network researcher at the U.S. Department of Energy's Fermi National Accelerator Laboratory has found a potential new use for graphics processing units — capturing data about network traffic in real time. GPU-based network monitors could be uniquely qualified to keep pace with all the traffic flowing through networks running at 10Gbps or more, said Fermilab's Wenji Wu. Wenji presented his work as part of a poster series of new research at the SC 2013 supercomputing conference this week in Denver."
So in violation of /. convention, I went ahead and read TFA in hopes that there would actually be something more than "we solved yet another parallel computing problem with GPUs." Nope, nothing. Not even some useless eye candy of a graph showing two columns of before/after processing times.
And the article just *had* to be split into two pages because it would have killed them to include that tiny boilerplate footer on page one. What a fail...at least it wasn't a blatant slashvertisement!
NSA already does this, how else do you think they process all that data?
It's like saying that GPUs are "terrific" for Bitcoin mining, until you realize that they require one or more orders of magnitude more power for the same amount of processing than specialized hardware. And network monitoring is probably a common enough task that it's worthwhile to use hardware tailored to this particular job.
Get an FPGA development system and implement your hardware in the FPGA, then ask a chip manufacturer to turn it into an ASIC. Expect to pay bucketloads of money on the way, though. It's only feasible if either costs are not an issue or you expect the resulting device to be mass-produced (six or, better yet, seven-digit numbers manufactured per year).
In practice, most people who publish results of a new algorithm ported to GPU do not have a version well-optimized for CPU, or aren't that good at optimization in the first place. I've had several cases where I could make the CPU version faster than their GPU version, despite them having claimed a x200 speed-up with the GPU.
If you have a fairly normal algorithm in terms of data access and your speed-up is bigger than 4, you're probably doing it wrong.
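To put rough numbers on the baseline point, here is a minimal sketch in Python. The function name and all timings are made up for illustration (they are not measurements from TFA or from anyone's code); the only thing it shows is that the same GPU run time yields a wildly different "speed-up" depending on which CPU baseline you divide by.

```python
def claimed_speedup(baseline_ms: int, gpu_ms: int) -> float:
    """Speed-up as usually reported: baseline run time divided by GPU run time."""
    return baseline_ms / gpu_ms


# Hypothetical timings in milliseconds, chosen only to illustrate the arithmetic.
naive_cpu_ms = 20_000  # unoptimized CPU implementation (assumed)
tuned_cpu_ms = 400     # the same algorithm after CPU optimization (assumed)
gpu_ms = 100           # GPU implementation (assumed)

print(claimed_speedup(naive_cpu_ms, gpu_ms))  # 200.0
print(claimed_speedup(tuned_cpu_ms, gpu_ms))  # 4.0
```

Same GPU code, same hardware; the headline figure drops from x200 to x4 just by fixing the CPU side.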