Why Don't Open Source Databases Use GPUs?
An anonymous reader writes "A recent paper from Georgia Tech (abstract, paper itself) describes a system that can run the complete TPC-H benchmark suite on an NVIDIA Titan card, at a 7x speedup over a commercial database running on a 32-core Amazon EC2 node, and a 68x speedup over a single-core Xeon. A previous story described an MIT project that achieved similar speedups. There has been a steady trickle of work on GPU-accelerated database systems for several years, but it doesn't seem like any code has made it into Open Source databases like MonetDB, MySQL, CouchDB, etc. Why not? Many queries that I write are simpler than TPC-H, so what's holding them back?"
...because I/O is the limiting factor of database performance, not compute power?
MapD is a GIS-centric database.
Databases in the real world are rarely CPU bound (and when I have seen them CPU bound, it was because something was going badly wrong). Generally they are data bound, and the GPU has several times lower bandwidth than the real CPUs, so effectively it will be even slower. The computation on the GPU may be 10x faster, but feeding the data in and out is 10x slower, so it does nothing for you except add a lot of extra coding complication to use it.
Benchmarks tend not to look like real-world queries. Often you can do something that helps a benchmark but does nothing in the real world.
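To put rough numbers on the "10x faster compute, 10x slower feed" argument, here is a back-of-the-envelope sketch. Every figure in it is an illustrative assumption, not a measurement, and `elapsed_seconds` is a hypothetical helper that simply adds transfer time to compute time with no overlap between the two.

```python
def elapsed_seconds(data_gb, feed_gb_per_s, compute_gb_per_s):
    """Time to move data_gb to the processor, then scan it (no overlap)."""
    transfer = data_gb / feed_gb_per_s
    compute = data_gb / compute_gb_per_s
    return transfer + compute

DATA_GB = 100.0  # assumed working set for one query

# Assumed figures: the CPU is fed quickly from main memory but scans slowly.
cpu_time = elapsed_seconds(DATA_GB, feed_gb_per_s=40.0, compute_gb_per_s=10.0)

# Assumed figures: the GPU scans 10x faster but is fed 10x slower over the bus.
gpu_time = elapsed_seconds(DATA_GB, feed_gb_per_s=4.0, compute_gb_per_s=100.0)

print(f"CPU: {cpu_time:.1f} s, GPU: {gpu_time:.1f} s")
```

Under these made-up numbers the GPU path comes out slower overall, because the transfer term dominates. Change the ratios and the conclusion can flip, which is the whole argument in one line of arithmetic.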
Bus-attached coprocessors (PCI/PCIe/VME) are only useful if you can fit the entire dataset in the coprocessor's memory. When you have to do large accesses outside of that RAM because the data does not fit, the coprocessor usually becomes much slower and all the advantages go away. That is why it works for supercomputing: in the cases where the GPU works well, the dataset being worked on is tiny.
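The "fits in coprocessor memory or it doesn't" cliff can be sketched the same way. This is a toy model with assumed numbers: `effective_gb_per_s` is a hypothetical helper, the on-card and bus rates are illustrative, and it assumes whatever fits is already resident on the card while everything else has to cross the bus during the query.

```python
def effective_gb_per_s(dataset_gb, device_mem_gb, on_device_gb_per_s, bus_gb_per_s):
    """Average scan rate when part of the dataset spills out of device memory."""
    resident = min(dataset_gb, device_mem_gb)  # portion served from on-card RAM
    spilled = dataset_gb - resident            # portion fetched over the bus
    seconds = resident / on_device_gb_per_s + spilled / bus_gb_per_s
    return dataset_gb / seconds

# Assumed hardware figures, for illustration only.
CARD = dict(device_mem_gb=6.0, on_device_gb_per_s=200.0, bus_gb_per_s=8.0)

fits = effective_gb_per_s(4.0, **CARD)     # whole dataset resident on the card
spills = effective_gb_per_s(60.0, **CARD)  # dataset is 10x the card's memory

print(f"fits: {fits:.0f} GB/s, spills: {spills:.1f} GB/s")
```

Once most of the data lives off the card, the average rate collapses to roughly the bus rate, no matter how fast the on-card memory is.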
The R&D effort in the SQL field is roughly zero, so it's not surprising people aren't keeping up with the latest developments in the hardware field.
Except for the part where errybody's keeping up with the latest developments. They're just actually looking at developments that matter. GPUs... Do not matter. If you want to know more, check the first post.
Processing power is inconsequential compared to I/O. RAM is pretty straightforward; newer, faster RAM comes out, larger amounts become cheaper, you buy it, you throw it into the mix.
The cool stuff is happening around SSDs (which are also pretty straightforward), solid state memory devices (think FusionIO-style cards; Violin devices; RAMSANs), and crazy arse storage solutions.