Slashdot Mirror


Harvard/MIT Student Creates GPU Database, Hacker-Style

First time accepted submitter IamIanB writes "Harvard Middle Eastern Studies student Todd Mostak's first tangle with big data didn't go well; trying to process and map 40 million geolocated tweets from the Arab Spring uprising took days. So while taking a database course across town at MIT, he developed a massively parallel database that uses GeForce Titan GPUs to do the data processing. The system sees 70x performance increases over CPU-based systems, and can out crunch a 1000 node MapReduce cluster, in some cases. All for around $5,000 worth of hardware. Mostak plans to release the system under an open source license; you can play with a data set of 125 million tweets hosted at Harvard's WorldMap and see the millisecond response time." I seem to recall a dedicated database query processor that worked by having a few hundred really small processors that was integrated with INGRES in the '80s.

3 of 135 comments (clear)

  1. Re:I'm not a computer scientist, and... by Anonymous Coward · · Score: 4, Insightful

    Sprinters can run really fast. So, if speed is important in other sports, why aren't the other sports full of sprinters? Because being good at one thing doesn't mean you're well-suited to do everything. A sprinter who can't throw a ball is going to be terrible at a lot of sports.

  2. Re:I'm not a computer scientist, and... by anagama · · Score: 3, Insightful

    I think you totally missed his point -- tin whiskers on your circuit board? Blown caps?

    The fact that 9 women can have 9 babies in 9 months for an average rate of 1/mo, does not disprove the assertion 9 women cannot have __a__ (i.e. a single) baby in one month. You're talking about something totally different and being awfully smug about it to boot.

    --
    What changed under Obama? Nothing Good
  3. Re:Q: Whats better than a GPU database? by tmostak · · Score: 3, Insightful

    Try to heatmap or do hierarchical clustering on a billion rows in a few milliseconds with just the aid of indexes - not all applications need lots of cores and high memory bandwidth - but some do.