Slashdot Mirror


Cheaper, More Powerful Alternative To FPGAs

holy_calamity writes "Technology Review takes a look at a competitor to FPGAs that is claimed to be significantly faster and cheaper. Startup Tabula recently picked up another $108m in funding and says its chips make it economical to ship products with reconfigurable hardware, enabling novel upgrade strategies that cover hardware as well as software."

2 of 108 comments

  1. Re:Me like by gmarsh · · Score: 5, Informative

    Here's the thing people don't seem to realize: FPGAs *are* cheap.

    Case in point: the Xilinx XC3S50A, $5.75 at Avnet. It comes in a hobby-solderable VQFP package and you can make it work on a 2-layer board. Add a SPI flash to boot from (or a nearby micro with ~50K of spare flash), an oscillator, and +3.3V/1.2V regulators for power, and you're still under 10 bucks parts cost, even in low quantity.

    This chip is bottom of the line, but it's full of awesome stuff: "DCM" clock multipliers that let you run FPGA designs at 250+ MHz by multiplying up slow external clocks, three 18x18 multipliers that run at almost the same speed, and three 2 Kbyte SRAM blocks that you can use as instruction/data memory for processors (e.g., a PicoBlaze, which can run at 100+ MHz).

    These are great little things to play with as a hobbyist. I've contemplated making an Arduino shield with a small, cheap FPGA for people to experiment with, but I could never figure out a good way to get data and signals in and out of the chip that shows off what FPGAs are really good at.

  2. It's basically the same as any other FPGA by Macman408 · · Score: 5, Informative

    ...but it has fast context switching built in. You can't control when the contexts switch; they always go in order (as they should, since they're all statically assigned and are different parts of a single problem rather than separate problems).

    For those who don't know how FPGAs work, here's a basic crash course: they have lots of logic blocks, each containing a look-up table (say a 4-LUT: 4 inputs, 1 output). The LUT is basically a "read-only" RAM with 4 address bits (so 16 addressable locations) and one data bit. The RAM can be rewritten (this is what happens when you program an FPGA), but that's fairly slow. Tabula changes it up a bit so that each addressable location holds 8 bits instead of 1. Since transistors are basically free on an FPGA (they're wire-dominated), this doesn't cost much, and it means the same piece of silicon can be time-shared for different purposes without the penalty of reprogramming the chip. Each cycle, the LUT picks a different one of the 8 bits (though the address, i.e. the inputs to the 4-LUT, may be changing at the same time).
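
    To make that concrete, here's a toy software model of the idea as I understand it, not Tabula's actual design. The names (table_from, LUT4, FoldedLUT4) and the choice of AND/OR/XOR as the stored functions are made up for illustration. Each context is one 16-bit truth table, and the folded LUT steps through them in fixed order, one per fast clock cycle.

```python
def table_from(fn):
    """Pack a 4-input boolean function into a 16-bit truth table.

    Bit i of the result is fn's output when the inputs spell out address i.
    """
    return sum(fn(i & 1, (i >> 1) & 1, (i >> 2) & 1, (i >> 3) & 1) << i
               for i in range(16))


class LUT4:
    """A plain 4-LUT: a 16 x 1-bit memory addressed by the four inputs."""

    def __init__(self, table):
        self.table = table & 0xFFFF

    def read(self, a, b, c, d):
        addr = a | (b << 1) | (c << 2) | (d << 3)
        return (self.table >> addr) & 1


class FoldedLUT4:
    """Toy time-multiplexed 4-LUT (hypothetical model): one table per context."""

    def __init__(self, tables):
        self.luts = [LUT4(t) for t in tables]
        self.context = 0

    def step(self, a, b, c, d):
        # Read the current context, then advance in fixed order;
        # the caller cannot choose which context comes next.
        out = self.luts[self.context].read(a, b, c, d)
        self.context = (self.context + 1) % len(self.luts)
        return out


AND4 = table_from(lambda a, b, c, d: a & b & c & d)
OR4 = table_from(lambda a, b, c, d: a | b | c | d)
XOR4 = table_from(lambda a, b, c, d: a ^ b ^ c ^ d)

# Eight contexts sharing one physical LUT; the inputs are held constant here.
lut = FoldedLUT4([AND4, OR4, XOR4, AND4, OR4, XOR4, AND4, OR4])
outputs = [lut.step(1, 1, 1, 0) for _ in range(8)]
```

    With the inputs held at (1, 1, 1, 0), the same "silicon" answers as an AND, then an OR, then an XOR on successive cycles, which is the time-sharing trick in miniature.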

    It's a fairly straightforward idea, though there's a fair amount of complexity added to the design tools.

    However, it's not free. You now have lots of high-speed logic, which is probably using tons of power; it's switching frequently, which uses tons more; and even when it's idle it's probably fairly leaky, which uses more still. Effectively, you have a 1.6 GHz chip that looks to you like it's running at only 200 MHz, but it can do ~8 times more processing per unit of silicon area. You might also think of it as similar to the Pentium 4 integer units: they ran at twice the clock speed of the rest of the chip, so it seemed like there were twice as many of them (a single IU could do an add in the first half of a core clock cycle and a subtract in the second, executing two instructions per cycle).
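
    The arithmetic behind both comparisons is the same ratio, sketched below. The function names are made up for illustration; the 1.6 GHz, 200 MHz, and "twice the clock speed" figures are the ones quoted above, and this ignores any real-world overhead.

```python
def clock_ratio(fast_hz, slow_hz):
    """How many fast-clock ticks fit into one slow (user-visible) cycle."""
    if fast_hz % slow_hz:
        raise ValueError("fast clock is not a whole multiple of slow clock")
    return fast_hz // slow_hz


def apparent_units(physical_units, ratio):
    """Units the design appears to have when each one is reused `ratio`
    times per user-visible cycle."""
    return physical_units * ratio


# Tabula-style folding: 1.6 GHz fabric seen as a 200 MHz design.
tabula_ratio = clock_ratio(1_600_000_000, 200_000_000)
luts_seen = apparent_units(1, tabula_ratio)   # one LUT looks like eight

# Pentium 4 analogy: integer unit at twice the core clock.
ius_seen = apparent_units(1, 2)               # one IU looks like two
```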

    So this chip is basically trading latency for computing power. The more operations you need to do, the slower it will run, because it'll take more of Tabula's "folds" (the time-multiplexed contexts described above) to implement your logic.
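
    A rough way to picture that trade-off, as a simplified model of my own and not Tabula's actual tool flow: assume a design needing more logical LUTs than the chip physically has gets spread over more folds, and the apparent clock drops in proportion. The 10,000-LUT chip size is an arbitrary number for the example; the 1.6 GHz clock and the 8-fold limit come from the figures above.

```python
PHYSICAL_HZ = 1_600_000_000   # fast fabric clock quoted above
MAX_FOLDS = 8                 # contexts per LUT, per the description above


def folds_needed(logical_luts, physical_luts):
    """Smallest number of folds that fits the design (ceiling division)."""
    folds = -(-logical_luts // physical_luts)
    if folds > MAX_FOLDS:
        raise ValueError("design does not fit even with every fold in use")
    return folds


def apparent_clock_hz(logical_luts, physical_luts):
    """Clock the design appears to run at once it is spread across folds."""
    return PHYSICAL_HZ // folds_needed(logical_luts, physical_luts)


CHIP_LUTS = 10_000  # hypothetical chip size, for illustration only

small = apparent_clock_hz(20_000, CHIP_LUTS)   # needs 2 folds
large = apparent_clock_hz(80_000, CHIP_LUTS)   # needs all 8 folds
```

    In this model the 20,000-LUT design appears to run at 800 MHz and the 80,000-LUT design at 200 MHz: more logic on the same silicon, at the price of a slower apparent clock.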