Cheaper, More Powerful Alternative To FPGAs
holy_calamity writes "Technology Review takes a look at a competitor to FPGAs claimed to be significantly faster and cheaper. Startup Tabula recently picked up another $108m in funding and says their chips make it economic to ship products with reconfigurable hardware, enabling novel upgrade strategies that include hardware as well as software."
I'd like a cheap alternative to FPGAs but the whole notion of hardware upgrades through programmable circuitry is pretty flawed.
I do not think the FPGA works the way they think it works. For instance, an ARM processor is going to be faster and less power hungry than an FPGA programmed as an ARM processor. It can't grow Bluetooth or GPS, etc. The FPGA is also on the same improvement cycle as any other part, so the newer phone will have the better FPGA. I am not saying having one in there is bad; it is nice for tweaks to the system, but it is not a magic bullet.
Mmmm.... 40+ years after going out of style as "Hopelessly Obsolete", Delay Lines return to the cutting edge.
I've got kicked out of school with an EE degree, gone into software business (yeah, I know), and never looked back.
Do they ship products, other than dev kits, with FPGA?
Fuck systemd. Fuck Redhat. Fuck Soylent, too. Wait, scratch the last one.
With proper implementation, you could build chips that essentially are functional programs with this, and swap between programs as required. Fans of Haskell would likely realize some interesting benefits.
In Xanadu did Kubla Khan
A stately pleasure dome decree
The real problem with FPGAs is the painfully byzantine tools you have to use to deal with them. The chips themselves are fine.
There is a lot of room for disruption in the programmable logic tools industry. If this company is smart, they will focus on workflow and toolchain innovations, rather than becoming too distracted by shiny silicon baubles. Shorten the edit-simulate-synthesize-test cycle and you will make a lot of people happy.
Then again, you should never argue with a man who buys his ink by the gallon, or his wafers by the acre.
No mention of how easy these things are to program. Timing constraints will be very tight, and what happens if clock skew carries signals across folds? Any success depends on how well the accompanying tools can implement the standard synthesis flow to support multiple levels.
To me, after reading the papers, it looks like they reinvented (or reimplemented) Transputer architecture, but in a single chip, and with a different API.
"and silicon [wafer] costs roughly $1 billion an acre."
Ok, let's put that in real units. How about a 100 mm² = 1 cm² chip die?
Hmm, that works out to only $24.
Why on earth did they pick the silly number of $1B an acre? Is it just a stupid PR scare number?
Anyone ever heard of a chip that uses acres as a die size?
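For anyone who wants to check the unit conversion, here is a quick sketch. The only input taken from the article is the "$1 billion an acre" figure; the function name is made up for illustration, and the result ignores yield, packaging and test costs.

```python
# Convert the quoted "$1 billion an acre" into dollars per square centimetre.
ACRE_IN_M2 = 4046.8564224   # one international acre in square metres
CM2_PER_M2 = 10_000         # square centimetres in one square metre


def cost_per_cm2(dollars_per_acre: float) -> float:
    """Return the silicon cost of one square centimetre (hypothetical helper)."""
    return dollars_per_acre / (ACRE_IN_M2 * CM2_PER_M2)


# A 100 mm^2 die is exactly 1 cm^2, so this is also the cost per die.
die_cost = cost_per_cm2(1e9)
print(f"${die_cost:.2f} per 1 cm^2 die")
```

So the $24 figure is the right ballpark, rounded down slightly.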
"Obsolescence is the curse of electronics" ... Uhm, no it's how companies make money...
Say you have an FPGA on your computing device.
You decide to watch a video stream,
ZAP... the correct decoder is now in your FPGA
you are decoding way faster than in software running on a generic CPU.
Done viewing that,... switch to encoding ...ZAP... again way faster...
Intel put custom logic in their new SoC to handle encoding/decoding,
but it takes up space and what is there is limited and not changing.
You can have just about any heavy CPU task offloaded to custom hardware (assuming it fits on the FPGA)
And when you're not using it, it shuts down saving power.
So why don't they build systems like this?
If you're designing an ASIC, one traditional method is to do your design, flash it to FPGA, test it, debug, repeat, and when you're done, send it out to the fab to get it burned into ASIC. So yes, it's hardware upgrades through programmable circuitry, and you might be doing multiple upgrades per day.
If you're doing small production runs of chips, for instance for custom hardware, you may want something that's fast but you're not going to make 10,000 of them so you don't want to pay the price of burning ASICs. (ASIC prices have gotten a lot cheaper in the last decade, and production cycles have gotten faster, but it still takes time and money.) So don't - just do the chip in FPGA. And just like providing firmware in EPROM, the fact that the chip's reprogrammable doesn't mean you'll necessarily be doing that in the field.
These guys are basically doing a smaller cheaper FPGA design, as far as I can tell from the article and the comments. Those sound like good things.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Teig estimates that the footprint of a Tabula chip is less than a third of an equivalent FPGA, making it five times cheaper to make, while providing more than double the density of logic and roughly four times the performance.
That is 6X more impressive than any other use of factors in a sentence... ever.
the guy behind Tabula is behind a number of "failwins" in the electronics industry - a fail in that the technology ended up being pointless and rejected by the market, but wins in that his companies were all bought out by suckers for quite a bit of $$$$
two examples:
- X initiative (use 45 degree routing on chips) - look at http://www.xinitiative.org now - 100% dead. look at it, and all the wonderful claims he (and his sucker followers) made in archive.org.
- Simplex solutions - built a large number of poor quality EDA tools (poor because they never got adopted and so never got the real bugs worked out and features required for real work) but looked very shiny, so were sold to cadence for a fairly large sum of money (relative to the low dev. cost). All but one of the simplex tools (now called cadence QRC) has been EOLd by cadence, and QRC will be thrown out just as soon as anyone cares enough to replace it with something better.
You can bet Tabula, if it succeeds at all, will be another failwin. It will be bought by one of Xilinx or Altera (the current FPGA duopoly), a couple of minor good ideas will be incorporated into future products and the overwhelming majority of the Tabula technology will be promptly forgotten. ...why? I hear you ask?
The reason is simple: Steve Teig has realized that "spamming" technology really does work (for him) - he has figured out that he can leave it up to much larger corporations to figure out, in their own sweet time, why 99% of his ideas sound great but are actually pointless, in the months and years after they are fooled into acquiring his techno-spam through an acquisition.
From one of his many online bios:
He holds over 220 patents. In 2002, he broke Thomas Edison’s record for the number of patents filed by an individual in a single year.
Enough said.
Link to a better writeup, one that doesn't attempt strained architectural analogies (ignore the first paragraph or three, but do look at the comments).
The disruption you mention almost happened in the early 90's. NeoCAD produced a complete competing tool chain for Xilinx FPGAs, including the place and route, for the then state-of-the-art 4000 series. Their software was better than Xilinx's, including things like a graphical layout editor. Xilinx was having none of it and bought NeoCAD. Quite a few NeoCAD features made it into the Xilinx software, eventually. Soon after that Xilinx started publishing less information on their FPGAs' interconnect networks, and there has never been another attempt at writing such software.
Personally, I think writing a clone of the Xilinx software, today, is the wrong thing to do. It would be less effort to design and manufacture an "open source" FPGA, and write the necessary software from scratch, than to reverse engineer Xilinx's place and route.
I remember Starbridge and their audacious claims. This company sounds like it's trying to accomplish something similar, but isn't being as audacious with its claims.
I do look forward to software reconfigurable hardware, but that does mean it brings a whole new meaning to the word "bricking."
// file: mice.h
#include "frickin_lasers.h"
I have an EMU Proteus 2000 midi synthesizer that happens to have an FPGA in it. (I can't remember off the top of my head if it was a Xilinx Spartan or an Altera Cyclone, but I think it was one of those.)
...but it has fast context switching built-in. And you can't control when the contexts switch, they always go in order (as they should, since they're all statically assigned, and are different parts of a single problem, rather than separate problems).
For those that don't know how FPGAs work, here's a basic crash course: they have lots of blocks, each one has a look-up table (say a 4-LUT; 4 inputs, 1 output). The LUT is basically a "read-only" RAM with 4 address bits (so 16 addressable locations), and one data bit. The RAM can be rewritten (this is what is done when they program an FPGA), but it's fairly slow. Tabula changes it up a bit so that each addressable location is 8 bits instead of 1 bit. Since transistors are basically free on an FPGA (they're wire-dominated), this doesn't cost much, and it means that they can time-share pieces of silicon for different purposes without the penalty of reprogramming the chip. Then, each cycle, it'll pick a different one of the 8 bits (though the address, or inputs to the 4-LUT, may be changing at the same time).
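To make the crash course concrete, here is a toy software model of the two kinds of LUT described above. It is only a sketch of the idea as explained in this comment, not Tabula's actual circuit; the class names `Lut4` and `FoldedLut4` and the example truth tables are made up for illustration.

```python
from typing import Sequence


class Lut4:
    """Plain 4-LUT: 16 one-bit entries, addressed by the four inputs."""

    def __init__(self, truth_table: Sequence[int]):
        assert len(truth_table) == 16
        self.table = list(truth_table)

    def eval(self, a: int, b: int, c: int, d: int) -> int:
        address = (a << 3) | (b << 2) | (c << 1) | d  # 4 address bits
        return self.table[address]


class FoldedLut4:
    """Time-shared 4-LUT: 8 bits per address, one bit used per fast cycle."""

    def __init__(self, tables: Sequence[Sequence[int]]):
        assert len(tables) == 8
        self.folds = [Lut4(t) for t in tables]
        self.current = 0  # which of the 8 bits is selected this cycle

    def tick(self, a: int, b: int, c: int, d: int) -> int:
        out = self.folds[self.current].eval(a, b, c, d)
        self.current = (self.current + 1) % 8  # contexts always go in order
        return out


# Example truth tables: 4-input AND and 4-input XOR (parity).
AND4 = [1 if i == 15 else 0 for i in range(16)]
XOR4 = [bin(i).count("1") % 2 for i in range(16)]

# One physical LUT alternating between the two functions on successive cycles.
shared = FoldedLut4([AND4, XOR4] * 4)
first = shared.tick(1, 1, 1, 1)   # fold 0 acts as AND
second = shared.tick(1, 1, 1, 1)  # fold 1 acts as XOR
print(first, second)
```

The same piece of "silicon" behaves as an AND gate on one fast cycle and a parity gate on the next, with no reprogramming step in between.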
It's a fairly straightforward idea, though there's a fair amount of complexity added to the design tools.
However, it's not free. You now have lots of high-speed logic, which is probably using tons of power, and it's switching frequently, which is using tons more power, and even when it's not, it's probably fairly leaky, using even more power. Effectively, you have a 1.6 GHz chip, but to you it seems like it's only running at 200 MHz - but it can do ~8 times more processing per silicon area. You might also think of it as being similar to the Pentium 4 integer units; they ran at twice the clock speed of the rest of the chip, so it seemed like there were twice as many of them (so a single IU could do an add in the first half of a core clock cycle, and a subtract in the second, computing two instructions per cycle).
So this chip is basically trading latency for computing power. The more operations you need to do, the slower it will run, because it'll take more of their folds to implement your logic.
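The latency trade can be put into numbers with a back-of-the-envelope sketch. It assumes the simplified model from this comment (a 1.6 GHz fabric with up to 8 folds, and a user-visible clock equal to the fabric clock divided by the folds in use); the function names and the LUT counts in the example are hypothetical.

```python
import math

FABRIC_CLOCK_HZ = 1.6e9  # fast internal clock quoted in the comment
MAX_FOLDS = 8            # time slices per user-visible cycle


def folds_needed(lut_operations: int, physical_luts: int) -> int:
    """Folds required to fit a design onto a given number of physical LUTs."""
    folds = math.ceil(lut_operations / physical_luts)
    if folds > MAX_FOLDS:
        raise ValueError("design does not fit even with all folds in use")
    return folds


def user_clock_hz(folds: int) -> float:
    """User-visible clock under the simplified 'fabric clock / folds' model."""
    if not 1 <= folds <= MAX_FOLDS:
        raise ValueError("folds must be between 1 and MAX_FOLDS")
    return FABRIC_CLOCK_HZ / folds


# A design with 1000 LUT operations on 250 physical LUTs needs 4 folds.
folds = folds_needed(1000, 250)
print(folds, user_clock_hz(folds) / 1e6, "MHz")
print(user_clock_hz(MAX_FOLDS) / 1e6, "MHz with all 8 folds in use")
```

With all 8 folds in use the model gives the 200 MHz mentioned above; a smaller design that needs fewer folds would see a proportionally faster apparent clock.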
ASICs actually got more expensive. The individual ASIC is cheaper now, but the non-recurring costs of making an ASIC went up a lot. Smaller process nodes need more masks and more complicated masks.
If your mask set is $2,000,000 and you are going to sell 10,000 ASICs made with it, even if the individual ASIC is free after paying for the masks, you are still at $200 per piece. The $100 FPGA is a better option then, and at 10,000 pcs you are going to get a pretty large FPGA for $100.
Jan
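The mask-set arithmetic above generalizes to a simple break-even calculation. This is a sketch using only the figures from the comment ($2,000,000 mask set, 10,000 units, $100 FPGA, and the "free ASIC" simplification); the function names are invented for illustration.

```python
def asic_unit_cost(nre_dollars: float, units: int, per_chip_dollars: float = 0.0) -> float:
    """Per-unit ASIC cost once the one-time mask cost is spread over the run."""
    return nre_dollars / units + per_chip_dollars


def break_even_units(nre_dollars: float, asic_per_chip: float, fpga_per_chip: float) -> float:
    """Volume at which the total ASIC cost equals the total FPGA cost."""
    if fpga_per_chip <= asic_per_chip:
        raise ValueError("the ASIC never catches up if the FPGA is not pricier per chip")
    return nre_dollars / (fpga_per_chip - asic_per_chip)


MASK_SET = 2_000_000  # dollars, from the comment
per_piece = asic_unit_cost(MASK_SET, 10_000)          # ASIC itself assumed free
crossover = break_even_units(MASK_SET, 0.0, 100.0)    # versus a $100 FPGA
print(per_piece, crossover)
```

Under these assumptions the ASIC only wins once the run exceeds about 20,000 units, and later than that if the chips themselves are not free.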
what kind of process they're using, I imagine it will be a 40nm process or some similar feature size. What if we all just concentrated on making cheap short run fabrication machines, maybe something that could make a 150nm feature-size on pre-sliced wafers. That way I could quickly print something up in-house. Maybe my design could have some re-programmability, but I can't see that being the biggest use for FPGAs. Even if post-shipping re-programmability is feasible, I doubt many FPGA designs actually use it in their lifetimes.
Nullius in verba
Here's the thing people don't seem to realize: FPGAs *are* cheap.
They are. I do embedded design for a living. And I'm yet to see a design cross my desk that doesn't have a Xilinx or Altera chip on it. You see them typically used as glue logic, like buffers in between the cpu and pcmcia slot. Or clock generation. Discrete components for that would be *far* more expensive.
FPGAs are already a bargain. Sure, if something cheaper and faster comes along that'll be great. But not really necessary.
Weaselmancer
ridiculous.
http://papilio.cc/ Home of the Papilio FPGA board, which has a similar intent to the Arduino. It currently supports a stack CPU and an AVR emulating CPU. The AVR CPU supports the Arduino tool chain. Here is another site for projects with this board. http://gadgetforge.gadgetfactory.net/gf/. You can get it for $US 50 or 75, depending on the FPGA size.
The Gameduino http://excamera.com/sphinx/gameduino/ is an Arduino shield with an FPGA that supports sprite graphics for old school game play. The FPGA code includes a Forth engine that runs at 50 MHz. Programming is done on both the Arduino and the FPGA board.
Why is Snark Required?
"The power consumption if these devices is relatively high, and likely too much for a device like a phone" Dead giveaway that this is a marketing story, not a real proven technological renovation.
I hate being bipolar; it's awesome!
That's some impressive funding, 108 millidollars - that's a whopping 10.8 cents!
Isn't that what SSDs are for?
No sig today...
Now back to the original program. Some year fer fools day, /. needs to randomize Preview/Submit.
That's not a bad overview, but you need to apply some intelligence to the power consumption figures.
A|G || B|H
-+- || -+-
C|E || D|F
My first instinct is to set up the eight bit shift register as a pair of four element squares; one clocking on the rising edge, the other on the falling edge. A mux at the bottom selects from the left/right square on alternate cycles.
Your clock is 800MHz instead of 1.6GHz. The time vias probably need at least a full clock cycle (no path from one clock edge to the next edge in the opposite direction).
The four-element circular shift register (you have these by the gallon) is highly optimized. It probably doesn't cost you much to cycle patterns 0000 or 1111. Patterns 0111(x8) and 0011(x4) have two edge changes per cycle on the 800MHz clock. Pattern 0101(x2) has four edge changes per clock. The software might optimize layout and placement for fewer of these patterns.
That will beat an 8-way mux, I think, in total logic transitions.
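The pattern counts quoted above (x8, x4, x2) can be checked by brute force. This sketch enumerates every 4-bit pattern held in a circular shift register and counts the bit changes seen in one full rotation; it only models the transition count, not power, and the helper name is made up.

```python
from collections import Counter


def circular_transitions(bits: str) -> int:
    """Count bit changes between neighbours, wrapping from last bit to first."""
    n = len(bits)
    return sum(bits[i] != bits[(i + 1) % n] for i in range(n))


patterns = [format(p, "04b") for p in range(16)]
by_transitions = Counter(circular_transitions(p) for p in patterns)

# Patterns with exactly one odd bit out (rotations of 0111 and of 0001).
single_odd_bit = [p for p in patterns if p.count("1") in (1, 3)]
# Two adjacent ones (rotations of 0011).
adjacent_pair = [p for p in patterns
                 if p.count("1") == 2 and circular_transitions(p) == 2]
# Alternating bits (0101 and 1010).
alternating = [p for p in patterns if circular_transitions(p) == 4]

print(dict(by_transitions))
print(len(single_odd_bit), len(adjacent_pair), len(alternating))
```

The 16 patterns split into 2 with no transitions, 12 with two transitions (8 plus 4) and 2 with four, which matches the breakdown above.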
The design gives you more logic operations per unit of silicon; leakage current per unit of performance should only improve.
On average, you're driving signals less distance, so maybe average capacitive load is partially ameliorated. For some applications, the computational density will permit meeting performance goals with more efficient logic trees.
On the other side of the coin, you're totally at the Merced of the synthesis tools. I think this Meta-trans concept is extremely cool. The main problem is that no one has yet invented a de-stealthing tool that actually works.
Everyone run for cover, we're being attacked by Ro'ula's. If I were Xilinx, that would be the code name for my competitive response.
The only FPGA I've used in my own design was a Spartan DSP. Heinlein's magic box isn't going to do you much good implementing 18x18 Wallace trees or adding conventional compute cores.
It's optimized for a very high LUT/pin ratio, in a small, hot package, discounting macro blocks.
I was more enthusiastic about mixed signal ASIC technology from Triad, but on my initial inquiry they haven't lowered the cost of full-custom analog ASICs at the low end. What they seem to offer is a fairly expensive, but far less risky proposition (if theory translates to practice) for medium complexity design.
I would have needed to Scotty our projected volumes to get a second response.
From an SSD it still has to be loaded up and put into active memory; with a programmable CPU it could already be there and ready to go the second power-on is achieved. Plus there is the possibility of a performance gain with a parallel-processor OS.
Chaos - everything, everywhere, everywhen