Star Bridge FPGA "HAL" More Than Just Hype
Gregus writes "Though mentioned or discussed in previous /. articles, many folks (myself included) presumed that the promises of Star Bridge Systems were hype or a hoax. Well the good folks at NASA Langley Research Center have been making significant progress with this thing. They have more info and videos on their site, beyond the press release and pictures posted here last year. So it's certainly not just hype (though $26M for the latest model is a bit beyond the $1,000 PC target)."
uhh, so which link is the story?
26M? hah! i save that much every year pirating software and audio off the net.. puh-leez!
That's directly from their site. I wish the /. summary would have mentioned parallel hypercomputers. And note that when you search Google for "parallel hypercomputers", you only get get the one hit from Star Bridge Systems (and soon you'll get a hit for this comment on /. ;-)). No wonder people thought this was a hoax.
--sex
Very popular slashdot journal for adul
Here we see the solution to the problem of too many comments about a /. story. Simply obfuscate the story so much that no one can figure out what it's about, or even find the link to the original. Hats off to Gregus and CowboyNeal for the idea.
I am TheRaven on Soylent News
I like that little "page loaded in x seconds" blurb in the corner.
I'm having waaay more fun then i should be refreshing the page and watching the load times get longer . . . and looooonnnnger . . . . and looooonnnnnnngggggggger.
Hey, it beats workin'.
Look out honey cause I'm usin' technology
Ain't got time to make no apologies
Well the good folks at NASA Langley Research Center have been making significant progress with this thing
I can see it now...
*techie smacks the machine
HAL: "I know I've made some very poor decisions recently, but I can give you my complete assurance that my work will be back to normal."
I may be wau off here but it seems to me if your going to market this sort of product for the consumer market the point to emphasize would be the potential to pump out out millions of dirt cheap little processors exactly the same in manufacturing. and then apply the relevant "program" and turn them into a control chip for a coffeemaker or alarm clock or whatever
Now, maybe someone will be able to make this go. But this company doesn't look like it. If you manage to get to their web site and look at the programming language "Viva" they have designed, it looks like you are drawing circuit diagrams. Imagine programming a complex algorithm with that.
There are already better approaches to programming FPGAs (here, here, here). Look for "reconfigurable computing" on Google and browse around.
For a start: chip designers everywhere use FPGA:s to prototype their designs. No magic; they are reasonably fast (but not as fast as custom designed chips), and way more expensive. Having a large array of them would indeed make it possible to run DES at a frightening speed -- but so would a mass of standard computers. The sticking point is that the collection of FPGA:s emulating a standard CPU would be way slower for any given budget for CPU:s than a custom chip (like the PII, PIII or AMD K7) -- and way more expensive.
:)
Think about it: both Intel and AMD (and everybody else) uses FPGA:s for prototyping their chips. If it was so much more efficient, why do they not release chips whith this technology already?
As for the reprogramming component part of this design: translating from low-level code to actual chip surface (which it still is very much about) is largely a manual even for very simple circuits, largely because the available chip-compiler technologies simply aren't up to the job.
Besides, have any of you thought about the context-switch penalty of a computer that will have to reprogram its' logic for every process
I looked at this site several years ago, and thought, "whoa, cool idea, FPGAs would make a really fast computer." Then for two years, nothing to show for this idea. And after I programmed some FPGAs, I realized (at least partly) why: they're too slow to program. It takes on the order of milliseconds to reprogram even a moderate-sized FPGA.
And even a very large FPGA would be pretty lousy at doing SIMD, vector ops, etc. Basically, they would suck at emulating a computer's instruction set, which is (fairly well) optimized for what software actually needs to do. I can't think of many algorithms used by software today that would work much better in an FPGA, except for symmetric crypto. And if you need to do that, get an ASIC crypto chip, 10s of dollars for umpity gigs/second throughput. SPICE might also run a bit faster on these (understatement), but those types already have decent FPGA interfaces.
Furthermore, the processor programming these FPGAs must have some serious power... if you have to do many things on an FPGA at once (which you do if there are only 11 of them), you basically have to place & route on the fly, which is pretty slow.
So, I don't think that these "hypercomuters" will ever be any better than a traditional supercomputer in terms of price/performance, except for specialized applications. And even then, it won't be any better than an application specific setup. And not many people need to go back and forth between specialized tasks. (Who am I to complain of price/performance, I'm a Mac user?)
That said, if they *can* put a hypercomputer on everyone's desk for $1,000.00, more power to them!
I hereby place the above post in the public domain.
Going from 4GLOPS in Feb'01 to 470GFLOPS in Aug'02 for ten FPGAs, that's 120 times faster in little over a year. Not bad.
Any thoughts on what this means for crypto cracking capability?
We started using FPGA's in our HPC designs where I work several years ago - the designs are faster, more reliable, and quicker to design. StarBridges graphical development environment is a lot like another product sold by Anapolis Micro called Corefire.
Corefire is a java based graphical (iconic)development environment for Xilinx FPGA's. It is like anything else though sometimes programming in VHDL will be a better choice, it depends on the complexity of the design and the desired end result. But all in all we probably saved at least 6 man-months of design time using Corefire.
Your original search: hypercopmuter returned zero results.
The alternate spelling: hypercomputer returned the results below.
Here's a Feb'1999 Wired Article that explains what Star Bridge considers a hypercomputer.
--naked
Very popular slashdot journal for adul
More technical information is found in MAPLD Paper D1 and other reports. NASA Huntsville, NSA, USAF (Eglin), University of South Carolina, George Washington University, George Mason University, San Diego Supercomputer Center, North Carolina A&T and others have StarBridge Hypercomputers they are exploring for diverse applications. The latest StarBridge HC contains Xilinx FPFAs with 6 million gates compared to the earlier HAL-Jr with only 82,000 gates. Costs are nowhere near $26 Million. NASA spent approx 50K for two StarBridge Systems.
FPGAs worked pretty well here because they could handle the fire hose data rate from front to back. Their final output was a small nuumber of processed bytes so that could then go to a normal computer for display and storage.
the problems the engnieers had was two fold. first in the early chips there were barely enough gates to do the job. and in the later ones form xylinx there were plenty of transistors but they were really hard to design properly. the systems got into race conditions were you had to use software to figure out the dynamic proerties fo the chip to see if two signals would arrive at the next gate in time to produce a stable response. you had to worry where on the chip two signals were coming from. it was ugly and either you accepted instability or failed prootypes or you put in extra gates to handle synchronization--which slowed the system down, and caused you to waste precious gates.
still my impression at the time was WOW. here is something that is going to work, its just a matter of getting better hardware compilers. Since then Los Alamos has written a C compiler that compiles C to hardware and takes into account all these details it used to take a team of highly experienced engineers/artists to solve.
Also someone leaked a project going on at National Instruments that really lit up my interest in this. I don't know what ever became of it, maybe nothing. but the idea was this. National instruments makes a product called "labview" which is a graphics based programming language whose architechute is based on "data flows" rather than procedural programming. in data flows, objects emitt and receive data asynchronously. when an object detects that all of its inputs are valid data it fires, does its computation (which might be procedural in itself, or it might be a hierarchy of data flow subroutines hidden inside the black box of the object) and emitts its results as they become valid. there are no "variables" per se just wires that distriuted emitted data flows to other waiting objects. the nice thing about this language is that its wonderful for instumentation and data collection, since you dont alwayd know when data will become available or in what order it will arrive from different sensors. Also there is no such thing as a syntax error, since its all graphical wiringing, no typiing, thus it is very safe for industrial control of dangerous instruments.
anyhow the idea was that each of these "objects" could be dynamically blown onto an FPGA. each would be a small enough computation that it would not have design complications like race conditions and all the objects would be self timed with asyncronous data flows.
THe current state of the art seems to be that no one is widely using the C-code or the Flow control languages. instead they are still using these hideous dynamical modelling, languages that dont meet the needs of programmers because they require to much knowledge of the hardware. I dont know why. maybe they are just too new.
However these things are not a panacea. For example, recently I went to the FPGA engineers here with a problem in molecular modeling of proteins. I wanted to see if they could put my fortran program onto an fpga chip. the could not, because 1) there was too much stored data required and 2) there was not enough room for the whole algorithm. So I thought well maybe they could put some of the slow steps on to the fpga chip. for example, given a list of 1000 atom coordinates, return all 1 million pair wise distances. This too proved incompatible for a different reason. When these fpga chips are connected to a computer system the bottleneck of getting data into and out of them is generally worse than that of a cpu (most commerical units are on PCMCIA slots or the PCI bus). thus the proposed calculation would be much faster on a ordinary microporcessor since most of the time is spent on reads and writes to memory.! there was however one way they could do it faster and that was to pipeline the calculations say 100 or 1000 fold deep. so that you ask for the answer for one array, and then go pick up the answer to the array you asked about 1000 arrays ago. this would have complicated my program too much to be useful.
these new FPGAs are thus exciting because they are getting so large and have so much onboard storage and fast internal busses that a lot of the problems I just mentioned may vanish.
My knowlege of this is about year out of date so I apologize if some of the things I said are not quite state of the art. But I suspect it reflects the commerially avialable world
Some drink at the fountain of knowledge. Others just gargle.
The way it works then is that a board is made with a normal CPU and an FPGA next to it. At program compile time a special compiler determines which algorithms would bog down the processor and develops a single-cycle hardware solution for the FPGA. That information then becomes part of the program binary so at load time the FPGA is so configured and when necessary it processes all that information leaving the CPU free. The FPGA can of course be reconfigured several times during the program, the point being to adapt as necessary. The time to reconfigure the FPGA is unimportant when running a long program doing scientific calculations and such.
It's a pretty nifty system. Some researches have working compilers and they have found 6-50x speedup with many operations. The program won't speed up that much of course, but it leaves the main CPU more free when running repetitive scientific and graphics programs.
You can find information in the IEEE archives or search google for 'adaptive computing'. It's a neat area with a lot of promise.
Stop the Slashdot Effect! Don't read the articles!
Precisely one of the reasons that I shriek in horror when I hear that some hardware was 'designed' by a clever software guy. What you describe "figure out the dynamic ... stable response" (a.k.a. timing analysis) is not done in debugging - it is part of the design from square one, and is part of proper hardware design practices.
The fact that FPGA's are "programmable" does not move their development into the domain of software engineers.
A whole spectrum of skills is required to do proper hardware design (being a good 'logician' is only one of them) and FPGA's are raw hardware not finished product like a motherboard. Timing and many other 'real-world' factors that must be considered bore the hell of many 'logicians', but are critical to a reliable design.
A frightening number of Rube Goldberg machines exist out there that were designed by people who know something of logic and nothing of hardware design. I've had to redesign several of these "works only under ideal conditions but it's brilliant" pieces of junk.
Before you dismiss me as a hardware snob, let me tell you that I have spent many years both sides of the street and have dedicated my most recent 10 years to learning the art of good software design (there was supposed to *cough* be a bigger future in it). Each requires a set of skills and abilities that do intersect, but many of which exist entirely outside of that intersection. The fact that "logic" is one of the intersecting skills does not make a good hardware designer good at software nor does it make a good software designer good at hardware.
Sigs are bad for your health.
I cannot remember the name of the project, but two years ago a Chinese group published a paper where they used a Xilinx fpga on a custom circuit plugged into a pc SDRAM slot. The idea was to limit the communication bottlenecks of other pc busses and also to present a simple way to communicate with the fpga. All in/output to/from the fpga was done with simple mmap() routines. Their test application was a DES code breaker that could run as fast as the the memory subsystem could take it. Exciting stuff. And it has to be said: I wish I had a Beowulf cluster with these.
I always get the shakes before a drop.
FPGAs are great for complex control logic in hardware. They can also be used for DSP functions. Using FPGAs for general purpose computing efficiently is difficult. You essentially start out with a 100:1 handicap against a commodity CPU (this comes from the amount of transitors per gate, using VHDL vs. custom design, and the way routing wires have to be universal). Programming such a beast becomes an exersize in finding huge amounts of parallelism that don't require memory accesses (FPGAs have limited RAM on board, but don't expect much, and you have to share accross 100s of functions). Supposedly there are C to hardware complilers out there, but I can't see Joe Software designer chugging out code that carefully checks to see how every line affects the clock rate (remember that: in software you have 10% of the code executed 90% of the time, in harware you have 100% of the code executed 100% of the time. The FPGA can only clock as fast as the slowest path). The economics are probably the worst problem. These sort of things are most likely to go into government or military instalations where the contract says hardware has to do (impossible thing), and be maintained for x years. The device gets made, the customer changes the requirements, spin repeat until it ships as an expensive system that a simple desktop could do by shipdate. If you build a beowulf to do something, with a little forsight you can turn around in two years and double the power. With an FPGA design, you may be able to buy an off-the-shelf board that has advanced (if the company is still in buisness, not guarenteed), but then you have to dig into the source and modify for the new chips. This gets expensive fast.