Star Bridge FPGA "HAL" More Than Just Hype

Posted by ryuzaki0 on Saturday February 15, 2003 @04:46AM from the smoke-clearing-mirror-disappearing dept.

Gregus writes "Though mentioned or discussed in previous /. articles, many folks (myself included) presumed that the promises of Star Bridge Systems were hype or a hoax. Well the good folks at NASA Langley Research Center have been making significant progress with this thing. They have more info and videos on their site, beyond the press release and pictures posted here last year. So it's certainly not just hype (though $26M for the latest model is a bit beyond the $1,000 PC target)."

uhh, by Anonymous Coward · 2003-02-15 04:50 · Score: 5, Funny

uhh, so which link is the story?
$26M ...just a drop in the bucket by Superfarstucker · 2003-02-15 04:55 · Score: 5, Funny

26M? hah! i save that much every year pirating software and audio off the net.. puh-leez!
What is Star Systems? by $$$$$exyGal · 2003-02-15 04:55 · Score: 5, Insightful

Star Bridge Systems is the leading developer of truly parallel Hypercomputers. Our patent-pending hardware, software and Viva programming language are reinventing computer programmability to create the fastest, most versatile and energy-efficient computing systems available for solving many problems that require high computational density.
That's directly from their site. I wish the /. summary would have mentioned parallel hypercomputers. And note that when you search Google for "parallel hypercomputers", you only get the one hit from Star Bridge Systems (and soon you'll get a hit for this comment on /. ;-)). No wonder people thought this was a hoax.
--sex

--
Very popular slashdot journal for adul
No magic -- sorry by Anonymous Coward · 2003-02-15 05:13 · Score: 5, Insightful

For a start: chip designers everywhere use FPGAs to prototype their designs. No magic; they are reasonably fast (but not as fast as custom designed chips), and way more expensive. Having a large array of them would indeed make it possible to run DES at a frightening speed -- but so would a mass of standard computers. The sticking point is that, for any given budget, a collection of FPGAs emulating a standard CPU would be way slower than a custom chip (like the PII, PIII or AMD K7) -- and way more expensive.

Think about it: both Intel and AMD (and everybody else) use FPGAs for prototyping their chips. If it was so much more efficient, why do they not release chips with this technology already?

As for the reprogramming component part of this design: translating from low-level code to actual chip surface (which it still is very much about) is largely a manual process even for very simple circuits, largely because the available chip-compiler technologies simply aren't up to the job.

Besides, have any of you thought about the context-switch penalty of a computer that will have to reprogram its logic for every process? :)
1. Re:No magic -- sorry by seanadams.com · 2003-02-15 05:52 · Score: 5, Insightful
  
  For a start: chip designers everywhere use FPGAs to prototype their designs.
  
  Xilinx/Altera would not be in business if this were the only thing people used FPGAs for. There are some things you can do in an FPGA exceptionally well, e.g. pumping lots of data very quickly, and doing repetitive things like encryption, compression, and DSP functions. Generally speaking, the simpler the algorithm and the more it can be parallelized, the better it will work in hardware as compared to a CPU (yes, even a 4GHz Pentium might be slower per $).
  
  As for the reprogramming component part of this design: translating from low-level code to actual chip surface (which it still is very much about) is largely a manual process even for very simple circuits, largely because the available chip-compiler technologies simply aren't up to the job.
  
  I think it's a language problem more than a limitation of the synthesis/fitting tools. VHDL and Verilog are horrific. They are designed for coding circuits, not algorithms.
  
  Besides, have any of you thought about the context-switch penalty of a computer that will have to reprogram its logic for every process
  
  With today's FPGAs this is a real problem. They're designed to be loaded once, when the system starts up. What we need is an FPGA that can store several "pages" of configurations, and switch between them rapidly. The config would need to be writeable over a very fast interface of course.
This is the future of High Performance Computing by Dolphinzilla · 2003-02-15 05:32 · Score: 5, Interesting

We started using FPGAs in our HPC designs where I work several years ago - the designs are faster, more reliable, and quicker to design. StarBridge's graphical development environment is a lot like another product sold by Annapolis Micro called Corefire.
Corefire is a Java-based graphical (iconic) development environment for Xilinx FPGAs. It is like anything else, though: sometimes programming in VHDL will be a better choice. It depends on the complexity of the design and the desired end result. But all in all we probably saved at least 6 man-months of design time using Corefire.
More information by olafo · 2003-02-15 05:38 · Score: 5, Informative

More technical information is found in MAPLD Paper D1 and other reports. NASA Huntsville, NSA, USAF (Eglin), University of South Carolina, George Washington University, George Mason University, San Diego Supercomputer Center, North Carolina A&T and others have StarBridge Hypercomputers they are exploring for diverse applications. The latest StarBridge HC contains Xilinx FPGAs with 6 million gates compared to the earlier HAL-Jr with only 82,000 gates. Costs are nowhere near $26 million. NASA spent approx. $50K for two StarBridge Systems.
FPGA experiences by goombah99 · 2003-02-15 05:48 · Score: 5, Informative

I've brushed up against reconfigurable computing engineers in various applications I've had over the years. The last one was for trying to process laser radar returns coming in at gigabits per minute so we could do real-time 3-D chemical spectroscopy of the atmosphere at long range. The problem with conventional hardware was that the buses were too slow and the data rate was too fast to cache and too much to archive on disk. You could not efficiently break the task across multiple CPUs, since just transferring the information from one memory system to the next would become the bottleneck, breaking the system.
FPGAs worked pretty well here because they could handle the fire-hose data rate from front to back. Their final output was a small number of processed bytes, which could then go to a normal computer for display and storage.
The problems the engineers had were twofold. First, in the early chips there were barely enough gates to do the job. Second, in the later ones from Xilinx there were plenty of transistors, but they were really hard to design properly. The systems got into race conditions where you had to use software to figure out the dynamic properties of the chip to see if two signals would arrive at the next gate in time to produce a stable response. You had to worry about where on the chip two signals were coming from. It was ugly, and either you accepted instability or failed prototypes, or you put in extra gates to handle synchronization -- which slowed the system down and caused you to waste precious gates.
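As a rough illustration of the kind of check that timing software performs, here is a toy sketch in Python. It is not taken from any Xilinx tool: the function names, the path names, and every delay value are hypothetical, and real timing analysis models far more than a sum of delays. The idea is only that a signal must arrive early enough before the next clock edge, and that a signal routed from far across the chip can miss that window.

```python
# Toy setup-time check. All names and numbers are made up for illustration;
# delays are integers in picoseconds to keep the arithmetic exact.

def arrival_time(path_delays_ps):
    """Total delay along one path: the sum of its gate and wire delays."""
    return sum(path_delays_ps)


def setup_slack(path_delays_ps, clock_period_ps, setup_time_ps):
    """Slack >= 0 means the signal is stable before it is sampled."""
    return clock_period_ps - setup_time_ps - arrival_time(path_delays_ps)


def failing_paths(paths, clock_period_ps, setup_time_ps):
    """Names of the paths whose signal arrives too late, in sorted order."""
    return sorted(
        name
        for name, delays in paths.items()
        if setup_slack(delays, clock_period_ps, setup_time_ps) < 0
    )


# Two signals feeding the same gate: one routed nearby, one from far away.
paths = {
    "near_signal": [1200, 800, 2000],
    "far_signal": [3500, 3500],
}
CLOCK_PERIOD_PS = 7000
SETUP_TIME_PS = 500

slacks = {
    name: setup_slack(delays, CLOCK_PERIOD_PS, SETUP_TIME_PS)
    for name, delays in paths.items()
}
late = failing_paths(paths, CLOCK_PERIOD_PS, SETUP_TIME_PS)
```

In this made-up case the far signal misses the window, which is the situation where you either move logic closer together or add synchronization stages and pay for them in gates and speed.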
Still, my impression at the time was WOW. Here is something that is going to work; it's just a matter of getting better hardware compilers. Since then Los Alamos has written a C compiler that compiles C to hardware and takes into account all these details that it used to take a team of highly experienced engineers/artists to solve.
Also someone leaked a project going on at National Instruments that really lit up my interest in this. I don't know what ever became of it, maybe nothing. But the idea was this. National Instruments makes a product called "LabVIEW", which is a graphics-based programming language whose architecture is based on "data flows" rather than procedural programming. In data flows, objects emit and receive data asynchronously. When an object detects that all of its inputs are valid data it fires, does its computation (which might be procedural in itself, or it might be a hierarchy of data flow subroutines hidden inside the black box of the object) and emits its results as they become valid. There are no "variables" per se, just wires that distribute emitted data flows to other waiting objects. The nice thing about this language is that it's wonderful for instrumentation and data collection, since you don't always know when data will become available or in what order it will arrive from different sensors. Also there is no such thing as a syntax error, since it's all graphical wiring, no typing; thus it is very safe for industrial control of dangerous instruments.
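The firing rule described above (an object fires only once all of its inputs hold valid data, then pushes its result down its wires) can be sketched in a few lines of Python. This is only a minimal illustration of the data-flow idea, not LabVIEW's actual model or API: the `Node` class, its methods, and the use of `None` to mean "no valid data yet" are all assumptions made for the example.

```python
class Node:
    """A data-flow object: fires once every input holds valid data."""

    def __init__(self, name, func, input_names):
        self.name = name
        self.func = func
        self.input_names = list(input_names)
        # None marks an input that has not received valid data yet.
        self.inputs = {n: None for n in self.input_names}
        self.wires = []      # (downstream node, input name) pairs
        self.result = None   # last value this node emitted

    def connect(self, downstream, input_name):
        """Wire this node's output to one input of a downstream node."""
        self.wires.append((downstream, input_name))

    def receive(self, input_name, value):
        """Accept data on one input; fire if all inputs are now valid."""
        self.inputs[input_name] = value
        if all(v is not None for v in self.inputs.values()):
            self._fire()

    def _fire(self):
        args = [self.inputs[n] for n in self.input_names]
        self.result = self.func(*args)
        # Consume the inputs so the node waits for fresh data next time.
        self.inputs = {n: None for n in self.input_names}
        for node, input_name in self.wires:
            node.receive(input_name, self.result)


# Two objects wired together: add -> double.
add = Node("add", lambda a, b: a + b, ["a", "b"])
double = Node("double", lambda x: 2 * x, ["x"])
add.connect(double, "x")

add.receive("b", 4)              # only one input valid: nothing fires
fired_early = add.result is not None
add.receive("a", 3)              # both inputs valid: add fires, then double
```

Note that the inputs arrive in the "wrong" order (b before a) and nothing breaks, which is the property that makes this style convenient when sensor data shows up at unpredictable times.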
Anyhow, the idea was that each of these "objects" could be dynamically blown onto an FPGA. Each would be a small enough computation that it would not have design complications like race conditions, and all the objects would be self-timed with asynchronous data flows.
The current state of the art seems to be that no one is widely using the C code or the flow control languages. Instead they are still using these hideous dynamical modelling languages that don't meet the needs of programmers because they require too much knowledge of the hardware. I don't know why. Maybe they are just too new.
However, these things are not a panacea. For example, recently I went to the FPGA engineers here with a problem in molecular modeling of proteins. I wanted to see if they could put my Fortran program onto an FPGA chip. They could not, because 1) there was too much stored data required and 2) there was not enough room for the whole algorithm. So I thought, well, maybe they could put some of the slow steps onto the FPGA chip. For example, given a list of 1000 atom coordinates, return all 1 million pairwise distances. This too proved incompatible, for a different reason. When these FPGA chips are connected to a computer system, the bottleneck of getting data into and out of them is generally worse than that of a CPU (most commercial units are on PCMCIA slots or the PCI bus). Thus the proposed calculation would be much faster on an ordinary microprocessor, since most of the time is spent on reads and writes to memory! There was however one way they could do it faster, and that was to pipeline the calculations, say 100 or 1000 fold deep, so that you ask for the answer for one array and then go pick up the answer to the array you asked about 1000 arrays ago. This would have complicated my program too much to be useful.
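To make the pipelining trade-off concrete, here is a small Python sketch of what the calling program would have to look like. It is a software stand-in under stated assumptions, not anything the FPGA engineers built: `PipelinedUnit`, `submit`, and the tiny depth of 2 are hypothetical, and the work is done eagerly here where real hardware would spread it across pipeline stages. For scale, 1000 atoms give a 1000 x 1000 table of 1,000,000 entries, of which 499,500 are distinct pairs.

```python
import math
from collections import deque


def pairwise_distances(coords):
    """Return the N x N table of distances between 3-D points."""
    return [[math.dist(p, q) for q in coords] for p in coords]


class PipelinedUnit:
    """Stand-in for a deeply pipelined device (hypothetical interface).

    Each submit() hands in one coordinate array and returns the answer
    for the array submitted `depth` calls earlier, or None while the
    pipeline is still filling.
    """

    def __init__(self, depth):
        self.depth = depth
        self.in_flight = deque()

    def submit(self, coords):
        # Computed eagerly for simplicity; hardware would stage this work.
        self.in_flight.append(pairwise_distances(coords))
        if len(self.in_flight) > self.depth:
            return self.in_flight.popleft()
        return None


unit = PipelinedUnit(depth=2)
arrays = [
    [(0, 0, 0), (3, 4, 0)],
    [(0, 0, 0), (0, 0, 1)],
    [(1, 1, 1), (1, 1, 1)],
]
results = [unit.submit(a) for a in arrays]
# results[0] and results[1] are None; results[2] answers arrays[0].
```

The caller has to keep track of which earlier request each answer belongs to, and cannot use a result in the step that asked for it. That bookkeeping is the complication described above.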
These new FPGAs are thus exciting because they are getting so large, and have so much onboard storage and such fast internal buses, that a lot of the problems I just mentioned may vanish.
My knowledge of this is about a year out of date, so I apologize if some of the things I said are not quite state of the art. But I suspect it reflects the commercially available world.

--
Some drink at the fountain of knowledge. Others just gargle.