End of The Von Neumann Computing Age?
olafo writes "Three recent Forbes articles:
Chipping Away, Flexible Flyers and Super-Cheap Supercomputers cite attractive alternatives to traditional Von Neumann computers and microprocessors. One even mentions we're approaching the end of the Von Neumann age and the beginning of a new Reconfigurable computing age. Are we ready?"
This should help
Von Neumann means a processor hooked up to a single memory that contains both the program and the data, executing instructions one at a time in a sequence.
Compare this to the Harvard architecture used on some embedded processors: a processor hooked up to two separate memories, one containing the program, and the other containing the data. This is useful when you have your program in an EEPROM and your data in a little static RAM. Two types of memories naturally fit into a Harvard architecture, though it's simple enough to do the same thing with some memory mapping circuits.
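For anyone who wants to poke at the difference, here's a rough Python sketch. The tuple-based instruction format and the function names (`step`, `run_von_neumann`, `run_harvard`) are made up for illustration; they are not any real instruction set.

```python
# Toy machines. The instruction format is hypothetical:
# ("load", addr), ("add", addr), ("store", addr), ("halt",)

def step(instr, acc, data):
    """Execute one instruction against a data memory; return (acc, halted)."""
    op = instr[0]
    if op == "load":
        return data[instr[1]], False
    if op == "add":
        return acc + data[instr[1]], False
    if op == "store":
        data[instr[1]] = acc
        return acc, False
    if op == "halt":
        return acc, True
    raise ValueError(f"unknown opcode: {op!r}")

def run_von_neumann(memory):
    """One memory holds both program and data; every address indexes the same list."""
    pc, acc = 0, 0
    while True:
        instr = memory[pc]
        pc += 1
        acc, halted = step(instr, acc, memory)
        if halted:
            return acc

def run_harvard(program, data):
    """Program and data live in separate memories with separate address spaces."""
    pc, acc = 0, 0
    while True:
        instr = program[pc]
        pc += 1
        acc, halted = step(instr, acc, data)
        if halted:
            return acc

# Von Neumann: cells 0-3 are code, cells 4-6 are data, all in one list.
vn_memory = [("load", 4), ("add", 5), ("store", 6), ("halt",), 2, 3, 0]
vn_result = run_von_neumann(vn_memory)

# Harvard: the same program, but data addresses start again at 0.
hv_program = [("load", 0), ("add", 1), ("store", 2), ("halt",)]
hv_data = [2, 3, 0]
hv_result = run_harvard(hv_program, hv_data)
```

Note that in the single-memory version a `store` to addresses 0-3 would overwrite the program itself, while in the two-memory version no data address can reach the program at all.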
A terrible Karma Whore opportunity, but from FOLDOC..
John von Neumann
/jon von noy'mahn/ Born 1903-12-28, died 1957-02-08.
A Hungarian-born mathematician who did pioneering work in
quantum physics and computer science.
While serving on the BRL Scientific Advisory Committee, von
Neumann joined the developers of {ENIAC} and made some
critical contributions. In 1947, while working on the design
for the successor machine, {EDVAC}, von Neumann realized that
ENIAC's lack of a centralized control unit could be overcome
to obtain a rudimentary stored program computer. He also
proposed the {fetch-execute cycle}.
{(http://www.sis.pitt.edu/~mbsclass/is2000/hall_of_fame/vonneuma.htm)}.
{(http://ei.cs.vt.edu/~history/VonNeumann.html)}.
{(http://ftp.arl.mil/~mike/comphist/54nord/)}.
--
Basically a von Neumann machine takes instructions in serial and processes them one by one, altering the course of its instruction flow based upon the instructions preceding it (i.e. normally it carries on to the next instruction, except for jumps and things like that). Nearly all current computers (all? Can anyone suggest any others in frequent use?) are Von Neumann architectures.
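That "carry on to the next instruction unless it's a jump" behaviour fits in a few lines. This is only a sketch: the three opcodes (`addi`, `jump_if_lt`, `halt`) are invented for the example and don't correspond to any real CPU.

```python
# Hypothetical mini instruction set, for illustration only.
def run(program, max_steps=1000):
    pc = 0        # program counter
    acc = 0       # single accumulator register
    trace = []    # addresses of the executed instructions, in order
    for _ in range(max_steps):
        op, arg = program[pc]        # fetch
        trace.append(pc)
        pc += 1                      # default: carry on to the next instruction
        if op == "addi":             # execute: add an immediate value
            acc += arg
        elif op == "jump_if_lt":     # jump to `target` while acc < limit
            limit, target = arg
            if acc < limit:
                pc = target
        elif op == "halt":
            return acc, trace
        else:
            raise ValueError(f"unknown opcode: {op!r}")
    raise RuntimeError("step limit reached")

program = [
    ("addi", 2),               # 0: acc += 2
    ("jump_if_lt", (6, 0)),    # 1: loop back to 0 while acc < 6
    ("halt", None),            # 2: stop
]
acc, trace = run(program)
```

The trace shows the serial flow: addresses 0 and 1 repeat until the jump condition fails, then execution falls through to the halt at address 2.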
For those of you skeptics (like myself when I first saw the articles) and for those that didn't RTFA:
Allan Snavely, a computer scientist at the University of California at San Diego Supercomputer Center, has been using a Star Bridge machine for about a year. He says he originally contacted Star Bridge because he suspected the company was pulling a hoax. "I thought I might expose some fraud," he says.
But after meeting with Gilson and seeing a machine run, he changed his mind. "They're not hoaxers," he says. "As I came to understand the technical side I thought it had a lot of potential. After talking to Kent Gilson I found he was very technically savvy."
Silicon Graphics has also asked Star Bridge to send along a copy of its hardware and software. The $1.3 billion (fiscal-year 2002 sales) supercomputer maker wants to explore ways to make a Star Bridge system work with a Silicon Graphics machine.
Over the past two years Star Bridge has sold about a dozen prototype machines based on an earlier design to the Air Force, the National Security Agency and the National Aeronautics and Space Administration, among others. It has also sold seven of the new models.
Olaf Storaasli, a senior research scientist at NASA's Langley Research Center in Hampton, Va., has been using Star Bridge machines for two years and says they are very fast but not yet ready to handle production work at NASA. "It's really a far-out research machine," he says. "It's more about what's coming in the future. I would not consider it a production machine."
One problem, Storaasli says, is that you can't take programs that run on NASA's Cray supercomputers and make them run on a Star Bridge machine. Still, he says, "This is a real breakthrough."
Von Neumann was smart enough that there is more than one thing named after him. A Von Neumann machine is a self-replicator. A Von Neumann architecture is a computer architecture where programs and data are stored in the same manner.
Sometimes the latter is also referred to as a Von Neumann machine.
Tom Swiss | the infamous tms | my blog
You cannot wash away blood with blood
Also see this thread.
I could be wrong, I don't speak freaky-deaky dutch.
-- (Score:i, Imaginary)
Even hyperthreading is only a minor improvement in parallelism, exchanging one instruction pointer for a small number (2? 4?). Hardly a different architecture.
Some implementations add a step between 1 and 2 that says "increment the program counter" and leave jumps up to specific instructions. Others associate program counter changes with every instruction (i.e. jumps go to somewhere specific, every other instruction also implies PC++.)
There's nothing more to Von Neumann machines. They are unrelated to finite state machines or Turing machines, except that every Von Neumann machine can be modelled as a Turing machine. The difference is that a Turing machine is a mathematical abstraction, whereas Von Neumann machines are an architecture for implementing them.
Whoo hoo. And yes, I am a computer scientist. Or maybe a cogigrex.
IP is just rude.
Is there any torture so subl
Okay, no. FPGAs are NOT going to completely change computing.
First, you have to understand what they are: basically an FPGA is an SRAM core arranged in a grid, with a layer of logic cells (Configurable Logic Blocks, in Xilinx's parlance) layered on top. These logic cells consist of basically function generators that use the data in the underlying SRAM to configure their outputs. Typically they are used as look-up tables (LUTs) -- basically truth tables that can represent arbitrary logic functions -- or as shift registers, or as memories. On top of THAT layer is an interconnection layer used for connecting CLBs in useful ways. The FPGA is re-configured by loading the underlying SRAM with a fresh bitmap image, and rebuilding connections in the routing fabric layer.
You write for FPGAs the same way you build ASICs. You use the same languages (Verilog, VHDL) and sometimes the same toolchain. The point being: this is HARD. Trust me, I've been doing it. Verilog is damn cool, but remember that you're still building this stuff almost gate-by-gate.
There are a number of tools out there that do things like translate 3GL languages (such as Xilinx's Forge tool for Java, or Celoxica's DK1 suite for Handel-C) to an HDL like Verilog. Other tools like BYU's JHDL are essentially scripting frameworks for generating parameterized designs that can be dumped directly into netlist (roughly equivalent to a .o file.)
My job for the past several months has been to obtain and evaluate these tools. I can tell you that these tools are not there yet.
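To make the look-up table idea from the FPGA description above concrete, here's a small Python model. It's a sketch only: `make_lut` and `lut_eval` are names I made up, and a real CLB has more going on than this (carry chains, flip-flops, routing).

```python
from itertools import product

def make_lut(func, n_inputs):
    """Build the 'SRAM contents' of an n-input LUT: one output bit per input pattern."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=n_inputs)]

def lut_eval(table, *bits):
    """Evaluate the LUT: the input bits simply form an address into the table."""
    addr = 0
    for b in bits:
        addr = (addr << 1) | b   # first input is the most significant address bit
    return table[addr]

# "Configure" one 3-input LUT as a majority gate and another as 3-input XOR.
majority = make_lut(lambda a, b, c: (a & b) | (a & c) | (b & c), 3)
parity = make_lut(lambda a, b, c: a ^ b ^ c, 3)

# Together these two tables behave as a 1-bit full adder (carry and sum).
carry = lut_eval(majority, 1, 1, 0)
total = lut_eval(parity, 1, 1, 0)
```

"Reprogramming" here is nothing more than loading a different list into the table, which is roughly what loading a fresh bitmap into the SRAM does for the real thing.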
So what do you use FPGAs for? Well, for the next 5 years, likely one of two things: either really cheap supercomputers (which is what we are working on) or as a "3D Graphics card play." The supercomputing play is obvious, but the other one bears explanation.
Anything you can think of goes faster if you implement it in hardware. 3D graphics is a great example: most cards today consist of a bunch of matrix multipliers plus some memory for the framebuffer, and a bunch of convenience operations that you do in hardware as well (like textures and lighting and so on.) Because it's in hardware, it's way faster than anything you could do on a general purpose processor.
Now, the problem is that hardware means ASICs (until recently.) ASICs are only cheap in large volumes. Thus, for applications that are not mass-market (like graphics cards are) it is not practical to build out an industry building hardware accelerators for them.
That's where FPGAs come in. FPGAs cost more per chip than ASICs at high volume, but less than ASICs in small volumes. This suddenly makes it practical to make custom hardware accelerators for almost anything you can think of.
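The volume argument is just a break-even calculation. Here's a sketch with completely made-up numbers (the NRE and per-unit figures below are assumptions for illustration, not real pricing): an ASIC has a big one-time engineering cost and a small per-chip cost, an FPGA the reverse.

```python
def total_cost(nre, unit_cost, units):
    """One-time (non-recurring engineering) cost plus per-unit cost."""
    return nre + unit_cost * units

def break_even_units(asic_nre, asic_unit, fpga_nre, fpga_unit):
    """Volume above which the ASIC becomes the cheaper option in total."""
    return (asic_nre - fpga_nre) / (fpga_unit - asic_unit)

# Hypothetical figures, chosen only to show the shape of the trade-off.
ASIC_NRE, ASIC_UNIT = 1_000_000, 10
FPGA_NRE, FPGA_UNIT = 50_000, 200

crossover = break_even_units(ASIC_NRE, ASIC_UNIT, FPGA_NRE, FPGA_UNIT)

small_run = 1_000
fpga_small = total_cost(FPGA_NRE, FPGA_UNIT, small_run)
asic_small = total_cost(ASIC_NRE, ASIC_UNIT, small_run)
```

With these assumed numbers the FPGA wins for any run below the crossover volume, which is the niche-accelerator case described above.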
This is also true of supercomputing: supercomputers are still general-purpose, just not THAT general-purpose. Your algorithm still benefits when you can just reduce it to logic and load it onto a chip. You might only be running at 200MHz, but when you get a full answer every clock cycle, you suddenly do a lot better than when you get an answer every 2000 cycles on your 2GHz processor.
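Spelling out that arithmetic, using only the numbers from the paragraph above (200MHz with one answer per cycle versus 2GHz with one answer per 2000 cycles):

```python
def results_per_second(clock_hz, cycles_per_result):
    """Throughput in answers per second for a given clock and cycles per answer."""
    return clock_hz / cycles_per_result

fpga = results_per_second(200e6, 1)     # 200 MHz, one answer every clock cycle
cpu = results_per_second(2e9, 2000)     # 2 GHz, one answer every 2000 cycles
speedup = fpga / cpu
```

So the chip with a tenth of the clock rate comes out roughly two hundred times ahead on throughput, at least for an algorithm that pipelines that cleanly.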
So to get back on topic, where will we see FPGAs? Well, you might expect to see an FPGA appear alongside the CPU on every desktop made in a few years; programs that have a routine that needs hardware acceleration can just make use of it. (Think PlayStation 4, here.)
You might also see things like PDAs come with FPGA chips: if your car's engine dies, you can just download (off your wireless net which will be ubiquitous *cough*) the diagnostic routine for your car and load it into that FPGA and have your car tell you what's wrong.
Aerospace companies will love them, too. Whoops, didn't catch that unit conversion bug in your satellite firmware before launch? Well, just reprogram the FPGA! No need to send up an astronaut to swap out an ASIC or a board.
What you're NOT going to see is every application ported to FPGAs willy-nilly, because like I said, this stuff is not easy. I'm coming a
You're confusing "Von Neumann device" with "Von Neumann {computer,architecture}", which is an easy mistake to make.
VN devices are what you said they are, and no, they don't exist yet.
A VN architecture (or "stored-program architecture") is one where the code for the program gets loaded into the same memory as the data for the program, i.e., essentially everything that you use today. This was in contrast to earlier architectures where the memory was used to store only runtime data, and the code was read in from, e.g., punch cards. A separated architecture still has its uses today, but they're not very common nor visible.
Turing machines are an abstract idea; all the current stuff are implementations of Turing machines. There is a difference but most people don't care.
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
It depends on what you're trying to do.
Usually when you are trying to compile something down to logic gates, you have to handle instruction scheduling. For example, in any conceivable situation, division always takes longer than addition. So, you have to make sure that while you're waiting for a division to complete all the rest of your data doesn't evaporate.
This isn't like a general purpose processor -- there are no persistent registers here. Use it or lose it. So you have to stick in tons of shift registers everywhere, as pipeline delayers.
So it's not as simple as just saying res = (a + b)/r + (q * p); or something, because you have to synchronize all the data. All this, of course, is just for a calculation: imagine the difficulty when you are waiting for signals on off-chip pins, when you don't even know how long you're going to be waiting. Also consider how you handle cases where you have to talk to memory: you usually have to write your own memory CONTROLLER, or at least use someone else's, meaning you actually have to worry about row and column strobes, whether it's DDR or not, and so on.
:)
If you've done multithreading programming and understand those difficulties, then take that and multiply the difficulty by a couple times, and you just about have it.
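Here's a rough Python sketch of the scheduling problem described above, for the expression res = (a + b)/r + (q * p). The latencies (1 cycle for add, 3 for multiply, 10 for divide) are assumptions picked for illustration, and `schedule` and `label` are made-up helper names. The point is that whichever operand arrives early has to sit in delay registers until its partner shows up.

```python
# Hypothetical operator latencies in clock cycles.
LATENCY = {"+": 1, "*": 3, "/": 10}

def label(node):
    """Readable name for an input or sub-expression."""
    if isinstance(node, str):
        return node
    op, left, right = node
    return "(" + label(left) + op + label(right) + ")"

def schedule(node, delays):
    """Return the cycle at which `node`'s value is ready.

    Inputs are ready at cycle 0. For every operator, the operand that
    arrives early is recorded in `delays` with the number of delay
    (shift) registers it needs so both operands arrive together.
    Operands are keyed by label, which is fine for this small example.
    """
    if isinstance(node, str):
        return 0
    op, left, right = node
    t_left = schedule(left, delays)
    t_right = schedule(right, delays)
    start = max(t_left, t_right)
    for side, t in ((left, t_left), (right, t_right)):
        if t < start:
            delays[label(side)] = start - t
    return start + LATENCY[op]

# res = (a + b)/r + (q * p)
expr = ("+", ("/", ("+", "a", "b"), "r"), ("*", "q", "p"))
delays = {}
total_cycles = schedule(expr, delays)
```

Under these assumed latencies, `r` has to be held for one cycle while the addition finishes, and the product `q * p` has to be held for eight cycles while the division grinds along. That holding is exactly what the shift registers are for.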
All that said, though, you're right: it shouldn't be that hard. If all you want to do is use C to express a calculation, that is fairly easy to boil down to a Verilog or VHDL module.
The problem is that most of the 3GL-to-HDL vendors try and boil the whole ocean. They want you to use nothing but their tool, and never have to look at Verilog. That is where things really start to break down.
An example of this done mostly RIGHT is a company whose name I can't remember. (AccelChip?) They make a product that takes Matlab code and reduces it to hardware. That's easier in a lot of ways, because Matlab is really all about simply allowing you to easily express a mathematical system or problem. There aren't all these control flow, I/O, and other random effects. My understanding is that this Matlab-to-VHDL tool works quite well.
So, it all depends on what you want to do with the FPGA.
As other replies mention, a Von Neumann machine is a conceptual computer which is somewhat more realistic than a Turing machine (although equivalent in the problems it can solve). But why is a relentless science-fiction monster named after a computational theorist?
The distinguishing characteristic of a Von Neumann machine is that code and data are treated the same. Both are stored in the same memory, which seems natural to a modern user, but was revolutionary back when it was introduced.
One might say that Von Neumann invented the idea of "software". Pre-Von Neumann computer programmers spent days clipping relays into breadboards. To change the program, you had to rebuild the machine.
But with executable code actually stored inside the pattern of magnetic switches, it's as if the machine has the ability to rebuild itself when needed. Running compiler software, for instance, is as if the computer is enhancing itself to extend or optimize its features. The "machine" gets more complex. Likewise, virus programs seem to be replicating small bits of machinery.
So a Von Neumann computer, in a way, is a machine which can modify its own functions. Von Neumann programs are machines which can edit, delete, or replicate themselves. ("cp /bin/cp ~/cp2") The idea of a "Von Neumann device" extends this concept out of the digital world and posits physical machinery which is able to construct machines very similar to itself.
Just like a computer virus (or worm, or mere fork-bomb) could expand to take up all your memory, so could a Von Neumann robot replicate to eventually use up all the metals and silicates on a planet (or even galaxy).
All 3 articles are FREE! Try thousands, not 200Gs.
An interesting quote regarding an FPGA web server application (in case you didn't get your free login ID just like /.): "The result is that a WinCom server with a few $2,000 FPGAs can blow the doors off a Sun or an Intel-based machine. "We're 50 to 300 times faster.""
..is where this sort of stuff really belongs.
A family member is working here, and the biggest markets they have lined up for their new design are the mobile-phone vendors, and image processing. They aren't interested at all in pitching it towards general-purpose computing.
Interestingly enough though, the software-defined-radio teams have been eyeing the product with drool in their mouth ever since it was demonstrated. Said family member remembers trade conventions the company's been to, where the SDR teams showed up and literally begged for a test chip to play with.
--
Hw vs. Sw - which is more difficult to "doodle" with?
Having a software background myself, I could relate to your story a little bit. However, I think our experiences have differed, because in all honesty, judging from the *hobbying* I've done, software is *far* more complicated than hardware, the reason being the volume of logic involved. As long as your ambitions are not to exceed the next Intel design, doing your own VHDL design is a fun, enjoyable, easy to oversee and especially *rewarding* endeavour!
Designing stuff
In a hardware design, your design = your code (want a schematic, do it in a schematic! -- and not like UML 'roundtrip' engineering, no, the real thing); with software this is rarely the case. Furthermore, because a hardware design has a very focussed purpose, it's more streamlined. Software tends to need all the bells and whistles you can throw at it, which further complicates the design and thus introduces many more bugs. With hardware, things *typically* stay reasonably elegant, since the way you like to think about it is the way you'll be implementing it.
The only big problem I encountered with coding FPGAs is the *enormous* difficulty in debugging your code. Many linuxers that are "printf" inclined to debug will have to learn that a bunch of LEDs is all you've got when hobbying. (The "free" tools for signal simulation are just a royal pain -- I didn't get one to work due to the "free" license key I needed to install.) This involved a _lot_ of theorizing on my end as to why it didn't work. (E.g. when driving a VGA signal, "why is my screen flickering" is the only info you've got (but hey, it's better info than "why is my screen smoking?", right?)).
Anyway, Jolly good fun, I can recommend it to any software engineer - wouldn't call it the next best personal development step from Java but if you know your way around CPU's and can recognize Pascal type languages, VHDL ain't that hard.
Books
Some books I found useful in my endeavours:
VHDL for Designers, fun book, good read, introduces VHDL as a language and how to write your stuff. Also relates it to the various VHDL "compilers" so you know what works where.
ASIC Handbook, little book, handy overview of process / project management, if you're inclined to go the asic route.
Art of Electronics, you'll need to understand what happens on your circuit board, and be able to read diagrams.
and lots and lots of datasheets, but you can get those off the net!
Great fun, and not as hard as it sounds - buy a board, download the Foundation kit, and doodle!
You're talking about Maurice Wilkes, not Von Neumann.
Seastead this.
Quick google searches reveal ...
... in 1976 (cost 8M$):
Here: http://www.thocp.net/hardware/cray_1.htm
Top speed 133 MFLOPS
And from : http://www.theregister.co.uk/content/1/14840.html
CPU
PIII 1GHz: CPU: 2694 MIPS, FPU: 1333 MFLOPS
P4 1.5GHz: CPU: 2866 MIPS, FPU: 882 MFLOPS
Athlon 1GHz: CPU: 3111 MIPS, FPU: 1395 MFLOPS
Snooping around more
SGI Origin2000: 114 MFlops
Macintosh G3 ZIF/400: 93 MFlops
Macintosh G3/333: 77 MFlops
Intel Pentium II/450: 72 MFlops
Macintosh G3/300: 71 MFlops
Macintosh G3/266: 64 MFlops
Cray T3E-900: 63 MFlops
IBM SP2: 59 MFlops
iMac/233: 56 MFlops
Intel Pentium II/300: 48 MFlops
Intel Pentium Pro/200: 36 MFlops
Cray T3D: 17 MFlops
Of course, this is all rough - and depends on the software, memory etc.
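For a rough sense of scale, here's the ratio of the FPU figures quoted above to the 133 MFLOPS top speed quoted for 1976. Only the numbers from this post are used, and the two sources almost certainly measured with different benchmarks, so treat the ratios as ballpark at best.

```python
CRAY_1_MFLOPS = 133  # "Top speed 133 MFLOPS", as quoted above

# FPU figures from the CPU list quoted above.
fpu_mflops = {
    "PIII 1GHz": 1333,
    "P4 1.5GHz": 882,
    "Athlon 1GHz": 1395,
}

# How many times the 1976 figure each desktop CPU reaches, to one decimal place.
ratios = {name: round(mflops / CRAY_1_MFLOPS, 1) for name, mflops in fpu_mflops.items()}
```

By these figures the desktop chips land somewhere between about 6x and 10x the quoted 1976 top speed.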
If you're running a 3D-accelerated PC game or modelling application, the majority of your computer's FLOPS are already consumed by a non Von Neumann computing device.
For better or worse, most of the PlayStation2's computing power is locked up in a non Von Neumann architecture.
So the evolution of computing to non Von Neumann architectures isn't so much news as a gradual shift that began about 5 years ago with 3dfx, and is really starting to happen large-scale right now.
The justification for FPGA's in consumer computing devices could be seen as a generalization of the rationale behind 3D accelerators: they bring you the ability to get a 10X-100X speedup in certain key pieces of code that are inherently very parallel and have very predictable memory access patterns.
I think the timeframe for mainstream FPGA style devices is quite far off, though. They need to evolve a lot before they'll be able to beat the combination of a Von Neumann CPU augmented with several usage-specific non Von Neumann coprocessors (the GPU, hardware TCP/IP acceleration, hardware sound...)
Here are the major issues:
- You'll need a lot more local memory than these devices have now -- there is a very limited set of useful stuff you can compute given a 32K buffer (a la PS2) and significant setup overhead.
- The big lesson from CPU's (and I expect from GPU's in the next few years) is that things REALLY flourish once you have virtualization of all resources, with a cache hierarchy extending from registers to L1 to L2 to DRAM to hard disk. For virtualization to make sense with FPGA's, Star Bridge's quoted reprogram times (40 msec) would need to improve by about 10,000X. Without this, you can really only run one task at a time, and that task can only have a fixed number of modules that use the FPGA.
Even then, it's not clear whether the FPGA's will be able to compete with massively parallel CPU's. In 3 more process generations, you should be able to put 8 Pentium 4 class CPU's on a chip, each running at over 10 GHz, at the same cost as current .13 micron CPU. Such a system would be VERY easy to program, a couple orders of magnitude more so than an FPGA. So even though it wouldn't have as much theoretical computing power as an FPGA, massively parallel CPU's are likely to win out because they have the best cost/performance when you factor in development cost.
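To put the reprogram-time point in numbers: the 40 msec figure and the 10,000X target are from the list above, and everything else is simple division. The function name is made up, and this assumes reconfiguration time is the only cost of switching tasks, which is a simplification.

```python
def max_reconfigs_per_second(reprogram_seconds):
    """Upper bound on task switches per second if reconfiguration dominates."""
    return 1.0 / reprogram_seconds

current_reprogram_s = 40e-3                       # quoted reprogram time
target_reprogram_s = current_reprogram_s / 10_000 # after a 10,000X improvement

current_rate = max_reconfigs_per_second(current_reprogram_s)
target_rate = max_reconfigs_per_second(target_reprogram_s)
```

That is a couple dozen swaps a second today versus a few hundred thousand at the target, with the target reprogram time coming out at a few microseconds, which is the range where swapping logic in and out starts to look like paging rather than like rebooting.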