Slashdot Mirror


Virginia Tech Supercomputer Up To 12.25 Teraflops

gonknet writes "According to CNET news and various other news outlets, the 1150-node Hokie supercomputer rebuilt with new 2.3 GHz Xserves now runs at 12.25 Teraflops. The computer, the fastest computer owned by an academic institution, should still be in the top 5 when the new rankings come out in November."

24 of 215 comments

  1. Speed at top by luvirini · · Score: 4, Interesting

    Reflecting on the comment: "should still be in the top 5 when the new rankings come out in November." There seems to be a serious push for multiprocessor systems. Currently the rankings seem to consist of a couple of stars, a few big ones (this computer among them), a huge group in a third category, and then the "used to be great" computers. But from my reading of the trends, it seems there will be more and more crowding near the top, so I expect the second category to grow much larger, with much smaller differences between its members.

  2. Density by GerbilSocks · · Score: 5, Interesting
    VT could theoretically pack in 4x the number of nodes in the same space that the original System X occupied. Could we be looking at at least a 50 TFlop (minus 10% overhead) supercomputer with 8,800 cluster nodes?

    If that were feasible, you could be looking at toppling Earth Simulator at a fraction of the cost.
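
    As a rough check on that figure, here is a minimal sketch of the arithmetic. It assumes throughput scales linearly with node count, which is optimistic because interconnect overhead usually grows with cluster size. The baseline is the 12.25 TFlop figure from the story; the function name is made up for illustration, and the 10% overhead default simply mirrors the comment.

```python
def scaled_tflops(base_tflops, density_factor, overhead=0.10):
    """Estimate throughput if the node count grows by density_factor.

    Assumes perfectly linear scaling, then subtracts a flat overhead.
    """
    return base_tflops * density_factor * (1.0 - overhead)


estimate = scaled_tflops(12.25, 4)
print(f"{estimate:.1f} TFlops")
```

    Under those assumptions the estimate lands near 44 TFlops rather than 50, which would still be ahead of Earth Simulator's roughly 36 TFlops.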

  3. "Dick factor" aside by ceeam · · Score: 3, Interesting

    It would be interesting to know exactly what these machines do. Maybe they could even share some code so that people can fiddle around with optimizing it (should be fun).

  4. So compare it to...... by ericdano · · Score: 3, Interesting
    The school said it spent about $600,000 to rebuild the system and add the additional nodes. The original cost of System X was $5.2 million.

    Compare it to this new Cray system. Bang for the buck would make the Apple system better.
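
    To put a number on "bang for the buck", a small sketch using only the figures quoted above ($5.2 million original cost, $600,000 rebuild, 12.25 TFlops). The function name is hypothetical, and no price for the Cray system appears in this thread, so it is left for the reader to plug in.

```python
def dollars_per_tflop(total_cost_usd, tflops):
    """Cost efficiency: total dollars spent per sustained TFlop."""
    return total_cost_usd / tflops


# Original build plus rebuild, as quoted from the school.
system_x = dollars_per_tflop(5_200_000 + 600_000, 12.25)
print(f"System X: ${system_x:,.0f} per TFlop")
```

    This counts hardware spending only; labor, power, and cooling are not in the quoted figures.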

    --
    It's either on the beat or off the beat, it's that easy.
    I moderate therefore I rule!
    --
  5. Re:2.3GHz? by Ford+Prefect · · Score: 4, Interesting
    But the XServers come at 2.0GHz, with the desktop powermacs at 2.5GHz. Is this a mistake?

    From the article:
    Apple said last week that the 2.3GHz machines were a one-off deal for Virginia Tech and not something the company plans to announce for broader consumption anytime soon.
    What I really want to know is what they do with the old machines. The article speaks of the cluster being 'upgraded' - are the older G5s replaced, or do they just become part of the new cluster?

    Still, I suppose there's one or two unwanted G5s - anyone want to send me a couple? :-)
    --
    Tedious Bloggy Stuff - hooray?
  6. Re:hrm by tmj0001 · · Score: 5, Interesting

    Hans Moravec's book "Robot" suggests that 100 teraflops is about the level required for human intelligence. So we are up to 10% of his target. But human intelligence still seems very far away, so either he has badly underestimated, or our collective programming skills need significant improvement.

  7. Re:Old stuff... by Anonymous Coward · · Score: 5, Interesting

    If you're referring to the old G5 Powermacs used in the original System X...they were sold. I bought one!

  8. Re:Crays... by Coryoth · · Score: 4, Interesting

    are not designed for the same type of work as clusters. If a problem is not efficiently parallelizable and requires shared memory, then a Cray is the only feasible option. A Cray is not a cluster. It's like comparing mph for a sports car and truck: the car is faster but they are meant for different types of loads.

    To be fair to the original poster, the Cray system he was referencing is a cluster system. Then again, it's a cluster system with very impressive interconnects, to which System X just isn't comparable (i.e., the Cray system will scale far, far better), not to mention the Cray software (UNICOS, CRMS, SFW), and the fact that the Cray system is an "out of the box" solution. So you are right, there is no comparison.

    Jedidiah.

  9. Re:hrm by TimothyTimothyTimoth · · Score: 5, Interesting

    I think Moravec's method of simulating human intelligence involves modelling a scanned copy of the human brain, in real time at a neuronal level. It would be similar to modelling the global weather system, a software capability we already have. Current neuroscience would expect this model to be functionally equivalent to a human mind in terms of matching inputs and outputs. As an aside, I know that Ray Kurzweil has a much higher estimate of the requirement, a 20 petaflop (20,000 teraflop) computer, based on more conservative assumptions. 20 petaflops is due around 2009/10 under Moore's law. (And I for one offer an early welcome to our expected new AI overlords ...)

    --
    It doesn't matter which ape activates the Monolith
  10. Re:hrm by SnowZero · · Score: 4, Interesting

    I actually asked Hans a similar question at a talk he gave a while back, and he didn't really answer it, to my disappointment. My question was that "In nature the algorithm and computer were evolved together, so we'd expect them to be at a similar level of advancement. So, even if we get a computer as fast as a human, might it not be nearly as smart since our programs do not use it efficiently enough?" In other words, Moore's law isn't helping us write better software (in some ways quite the contrary).

    I'm a robotic software researcher, so this notion really affects me. IMO software will lag well behind hardware, since it doesn't scale out nearly as well. Representation is of course a huge problem I won't even try to touch... But rest assured lots of people are working on all these things. Btw, it also doesn't help that CPU designs aren't even trying to make AI-style algorithms fast, but we can't blame manufacturers for that until there is demonstrable money to be made.

  11. Re:hrm by TimothyTimothyTimoth · · Score: 4, Interesting
    By the way, IBM BlueGene/L is going to produce 360 teraflops by the end of 2004, so if the report of Moravec's estimate is correct, and he is correct, that AI Overlord welcome could be pretty soon.

    (Although I don't believe brain scanning quite hits the resolution mark required yet.)

    --
    It doesn't matter which ape activates the Monolith
  12. Re:hrm by RKBA · · Score: 3, Interesting

    His estimate was probably based on the common, and incorrect, belief that neurons are purely digital.

  13. Re:and yet... by Short+Circuit · · Score: 2, Interesting

    That's a big RAID Array...

  14. Re:hrm by Deorus · · Score: 2, Interesting

    I think the difference between human and computer intelligence is that our software (conscious) is able to hard-wire the hardware (unconscious). We may not be able to consciously perform certain tasks such as floating point calculations because our software lacks low level access, but we can hard-wire our hardware for those tasks, this is why our unconscious is so quick and accurate when trained to recognize and respond to specific patterns regardless of their complexity.

  15. Re:hrm by segmond · · Score: 2, Interesting

    I don't think CPUs should be designed for AI-style algorithms when the said algorithms have not been proven. Assuming we finally succeed in implementing the Holy Grail of AI right, then we can seek out ways to optimize it and make it fast, and that is when custom CPUs will come in. Right now, most of the algorithms are a joke.

    --
    ------ Curiosity killed the cat. {satisfaction brought it back | it didn't die ignorant | lack of it is killing mankind
  16. What is a supercomputer ? by Animaether · · Score: 3, Interesting
    I'm curious as to the answer to the question (What is a supercomputer ?).

    The reason is this.. more and more of these 'supercomputer' entries appear to be many machines hooked up together, possibly doing a distributed calculation.

    However, would projects such as SETI, GRID, and UD qualify with their many thousands of computers all hooked up and performing a distributed calculation ?

    If not, then what about the WETA/Pixar/ILM/Digital Domain/Blur/You-name-it renderfarms ? Any one machine on those renderfarms could be put to use for only a single purpose: to render a movie sequence. Any one machine could be working on a single frame of that sequence. Does that count ?

    I seem to think more and more that the answer is 'no', from my perspective. They mostly appear to me as rather simple computers (very often not even the top-of-the-line in their own class), with the only thing going for them that there are many of them.

    The definition of supercomputer (thanks Google, and by linkage dictionary.reference.com ) is :
    A mainframe computer that is among the largest, fastest, or most powerful of those available at a given time.


    And for mainframe :
    A large powerful computer, often serving many connected terminals and usually used by large complex organizations.
    The central processing unit of a computer exclusive of peripheral and remote devices.


    Doesn't the above imply that a supercomputer should really be just a single computer, and not a network or cluster of many computers ?
    ( The mention of 'terminals' does not mean they're nodes. Terminals are, after all, chiefly CPU-less devices intended for data entry and display only. They are not part of the mainframe's computing capabilities. )

    If the above holds true, then what is *really* the world's top 3 of supercomputers ? I.e. which aren't 'simply' a cluster of nodes.

    Any mistakes in the above write-up/thought process ? Please do point them out :)
    1. Re:What is a supercomputer ? by log0n · · Score: 2, Interesting

      [quote]Doesn't the above imply that a supercomputer should really be just a single computer, and not a network or cluster of many computers ?[/quote]

      But if all of the networked/clustered computers are working on the same task, with information flowing between nodes dependent on other nodes' processing, doesn't that make them all effectively one large computer?

      A renderfarm is similar in many ways to a supercomputer, but I wouldn't think of it as one. Renderfarm nodes generally work on a specific task that is assigned to them. They can be part of a larger overall project (like rendering for a film), but really they only process what is given to them, then spit the info back. There's a queue manager that sends out tasks, but very little of the information that gets processed by a node is dependent on information that is in use by another node. A renderfarm basically gives out raw processing for a task when requested; it doesn't do much beyond that.
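
      The distinction can be sketched in a few lines. This is only an illustration under simplifying assumptions: `render_frame`, `farm_render`, and `coupled_step` are made-up names, a thread pool stands in for the queue manager, and a three-point averaging step stands in for a tightly coupled simulation.

```python
from concurrent.futures import ThreadPoolExecutor


def render_frame(frame_number):
    """Stand-in for rendering one frame; it depends on no other frame."""
    return frame_number * frame_number


def farm_render(frames, workers=4):
    """Queue-manager style: hand out independent tasks, collect results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_frame, frames))


def coupled_step(cells):
    """One step of a coupled computation on a ring of cells.

    Each cell needs its neighbours' current values, so nodes holding
    adjacent cells would have to exchange data before every step.
    """
    n = len(cells)
    return [(cells[(i - 1) % n] + cells[i] + cells[(i + 1) % n]) / 3.0
            for i in range(n)]


print(farm_render(range(5)))
print(coupled_step([0.0, 3.0, 0.0, 0.0]))
```

      In the first case the nodes never talk to each other, so a slow network costs little. In the second, every step waits on neighbour data, which is where a fast interconnect starts to matter.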

      You can still have multiple terminals for data in/out, and in the VT case these are definitely systems that are exclusive of peripheral devices (remote device doesn't make sense - a connected terminal is a remote device).

      I do think that definitions have blurred from what they used to be thanks to improving technology, but I do think that the generalities of what they represent are still valid.

      $.02, etc

  17. Re:hrm by diersing · · Score: 4, Interesting

    I have a question from a casual observer who comes across this Hokie machine and the top 500 list every now and then. What is it these computers do?

    Hearing it referenced in terms of AI helps, but is that the only purpose for a research facility to build one of these mammoths? Are there practical applications for the business world (other than the readily available (read commercial) clustered data warehousing)?

    I'm not trolling, just curious.

  18. don't forget... by Geek_3.3 · · Score: 2, Interesting

    (those that go to despair.com will recognize this) that "You can do anything you set your mind to when you have vision, determination, and an endless supply of expendable labor." Point being, I'm sure having essentially free labor (sans pizza, of course... ;-) might have cut the price down just a little bit too...

    Not to poo-poo their efforts, but the whole system was essentially a 'loss-leader' for future supercomputer projects using the G5s and Xserves....

  19. Actually, VT will be #8 this time around by daveschroeder · · Score: 4, Interesting

    Prof. Jack Dongarra of UTK is the keeper of the official list in the interim between the twice yearly Top 500 lists:

    http://www.netlib.org/benchmark/performance.pdf (see page 54)

    There have been some new entries, including IBM's BlueGene/L, at 36Tflops, finally displacing Japan's Earth Simulator, and a couple other new entries in the top 5.

    Here's just the top 16 as of 10/25/04:

    http://das.doit.wisc.edu/misc/top500.jpg

    No matter what anyone says, Virginia Tech pulled an absolute coup when they appeared on the list at the end of 2003: no one will likely EVER be able to be #3 on the Top 500 list for a mere US$5.2M...even if the original cluster didn't perform much, or any, "real" work, the publicity and recognition that came of it was absolutely more than worth it.

    Also interesting is that there is also a non-Apple PowerPC 970 entry in the top 10, using IBM's JS20 blades...

  20. Re:hrm by benhocking · · Score: 4, Interesting

    Actually, it's not quite that simple. As someone whose research is in modeling the hippocampal region CA3 (about 2.5 million neurons in humans, 250k neurons in rats), I can tell you that the connectivity of the system is a very important variable. And there is still much we don't know about the connectivity of the human brain. Furthermore, there are hundreds of different types of neurons in the human brain. Why so many different types if only 2 or 3 would do? Seems evolution took an inefficient path - unless, as is probably the case, the differences in the neuron types are crucial for the human computer to work the way it does. Granted, some differences might be due to speed or energy efficiencies which are not absolutely critical for early stages, but I suspect that many differences have to do with the software (or wetware in this case) that makes us intelligent.

    After we've solved that minor problem, I think teaching the system will be relatively trivial. I.e., if we understand the wetware enough to reconstruct it, we most likely understand how its inputs relate to our inputs, etc., and we could teach it much the same as we teach a human child. Of course, we might also figure out a better way to teach it, and in so doing we might even find a better way to teach human children. (Some of our research has recreated certain known best learning strategies; it is probably only a matter of time before simulators discover a better one!)

    --
    Ben Hocking
    Need a professional organizer?
  21. Re:What is the point? by daveschroeder · · Score: 3, Interesting

    Rich guys that buy Ferraris and never drive them don't get untold amounts of recognition, publicity, free advertising, news articles, and the capability to catapult themselves to the forefront of the supercomputing community overnight for a paltry sum of money, thus attracting millions of dollars of additional funding and grants to build clusters that WILL be doing real work, such as the one we're talking about now, and the several additional clusters they plan to build in the future, not to mention the benefit of proving that a new architecture, interconnect, and OS will perform well as a supercomputer, allowing more choice, competition, and innovation to enter the scene, which ultimately results in more and better choices for everyone.

    Does that answer your question?

  22. Re:hrm by ca1v1n · · Score: 3, Interesting

    I'm not really sure what you mean that they haven't been proven. In the sense that they don't give the best answer all the time, this much is obvious. That's why we call it artificial intelligence instead of algorithmics. That said, we know quite well that they work. Most adaptive spam filters are based on Bayesian networks. The best of these are better than humans at identifying spam. We don't typically run the best because the computational load is far too high. Bayesian networks have a delightfully simple evaluation procedure that is basically glorified matrix multiplication. Neural networks are a little more complicated, but not by a whole lot. Recall a recent development that used a neural network inside an 802.11 driver to predictively avoid collisions to improve total network throughput in dense environments. It doesn't reduce collisions to 0, as that would require clairvoyance, but it does a good job. You didn't hear about this 5 years ago because putting a neural net inside an 802.11 driver without killing performance to both network and computer is difficult, particularly without processor instructions dedicated to the task.

    It's true that designing a CPU to *be* a neural or Bayesian network is infeasible, but that doesn't mean we can't add instructions to accelerate their evaluation. The evaluation and update of a neural net, traditional or biologically modeled, is a rather simple algorithmic process, though people who have worked with such simulations (see Ben Hocking's post above, he was my quite capable AI TA) will tell you that they make rather obscene optimizations to make it run reasonably fast. I'm talking about things that might sound familiar to graphics people, like removing all multiplications from a program that's supposed to be doing them more than all other operations combined. It's a particularly good candidate for SIMD instructions. Most large neural nets are sparsely connected, so even if your net is substantially larger than your cache, you can beat that with prefetching. Threshold conditional addition is an example of something that can be done very quickly in hardware, and is much more of a pain to code and optimize in software.
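
    To make "threshold conditional addition" concrete, here is a minimal sketch. It is not the code from any simulator mentioned in this thread. It assumes binary (fire / don't fire) neurons, a single shared threshold, and a sparse fan-out table, and every name in it is made up for illustration.

```python
def step(active, fan_out, weights, threshold):
    """Advance a sparse binary-threshold network by one time step.

    active    -- set of neuron ids that fired on the previous step
    fan_out   -- dict: neuron id -> list of target neuron ids (sparse)
    weights   -- dict: (source, target) -> synaptic weight
    threshold -- firing threshold shared by all neurons
    """
    excitation = {}
    # Outputs are 0 or 1, so weight * output collapses to
    # "add the weight if the source fired": no multiplications.
    for src in active:
        for dst in fan_out.get(src, ()):
            excitation[dst] = excitation.get(dst, 0.0) + weights[(src, dst)]
    # A neuron fires on this step only if its summed input reaches threshold.
    return {n for n, total in excitation.items() if total >= threshold}


fan_out = {0: [2], 1: [2, 3]}
weights = {(0, 2): 0.6, (1, 2): 0.5, (1, 3): 0.4}
print(step({0, 1}, fan_out, weights, threshold=1.0))
```

    Only neurons that fired are visited, so the work per step tracks the number of active connections rather than the full size of the net.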

    If you prefer RISC to CISC, recall that even the original SPARC had special DSP instructions. Putting the sigmoid function and arctan on silicon is really not all that outrageous.

  23. Re:hrm by ca1v1n · · Score: 2, Interesting

    But their aggregate behavior is quite easily computable. In the human brain, 70% of neuron firings randomly fail to register in their successor. This not only makes our behavior somewhat random, but also implies that there's quite a bit of redundancy and that our brains operate on aggregate behavior of a large neural net, rather than precise behavior of a small one, otherwise we'd be completely unpredictable, rather than just mostly unpredictable. While it's true that you can't model a human brain reliably with a computer, it's also true that you can't even model it reliably with the same human brain. Generally speaking, any simulation that is as good as reality is good enough, even if reality isn't really right.
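
    The point about aggregate behavior can be illustrated with a toy simulation. This is a sketch only: it takes the 70% failure figure above at face value, treats each firing as an independent coin flip (a simplifying assumption), and uses a made-up function name.

```python
import random


def delivered_fraction(n_firings, failure_rate=0.7, seed=0):
    """Fraction of firings that register downstream when each firing
    independently fails with probability failure_rate."""
    rng = random.Random(seed)
    delivered = sum(1 for _ in range(n_firings)
                    if rng.random() >= failure_rate)
    return delivered / n_firings


for n in (10, 1_000, 100_000):
    print(n, delivered_fraction(n))
```

    With a handful of firings the delivered fraction can swing widely, but as the count grows it settles close to 30%, which is the sense in which a large, redundant net can be predictable even though each individual connection is not.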