Slashdot Mirror


User: joib

joib's activity in the archive.

Stories
0
Comments
928

Comments · 928

  1. Not to mention that BG beats BigMac in flops/$ on Earth Simulator, G5 Cluster Drop In 'Top 500' List · · Score: 4, Interesting

    The cost he quotes for the Blue Gene ($200 million) was the cost of some government contract that included BG/L, ASCI Purple (a huge cluster of POWER5 servers) and some R&D as well.

    Recently IBM announced their commercial prices for BG machines (see e.g. theregister.co.uk or news.com.com). Prices start at $1.5 million (1 fully equipped cabinet). Using this price and published linpack figures one arrives at about 2.9 Mflop/s/$, compared to the maximum value of 2.2 Mflop/s/$ he quotes for the best apple system.

    Add in the fact that the BG uses much less space and power than a comparable xserve cluster, that it has a faster and lower latency network, and we have a winner.
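
    The price/performance arithmetic above can be sketched roughly as follows. The comment does not state the per-cabinet linpack figure, so the value used here is back-computed from the quoted 2.9 Mflop/s/$ and is an assumption, as are the function and variable names.

```python
def mflops_per_dollar(linpack_flops: float, price_dollars: float) -> float:
    """Sustained linpack flop/s bought per dollar, in Mflop/s/$."""
    return linpack_flops / price_dollars / 1e6

# Assumption: ~4.35 Tflop/s linpack per cabinet (back-computed from the
# quoted 2.9 Mflop/s/$, not a figure taken from the comment).
BG_CABINET_LINPACK = 4.35e12
# Commercial starting price quoted above: one fully equipped cabinet.
BG_CABINET_PRICE = 1.5e6

bg = mflops_per_dollar(BG_CABINET_LINPACK, BG_CABINET_PRICE)
best_apple = 2.2  # Mflop/s/$, the maximum value quoted for the Apple system

print(f"Blue Gene cabinet: {bg:.1f} Mflop/s/$ vs. Apple: {best_apple:.1f}")
```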

  2. Re:Biodiesel on Could Nuclear Power Wean the U.S. From Oil? · · Score: 1

    I think it means that between the oil well and the tank of your car, about 20 % of the energy is lost. Mostly in refining, IIRC.

  3. Re:undecided on Japan's Newest Linux Supercluster: 13TB RAM · · Score: 1


    It might still be expensive, but it's currently the best CPU for large supercomputers


    Except for, say, POWER5, and vector processors (NEC SX-6, SX-8, Cray X1). That is, if by "best" you mean raw performance and bandwidth, cost and power consumption be damned.


    SGI is using the best tool for the job.


    Perhaps they are, perhaps not. That's not the issue. The thing is that a number of years ago (when AMD64 was barely a blip on the radar) they made a strategic commitment to IA-64. Spending vast amounts of money to port all their stuff to another architecture and royally pissing off customers and ISV:s just for a modest improvement in performance or price/performance of the cpu:s (which wouldn't matter much since the major cost of the Altix is the interconnect and NUMA stuff) doesn't sound like a cunning plan to me.

    That is, the IA-64 doesn't have to be the absolutely fastest chip on the block, as long as it stays competitive. It makes no sense for SGI to switch architecture for a 10 % performance gain.

  4. Re:undecided on Japan's Newest Linux Supercluster: 13TB RAM · · Score: 1


    Still, other systems do address large amounts of RAM. The ASCI-Q at Los Alamos has 33 TB (!) on alphas.


    But the ASCI-Q is a cluster, IIRC consisting of 2048 4-cpu SMP nodes. Thus each node only has 33 TB/2048 = 16 GB memory.
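
    The division above, spelled out as a minimal sketch. Decimal units (1 TB = 10^12 bytes) are an assumption; with binary units the result is 16.5 GB, which rounds to the same ballpark.

```python
TB = 10**12  # assumption: decimal terabytes
GB = 10**9   # assumption: decimal gigabytes

total_memory_bytes = 33 * TB  # total memory quoted for ASCI-Q
n_nodes = 2048                # 4-cpu SMP nodes, as recalled in the comment

per_node_gb = total_memory_bytes / n_nodes / GB
print(f"{per_node_gb:.1f} GB per node")
```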

  5. Re:Luckily on Japan's Newest Linux Supercluster: 13TB RAM · · Score: 2, Informative


    At NASA sgi has been experimenting with 2048 proc single system image.


    The Columbia system still consists of 20 512-cpu systems, so I would assume this consists of 4 such 512-cpu systems.


    The cache coherency of SGI's cc-numa machines makes them incredibly easy to program. However, there is a big overhead.


    Well yes, the basic problem is that OpenMP/pthreads assumes a flat memory, whereas a NUMA box is anything but flat. So the kernel had better be real smart about how to map the memory onto the hardware to minimize remote memory accesses. And the programmer should of course avoid accessing memory from all over the place, although it's technically possible.


    Since most supercomputing software is written with MPI, rather than with posix-threads, you don't really benefit from it anyway. I think you can disable the hardware coherency on a per-process basis, which would greatly speed up MPI software.


    I'm pretty sure the SGI MPI implementation is pretty well optimized, either by using shared memory or by sending the messages directly over the numalink.

  6. Re:Not worth the outlay at present on RC4 Code Achieves 319 MB/s On AMD64 Opteron · · Score: 3, Informative


    486 = 32bits, faster but people still bought 386's due to cost.


    The 386 was also a 32-bit processor...

  7. Re:Evolution vs. Creationism on The Eye: Evolution versus Creationism · · Score: 1


    The same is true for intelligent design, unless we are willing to entertain the thought that the designers in this theory were in fact extra-terrestrial lifeforms.


    But in that case, who created the aliens? God? Or perhaps evolution?

    That is, the entire "intelligent design has no opinion about who the designer is" argument is nothing but smoke and mirrors.

  8. Re:Face It on The Eye: Evolution versus Creationism · · Score: 1


    You don't see this sort of nonsense in Europe...


    I'm afraid you're wrong. At my university there was a creationism (oh sorry, "intelligent design") seminar a few weeks ago. We tried to stop it (i.e. kick it off campus and the educational program of the uni) with a petition signed in a few days by hundreds of faculty and students, but to no avail. :-(

    Of course, you're right in the sense that over here the number of christian fundamentalists who actually believe in this bulls*it is marginally small, so they have no real impact on public policy, as opposed to the USA. And hopefully they never will have..

  9. Grid Computing is a buzzword on Grid Computing: Conceptual Flyover For Developers · · Score: 5, Insightful

    Like, if you fit in "grid computing" in your grant proposal, the probability that you'll get funding increases. Now, if in addition to "grid" you manage to fit in "nanotechnology", "bio-informatics" and "paradigm" you'll be funded with a probability very close to 100 %!

  10. Re:And the point is? on C++ In The Linux kernel · · Score: 1


    I mean C++ is _a low level language_ unlike Java, and will perhaps replace Fortran in coming future as the tool for (*cough*) hardcore numerics


    I don't think so. There was lots of interest in C++ numerics around the time C++98 was finalized, but to me it seems it never really took off. I think the underlying reason really is simplicity. Writing high performance C++ code with template metaprogramming and all is pretty tricky, and even trickier to debug. By comparison Fortran is a very easy language to learn, way easier than even plain C. Consider that most people writing numeric code aren't computer scientists, so ease of use is certainly very important.

    Of course, if you need maximum performance, Fortran aliasing rules typically help the compiler produce faster code than equivalent C/C++.
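
    Why aliasing matters can be illustrated with a small sketch. It is written in Python purely to show the semantics, not the performance; the function names are hypothetical. The first function behaves like code a compiler must generate when two arguments may overlap (the default assumption in C/C++), the second like code it may generate when it is allowed to assume they do not (which Fortran permits for dummy arguments that are modified).

```python
def add_scalar_reload(dst, src, scale):
    """Re-reads scale[0] on every iteration, as is required when
    dst and scale might refer to the same storage."""
    for i in range(len(src)):
        dst[i] = src[i] + scale[0]

def add_scalar_hoisted(dst, src, scale):
    """Reads scale[0] once before the loop. This is only equivalent
    when dst and scale do not overlap."""
    s = scale[0]
    for i in range(len(src)):
        dst[i] = src[i] + s

# No aliasing: both versions produce the same result.
a, b = [0, 0, 0], [0, 0, 0]
add_scalar_reload(a, [1, 2, 3], [10])
add_scalar_hoisted(b, [1, 2, 3], [10])

# Aliasing: dst and scale are the same list, so the results differ,
# which is why the hoisting optimization is unsafe without a no-alias rule.
x = [10, 0, 0]
y = [10, 0, 0]
add_scalar_reload(x, [1, 2, 3], x)
add_scalar_hoisted(y, [1, 2, 3], y)
print(a, b, x, y)
```

    Keeping a value in a register across the loop is exactly the kind of transformation the no-alias guarantee enables; in C the programmer has to opt in with `restrict`.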

  11. Re:what is gnoppix for? on Ubuntu For PPC, And As A Live CD · · Score: 2, Interesting


    If gnoppix is based on Ubuntu, and Ubuntu is based on Debian, then who the hell is working on releasing sarge? ;)


    Well, why do you think it has been 2+ years and counting since the release of woody? ;-)

  12. Re:The worst thing about this... on SGI & NASA Build World's Fastest Supercomputer · · Score: 1


    Like what? Go out and look up SPEC results next time you're bored. I think you'll find that I2 is quite a bit more capable than you make out. IBM's dual-core POWER5 is just about the only thing out there that's even close to (a single-core) I2 in FP performance, and Opteron isn't even in the game at that level.


    Don't know about SPEC, but there's plenty of CPU:s that beat the I2 on linpack (which, as you certainly know, is used to rank the top500). The I2 scores about 4.8 Gflop/CPU on linpack. For example the PPC 970, used in the Apple clusters, scores about 5.7, which I consider pretty amazing considering how much cheaper the PPC970 is. The Opteron does quite poorly on linpack, only about 3.1 Gflop/CPU, but my understanding is that the Opteron somewhat offsets this in real world applications due to its very low memory latency.

    Not to mention vector processors, which easily toast all the scalar processors. The earth simulator scores about 7 Gflop/cpu, and its recently released successor SX-8 scores about 16 Gflop/cpu. The Cray X1 vector machine is ~13 Gflop/cpu, and the soon to be released X1E is ~19 Gflop/cpu. *drool*

  13. Re:cray on Cray XT-3 Ships · · Score: 1


    Cray today looks like they are doing cool things. xt3 is pretty neat.


    XT3 certainly is a very interesting architecture. However, I do wonder how it will succeed economically. It's trying to squeeze in on a pretty crowded market, with both clusters on the low end and blue gene very shortly providing superior density at very competitive prices.


    I'm curious to see how they combine it with the x1 vector stuff.


    Cray's roadmap indicates that they plan to share tech as much as possible between the X1 and the XT3 families, to the point that sometime in the 2010 timeframe the only difference will be the cpu boards. Everything else will be the same: interconnect tech, software, etc. Even going so far that you could have different cpu boards in the same system!


    Hopefully their three current product lines can share a lot of technology in future revisions.


    Do you mean XD1 as the third line? Personally, I think XD1 is somewhat of the black sheep. As I see some significant overlap between XD1 and the XT3, perhaps they simply bought OctigaBay to avoid them meddling in their future XT3 market?

    At the moment, I think Cray's trump card is the X1 family. Most users have apparently been very pleased with the performance of the systems on real world codes. Also, the only competitor in the vector business is NEC, so they can probably sell X1:s with higher margins than the XT3. Soon they'll have their X1E ready, with 3 times the compute density compared to the X1. That is, the X1E will be within a factor of 3 of the compute density of the Blue Gene! The NEC SX-8 as such seems like a very competitive cpu to the X1E cpu, but NEC has an appallingly low density of only 8 cpu:s per cabinet whereas Cray crams 128 in a single X1E cabinet.
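
    As a rough sketch of the density gap, the cpu counts per cabinet above can be combined with the approximate per-cpu linpack figures from the earlier comment on vector processors (~19 Gflop/cpu for the X1E, ~16 Gflop/cpu for the SX-8). Mixing the two sets of numbers is an assumption made here for illustration; the names are hypothetical.

```python
def gflops_per_cabinet(cpus_per_cabinet: int, gflops_per_cpu: float) -> float:
    """Approximate linpack Gflop/s delivered by one cabinet."""
    return cpus_per_cabinet * gflops_per_cpu

# cpus per cabinet from this comment; per-cpu figures are the approximate
# linpack numbers quoted in the earlier comment (assumed comparable).
x1e = gflops_per_cabinet(128, 19.0)
sx8 = gflops_per_cabinet(8, 16.0)

print(f"X1E: {x1e:.0f} Gflop/s per cabinet, SX-8: {sx8:.0f} Gflop/s per cabinet")
print(f"ratio: {x1e / sx8:.0f}x")
```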

  14. Re:Article Comparison... on Virginia Tech Supercomputer Up To 12.25 Teraflops · · Score: 1


    My question would have to be: Teraflops - is it purely an aggregation of processor power, or does it take into account things like interconnects?


    Take a look at top500.org. For every supercomputer you'll see two numbers, Rpeak and Rmax. Rpeak is a purely theoretical estimate, basically number of floating point operations per clock cycle times the clock frequency times the number of cpu:s in the system. Rmax is the result from running the linpack benchmark.

    The linpack benchmark is run on the entire system, so it takes into account all the interconnects, but unfortunately linpack is almost "embarrassingly parallel", i.e. for a cluster the interconnect makes almost no difference; ethernet is as good as some supah-dupah $$$ interconnect.
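
    The Rpeak formula described above, and the ratio Rmax/Rpeak that people usually call linpack efficiency, can be sketched like this. The machine parameters are made up for illustration and do not describe any real system; the function names are hypothetical.

```python
def rpeak_flops(flops_per_cycle: int, clock_hz: float, n_cpus: int) -> float:
    """Theoretical peak: flops per clock cycle x clock frequency x cpu count."""
    return flops_per_cycle * clock_hz * n_cpus

def linpack_efficiency(rmax: float, rpeak: float) -> float:
    """Fraction of the theoretical peak actually achieved on linpack."""
    return rmax / rpeak

# Hypothetical machine: 1000 cpus at 2 GHz, 4 floating point ops per cycle.
rpeak = rpeak_flops(flops_per_cycle=4, clock_hz=2.0e9, n_cpus=1000)
rmax = 5.0e12  # hypothetical measured linpack result, in flop/s

eff = linpack_efficiency(rmax, rpeak)
print(f"Rpeak = {rpeak / 1e12:.1f} Tflop/s, efficiency = {eff:.1%}")
```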

  15. Re:The math for a comparable Xserve system on Cray XT-3 Ships · · Score: 1


    If your application is not parallelizable, the supercomputer pisses away on you ?


    Only in Soviet Russia.

  16. Re:What is a supercomputer ? on Virginia Tech Supercomputer Up To 12.25 Teraflops · · Score: 1

    I don't think there exists any unambiguous way to define what a supercomputer is.

    Anyway,

    I think we can disqualify @HOME style projects, since the individual nodes are not under the control of the manager. Similarly, you can't submit some small batch job to a @HOME system and expect to have results within a short time. Uh, that wasn't a very good description but I hope you understand what I mean.. i.e. that to qualify as a supercomputer all the nodes should be dedicated to the supercomputing stuff, and be under the direct control of the administrator.

    As for the one node vs. cluster of nodes, it gets trickier. How do you define one node? Shared memory? But then, what about NUMA systems such as the SGI Altix? It is entirely valid to view NUMA systems as consisting of multiple connected nodes, along with some kernel (and usually hardware) support to make it appear as shared memory. Hardware-wise there's no huge difference between such a system and a cluster, essentially the only major difference is that NUMA systems typically have some silicon to take care of cache coherency.

    Or should we limit ourselves to shared memory systems where all the memory sits on the same bus? This limitation would seriously limit our ability to build really huge systems, simply because the speed of light would cause ever bigger latencies. Not to mention that this limitation would prohibit even a simple dual cpu AMD Opteron system, which is a NUMA system. So I don't think this limitation is good either.

    Of course, we could say that a real supercomputer is distinguished by running a single kernel for the entire system. That would allow NUMA systems, but disallow clusters. Anyway, I think this limitation sounds a bit artificial.

    In light of the above reasoning, I think we must accept clusters as legitimate supercomputers. As long as they have enough oomph to make the top500 or thereabouts, that is. Not that linpack is any perfect benchmark, far from it. Oh well, perhaps HPC Challenge or something like that will someday replace linpack as the "official" benchmark for top500.

  17. Re:"Dick factor" aside on Virginia Tech Supercomputer Up To 12.25 Teraflops · · Score: 3, Informative


    Would be interesting to know exactly what stuff do these machines do? Maybe they would even be able to share some code so that people can fiddle around with it optimizing


    I don't know about the VT cluster specifically, but here's a couple of typical supercomputer applications that happen to be open source:

    ABINIT, a DFT code.

    CP2K, another DFT code, focused more on Car-Parrinello MD.

    Gromacs, a molecular dynamics program.


    (should be fun)


    Well, if optimizing 200 000 line Fortran programs parallelized using MPI sounds like fun to you, jump right in! ;-)

    Note: Above applies to abinit and cp2k only, I don't know anything about gromacs except that it's written in C, not Fortran (though inner loops are in Fortran for speed).

    Oh, and then there's MM5, a weather prediction code which I think is also open source. I don't know anything about it, though.

  18. Re:The math for a comparable Xserve system on Cray XT-3 Ships · · Score: 4, Insightful


    What a value!!


    That is, until you throw a tightly coupled problem at it and the Cray is 10 times faster because it has much better internode bandwidth and lower latency.

    And you forgot to count the cost of the InfiniBand interconnect that the VT cluster used. That's a couple grand per node.

    Bottom line, apples and oranges. If your application is easily parallelizable (i.e. doesn't require much communication between the nodes) you'd be stupid to piss away your money on a "real" supercomputer instead of a cluster. And vice versa.
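
    The trade-off can be made concrete with a toy cost model: time per step is the compute time divided across the nodes, plus a per-message cost of latency plus transfer time. Every number below (latencies, bandwidths, message counts) is a made-up assumption chosen only to show the shape of the effect, not a measurement of any real machine.

```python
def step_time(compute_s, n_nodes, msgs_per_step, msg_bytes,
              latency_s, bandwidth_bytes_per_s):
    """Toy model: perfectly divided compute plus serialized message cost."""
    compute = compute_s / n_nodes
    comm = msgs_per_step * (latency_s + msg_bytes / bandwidth_bytes_per_s)
    return compute + comm

# Hypothetical interconnects (assumed values, not vendor figures).
COMMODITY = dict(latency_s=50e-6, bandwidth_bytes_per_s=100e6)
CUSTOM = dict(latency_s=5e-6, bandwidth_bytes_per_s=1e9)

# Two hypothetical workloads with the same compute but different chattiness.
loose = dict(compute_s=1000.0, n_nodes=100, msgs_per_step=1, msg_bytes=1000)
tight = dict(compute_s=1000.0, n_nodes=100, msgs_per_step=100_000, msg_bytes=1000)

loose_ratio = step_time(**loose, **COMMODITY) / step_time(**loose, **CUSTOM)
tight_ratio = step_time(**tight, **COMMODITY) / step_time(**tight, **CUSTOM)

print(f"loosely coupled: commodity/custom time ratio = {loose_ratio:.3f}")
print(f"tightly coupled: commodity/custom time ratio = {tight_ratio:.3f}")
```

    With few messages the two interconnects are indistinguishable, so the cheaper one wins on cost; as the message count grows, the interconnect starts to dominate the run time.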

  19. Re:Just the name brings back memories on Cray XT-3 Ships · · Score: 4, Informative


    There are two prominent applications for these machines. The first is nuclear weapons simulation. Personally, I don't see the point to that. The other application is in weather prediction.


    Oh, please. Buy a clue, will ya? There's lots and lots and lots of applications that use supercomputers, or could use them if they were more affordable. A few examples off the top of my head:

    Materials science, that is ab initio simulations, moldyn, you name it. This alone probably uses > 50 % of all supercomputer cpu time in the world. By comparison, weather prediction and nuke simulations are small potatoes (or shall we say, the simulations as such are big, but the number of people engaged in weather prediction or nuke simulation is really small compared to all the supercomputing materials scientists).

    CFD, the automobile and aerospace sectors are big users.

    Electronic design.

    Seismic surveys, the oil industry uses lots and lots of supercomputers to find oil deposits.

    Biology. Gene sequencing, moldyn simulations of lipid layers and whatever.

    Climate prediction, somewhat related to weather prediction. Official purpose of the Earth Simulator.

    All of the examples above could easily use almost any amount of cpu power you can throw at them. The only thing that stands between a lot of scientists and improved understanding of the world is computing power.

  20. Re:Missing the point on Free Software Friendly Graphics Card? · · Score: 1


    That doesn't mean I wouldn't still be potentially interested in spending a little extra money on a card because I thought it was coming from a cool company.


    Fair enough.


    Did you read the post?


    Yes.


    The guy is proposing a video card with an FPGA onboard. For an interesting tinker toy, that makes it more interesting than just a dumb frame buffer.


    Yes, I noticed the FPGA thing. But I don't see why I should be interested in the fpga. If I wanted to do some FPGA programming, I could get an FPGA from lots of places. Buying a video card for the sake of its FPGA sounds a little far-fetched to me.


    Oh, and can I put that radeon in my Mac?


    Probably not the exact same radeon. ATI does sell "mac editions" of some cards, including the 9200. But OTOH, if you're willing to run a closed source OS I don't see why a graphics card with an open source driver would be such a big deal to you. But whatever floats yer boat, I guess.. ;-)

  21. Re:Missing the point on Free Software Friendly Graphics Card? · · Score: 1


    I'd like to know just what sort of 3D performance this thing will do. The OP suggests it will have some 3D features, so as long as it'll play a decent game of Quake II or Quake III, it'll probably be worthwhile at $100 US. If we are talking about a Quake I card at $300, then I wouldn't buy it.


    Uh? You are aware that for $50 you can get a radeon 9200 which has enough 3D oomph to play quake III, with 100% open source drivers?

  22. Re:Why didn't it succeed? on 30th Anniversary of Pascal · · Score: 1

    Quiche eaters use Pascal. Real Programmers use FORTRAN, or simply flick the switches on the front panel. ;-)

  23. Re:Sigh, Except for 3D Rendering on Intel Scraps Plan For 4 Ghz P4 Chip · · Score: 1


    It works better than a 2 CPU machine. Two cores are two CPUs (they couldn't be slower!), but they have the advantage of being able to have a much higher speed interconnect since it's implemented in silicon, not through logic and wiring.


    In practice a multi-core is slower than separate, because they have to share the same bus to main memory. Well, at least in my experience with the POWER4, that is.

  24. Re:I left my mission statement paperweight in the on Croquet Project Releases Initial Developer Release · · Score: 1

    It's a paradigm!

  25. Re:Meanwhile, C++ goes nowhere on Java 1.5 vs C# · · Score: 1

    Luckily Fortran, with the advent of the new Fortran 2003 standard, is stronger than ever. ;-)