Slashdot Mirror


User: flaming-opus

flaming-opus's activity in the archive.

Stories
0
Comments
368

Comments · 368

  1. Re:Time for vector processing again on IEEE Says Multicore is Bad News For Supercomputers · · Score: 1

    Back in the 90s, there were custom supercomputer processors (both vector and scalar) that were faster than desktop processors for all supercomputing tasks. This hit a wall as the desktop processors became faster than the custom processors, at least for some tasks. If you can get a processor that's faster for some tasks and slower for others, but costs 1/10th the price of the other, you're probably going to go with the cheap one. The world has petaflop computers because of the move to commodity parts. No one could afford to build 160,000-processor systems from Y-MP processors.

    btw, multi-cores are pretty terrible for desktop applications. They really excel at server transaction processing, but most desktop users haven't any use for more than 2 cores. A radical shift in programming is going to be needed before massively multi-core processors are any use to a desktop user.

  2. There are still vector processors out there. on IEEE Says Multicore is Bad News For Supercomputers · · Score: 2, Insightful

    NEC still makes the SX-9 vector system, and Cray still sells X2 blades that can be installed into their XT5 super. So vector processors are available; they just aren't very popular, mostly due to cost/flop.

    A vector processor implements an instruction set that is slightly better than a scalar processor at doing math, considerably worse than a scalar processor at branch-heavy code, but orders of magnitude better in terms of memory bandwidth. The X2, for example, has four 25-gflop cores per node, which share 64 channels of DDR2 memory. Compare that to the newest Xeons, where six 12-gflop processors share 3 channels of DDR3 memory. While the vector instruction set is well suited to using this memory bandwidth, a massively multi-core scalar processor could also make use of a 64-channel memory controller.
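
    To put rough numbers on that comparison, here is a minimal sketch that works out memory channels per gflop from the figures quoted above. It treats the six Xeon "processors" as cores on one socket, which is an assumption, and it compares channel counts only: per-channel bandwidth for DDR2 versus DDR3 is not given here, so it is left out. The `Node` class and its field names are hypothetical, purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical container for the per-node figures quoted in the comment."""
    name: str
    cores: int
    gflops_per_core: float
    memory_channels: int

    @property
    def peak_gflops(self) -> float:
        # Peak floating-point rate of everything sharing the memory channels.
        return self.cores * self.gflops_per_core

    @property
    def channels_per_gflop(self) -> float:
        # Channel count only; per-channel bandwidth is deliberately ignored.
        return self.memory_channels / self.peak_gflops


# Figures taken from the comment; the grouping into one "node" is an assumption.
x2 = Node("Cray X2 node", cores=4, gflops_per_core=25.0, memory_channels=64)
xeon = Node("6-core Xeon", cores=6, gflops_per_core=12.0, memory_channels=3)

for node in (x2, xeon):
    print(f"{node.name}: {node.peak_gflops:.0f} gflops peak, "
          f"{node.channels_per_gflop:.3f} channels per gflop")

ratio = x2.channels_per_gflop / xeon.channels_per_gflop
print(f"The X2 has about {ratio:.1f}x the memory channels per gflop")
```

    Even before accounting for per-channel speed, the gap in channels per unit of compute is better than an order of magnitude, which is the point being made about bandwidth.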

    The problem is about money. These multicore processors are coming from the server industry. Web-hosting, database-serving, and middleware-crunching jobs tend to be very cache-friendly. Occasionally they benefit from more bandwidth to real memory, but usually they just want a larger L3 cache. Cache is much less useful to supercomputing tasks, which have really large data-sets. The server-processor makers aren't going to add a 64-channel memory controller to server processors; it wouldn't do any good for their primary market, and it would cost a lot.

    Of course, you could just buy real vector processors, right? Not exactly. Many supercomputing tasks work acceptably on quad-core processors with 2 memory channels. It's not ideal, but they get along. This has put a lot of negative market pressure on the vector machines, and they are dying away again. It's not clear if Cray will make a successor to the X2, and NEC has priced itself into a tiny niche market in weather forecasting that is unapproachable by other supercomputer users for price reasons.

  3. Re:Attempt to sensationalize? on New Top 500 Supercomputer List · · Score: 1

    Right you are. The contracts for these machines were signed a couple of years ago. They might have sped things up in order to get on the top500 list, but they didn't add hardware just for a little showmanship. These labs can afford to put out a bunch of press releases related to top500, but they don't care enough about it to spend many millions of dollars.

    The list reflects the computers; the computers don't exist for the sake of the list.

  4. Not New. on AIX On the Desktop Is Getting the Boot · · Score: 1

    I think it's amazing POWER workstations lasted as long as they did. SGI quit the biz years ago, DEC is lost in the annals of history, and Sun sells workstations, but they are just PCs. The workstation market is gone, history, no more. It's been almost 10 years that an Intel/Linux box with a good gamer graphics card has been outrunning dedicated workstations costing ten times as much.

  5. Re:Looks like Cray jumped the gun... on New Top 500 Supercomputer List · · Score: 1

    The press releases from Oak Ridge and Cray (unlike the summary on Slashdot) were careful to claim Jaguar as the fastest computer for "open science". They were, no doubt, aware that Los Alamos might have bought more hardware since June.

    The machine at Los Alamos is used for classified Department of Energy projects; probably simulating nuclear warhead functionality on the aging pile of B83s and B61s sitting in the US arsenal.

    The machine at Oak Ridge gets used for unclassified research that ends up in peer-reviewed journals, on a variety of topics: stuff like climate models, fusion energy research, protein synthesis models, and cosmology. I'm sure there's a little competition between the labs, but they really have different missions, so I bet they don't put too much stock in it.

  6. Re:Looks like Cray jumped the gun... on New Top 500 Supercomputer List · · Score: 1

    I suspect Cray is much more interested in getting paid to build really complex supercomputers. Supercomputer vendors don't compete in the top500; supercomputer buyers do. Cray could build a 2 petaflop computer, or 5 or 20, if a customer came along with a large enough check. So could IBM, or HP, or SGI. It's really up to the customer.

    That said, customers are only marginally interested in getting to the top of the top500. Those computers have a job to do, and that job isn't getting to the top of some artificial benchmark that isn't all that representative of the real jobs run by the users.

    IBM's Roadrunner has a lot fewer nodes, but I'm not sure I'd say it has fewer processors. Each node has 2 Opterons and 4 Cell processors. Do you program one MPI rank on the node, or on an Opteron core + Cell, or on each Cell SPU? If you're using pure MPI programming, that's really 36 processors per node. If you use hybrid MPI/OpenMP, then you really reduce the number of MPI ranks. The hybrid approach requires a smaller order of MPI parallelism, but requires a much higher order of thread parallelism and vector parallelism. There are some codes that are really going to fly on the Cell processors of Roadrunner, and some that will crawl. Los Alamos obviously decided that the codes they care about are likely to run well on that machine. Oak Ridge, when taking bids for Jaguar, decided that more traditional processors with a lot of memory capacity and interconnect bandwidth were more suited to their codes. I'm sure that neither lab used Linpack to decide which machine to build.
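
    For anyone wondering where the 36 comes from, here is a small sketch of the rank counts under the three decompositions mentioned above. The node layout (2 Opterons, 4 Cells) is from the comment; the per-chip core counts are assumptions chosen because they reproduce the 36 figure (dual-core Opterons, 8 SPUs per Cell). The strategy names are made up for illustration.

```python
OPTERONS_PER_NODE = 2   # from the comment
CELLS_PER_NODE = 4      # from the comment
CORES_PER_OPTERON = 2   # assumption: dual-core Opterons
SPUS_PER_CELL = 8       # assumption: 8 SPUs per Cell processor


def ranks_per_node(strategy: str) -> int:
    """Return the number of MPI ranks on one node for a given decomposition.

    The strategy names are hypothetical labels for the three options
    raised in the comment.
    """
    opteron_cores = OPTERONS_PER_NODE * CORES_PER_OPTERON
    spus = CELLS_PER_NODE * SPUS_PER_CELL
    if strategy == "per_node":
        # One rank per node; all on-node parallelism comes from threads/SPUs.
        return 1
    if strategy == "per_opteron_core":
        # One rank per Opteron core, each paired with one Cell as an accelerator.
        return opteron_cores
    if strategy == "per_processing_element":
        # Pure MPI: every Opteron core and every SPU is its own rank.
        return opteron_cores + spus
    raise ValueError(f"unknown strategy: {strategy}")


for strategy in ("per_node", "per_opteron_core", "per_processing_element"):
    print(f"{strategy}: {ranks_per_node(strategy)} MPI ranks per node")
```

    The fewer ranks you run per node, the more of the parallelism has to be found inside each rank, which is the trade-off between the pure MPI and hybrid approaches.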

  7. Re:Folding@Home Contribution? on Jaguar, World's Most Powerful Supercomputer · · Score: 1

    There's no reason whatsoever to use a highly-connected, high-bandwidth HPC machine like Jaguar on distributed computing jobs. There are other very worthy jobs that can be run on such a system that can't be run on a pile of desktops all over the internet. Use the real supercomputers for real supercomputer jobs. There are plenty of idle Xboxes in the world for distributed computing.

  8. Re:Who actually wants this? on Unholy Matrimony? Microsoft and Cray · · Score: 1

    Well, if you're in a Windows shop, and you have a modest need for HPC tasks, it might be more familiar to you. They're trying to sell it into colleges, business units, etc., not into big science labs.

    Let's say you're a professor with a pretty flush grant. You want to do some modelling, but you don't want to have to build out a full HPC environment. You can probably get the college IT department to administer an 8-node Windows cluster. If not, you can probably hire an admin on the cheap, and it doesn't look much different from any other Windows box. You write a couple of MPI applications, and they get less performance than a finely-tuned Linux system might give you, but a lot better than running them on the desktop. Meanwhile the professor/grad students only have to worry about figuring out how to write MPI algorithms; they don't have to also learn how to run a Linux cluster.

    It's a different tool for a different market. Don't try to compare it to the kind of lab that runs a real Cray.

  9. Re:How far can a company sink? on Unholy Matrimony? Microsoft and Cray · · Score: 1

    It's a low-end offering for a low price. A lot of things would wipe the floor with this thing, but not for $25,000-$60,000. It's not really a Cray; it's a low-end cluster with a Cray logo slapped on the front.

  10. Re:Too dinosaurs working together. on Unholy Matrimony? Microsoft and Cray · · Score: 1

    Cray has made some terrible choices, in terms of acquisitions. At least rebadging doesn't cost anything.

    The EL was bad, but it evolved into J90 and SV1, which were both pretty decent machines. I don't know of anyone who bought an XD1, though they looked pretty compelling. I just don't think of Cray as having the sales force to sell these sorts of machines. You'd have to move a lot of them to be worth the effort.

  11. Two dinosaurs working together. on Unholy Matrimony? Microsoft and Cray · · Score: 1

    It's not that the PC manufacturers don't care, it's that they can't justify the cost, given the needs of the customers. The Cray X1 had 32 memory channels at a time when most high-end servers had 2. To do that, you need to have a lot of pins coming out of the processor die, a lot of traces through the motherboard, and a lot of sockets full of memory sticks. As a result, you pay $100,000 per board. It's not just about engineering costs, it's also about a really high unit cost.

    PC makers aren't interested in more memory channels, as they increase the cost of processors, motherboards, and of filling the memory slots with DIMMs. For most PC applications, the best way to improve memory bandwidth is to increase the size of the L2/L3 caches, so that more of the data is in high-bandwidth on-die memory. Not so much with HPC applications.

    There is one area in a PC where real memory bandwidth matters, and matters a lot: GPUs. The high-end gamer cards are pushing the memory envelope, at least a little. GDDR5 has quite a bit better bandwidth than the DDR we see on CPUs, and much wider buses. You still can't use graphics cards for HPC apps, as they don't support 64-bit floating point math. When they do, however, I think there might be some clever ways to use all that bandwidth, much like an old Cray vector machine. Soon is what I'm hearing.

  12. Re:You're 180 degrees off. on Unholy Matrimony? Microsoft and Cray · · Score: 1

    Well, the CX1 thing looks like it's only 8 nodes, so it's a pretty tiny cluster. With that few parts, I bet the mean time to failure is a little better than a big system.

    The shuttle runs a radiation-hardened version of the IBM System/360 mainframe processor from the late 60s. As for an operating system, I'm not sure it has an operating system, exactly. I imagine it runs 1 and only 1 application.

  13. Re:You're 180 degrees off. on Unholy Matrimony? Microsoft and Cray · · Score: 1

    The topology of the network is not really what's most important, though it can have a huge impact on price. Also, InfiniBand can be really expensive. I'm not sure that Cray SeaStar is really any more expensive than InfiniBand.

    Some thoughts on topologies: Fat trees are nice in terms of global bandwidth, but the cost can be very high, depending on the radix of the router, the cost of the router ports as compared to the cost of the end-points, and the speed of the cabling. Fat trees can create some very difficult networks to wire, as higher-level routers can be quite physically distant from one another. The longer the wires, the more expensive they are, and the slower you have to clock your carrier signal.

    Torus networks, like Cray SeaStar, are really a relic from days when a 7-port router was the only cost-effective thing you could build. They're really simple to wire, and the longest cable only has to span the distance between cabinets.
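
    As an illustration of why a torus gets away with such a small router, here is a rough sketch of neighbor lookup in a 3D torus. It assumes a 3D torus and reads the "7-port router" as six network links plus one link to the local node; both are my interpretation, not something stated above. The function name and the 4x4x4 dimensions are hypothetical.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Return the six directly connected neighbors of a node in a 3D torus.

    Each axis contributes a -1 and a +1 neighbor, and the modulo makes the
    edges wrap around, which is what turns a mesh into a torus.
    """
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(node)
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append((coord[0], coord[1], coord[2]))
    return neighbors


dims = (4, 4, 4)            # hypothetical machine size
corner = (0, 0, 0)
links = torus_neighbors(corner, dims)

# Assumption: one extra port connects the router to its own compute node.
ROUTER_PORTS = len(links) + 1

print(f"neighbors of {corner}: {links}")
print(f"router ports needed: {ROUTER_PORTS}")
```

    The port count stays the same no matter how large `dims` gets, and every link goes to an adjacent position, which is why the wiring stays simple as the machine grows.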

    If you look at some of the journal papers coming from Cray and their partners in academia, it looks like they may be heading in the direction of a "flattened butterfly" topology, which looks like a very nice compromise between price and scalable bandwidth.

  14. You're 180 degrees off. on Unholy Matrimony? Microsoft and Cray · · Score: 1

    Actually, their prices have been steadily declining. Due to competition, I suppose.

    The big HPC customers are actually increasing the size of their machines much FASTER than Moore's law. Thus the number of processors in a supercomputer is growing, and growing rapidly. With MPI jobs of 100,000 cores, the demands on the interconnect go up, and the difference between a really scalable interconnect and commodity clusters becomes more obvious.

    Also, the big HPC companies (Cray, NEC, IBM) make their money selling the hardware, but a lot of what they do is software. It's one thing to build a computer that can perform a quadrillion floating point operations per second; making it actually usable, and sorta stable, is another thing entirely.

  15. Re:Too dinosaurs working together. on Unholy Matrimony? Microsoft and Cray · · Score: 5, Interesting

    I disagree, but then again, I work in the HPC industry.

    1. Standard computers have already taken over all of those jobs that used to require a supercomputer. There's no more market to lose. HPC is a 6-7 billion dollar market. The TAM is growing slower than the rest of the IT industry, but it's still a large niche market.

    2. Clusters got really popular for a few years, but have really fallen out of favor at the high end of the HPC market. That said, the difference between a high-end super, and a cluster, is rather small. Thankfully the price difference is shrinking too. Moreover, this product IS a cluster. It looks like an attempt, by Cray, to get into the low end of the HPC market. Cray, like everyone else, would like to be the company taking market share away from itself, rather than let someone else take it.

    3. IBM has a compelling strategy of reusing their high-end POWER-X processor super-servers, and selling them as supercomputers. The problem with this is that they are obscenely expensive as supercomputers. A high-end database server has a whole pile of functionality that is completely unnecessary for HPC jobs, both in hardware and in software. Big iron servers are also WAY more expensive, per-processor, than a super. As such, IBM is also making supers out of commodity clusters, commodity clusters with CELL coprocessors, and BlueGene, which is much closer to Cray XT than it is to an IBM mainframe or superserver. I would argue that IBM's diversity may work against it in the HPC market, as it tries to fit a round peg into a square hole.

    I'm not sure Cray will be very successful with this CX1 product, or generally, selling to the low-end HPC market. That, however, is no reason to believe that there is no need for vendors specialized in HPC systems. Cray has made quite a comeback in the last few years. The reason one thinks of Cray as a dinosaur is that the HPC market is so much smaller now, relative to the entire IT industry, compared to the 1980s. Nonetheless, it's still an important niche.

  16. Re:Memory Bandwidth on IBM's Eight-Core, 4-GHz Power7 Chip · · Score: 1

    Well, in the really sophisticated clusters like Cray or NEC MPPs, you can actually open up a window of memory on a node, and remote nodes can load/store directly to that memory. Obviously it's slower than local memory, but you can program it like a shared memory map.

    IBM, however, prefers the hybrid MPI/OpenMP approach for their clustered SMP constellations.

  17. Almost on "Intrepid" Supercomputer Fastest In the World · · Score: 1

    "It's [google's data center] good at what it does, and what it does is very important commercially, but that doesn't earn it a space on this list."

    This is the only false statement in your posting. Google's data centers are, in fact, a huge pile of Intel/AMD processors connected with a couple of lanes of gigabit Ethernet. True, they are not designed for HPC, and therefore cannot compete with real supercomputers on REAL HPC applications. However, the top500 list is generated using Linpack. Linpack is a terrible representation of performance on real HPC applications. Linpack almost exclusively rewards FP ALU throughput, scales almost perfectly on multicores, requires very little of the interconnect, and has very modest memory bandwidth needs. Linpack is about the only HPC application that would work on Google's data center. I bet they could put together some pretty good scores if they wanted to, but those machines are too busy making money to run silly benchmarks.

    Otherwise, you're spot on, though it would help if you'd take the chip off your shoulder.

  18. real measure on "Intrepid" Supercomputer Fastest In the World · · Score: 2, Insightful

    Well, the real measure of fastest computer has a lot to do with what software you want to run on it. In the example of the top500 list, Linpack scales almost perfectly as you add processor cores, and makes very limited demands of network speed, memory bandwidth, or single-processor performance. Other codes really can't scale past 16 processors, so these massive processor jumbles don't amount to a hill of beans.

    Most codes are somewhere between. As the machine gets larger, the more effort has to be put in designing the software to actually use all the hardware.
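
    One simple way to picture that spread is Amdahl's law, which I'm bringing in here as a textbook model rather than anything measured on these machines. The sketch below assumes a fixed problem size and ignores communication cost entirely; the serial fractions are hypothetical values picked to show a near-perfect scaler next to a code that stalls early.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Ideal speedup under Amdahl's law for a fixed-size problem.

    serial_fraction is the share of runtime that cannot be parallelized.
    Communication and memory effects are ignored, so real codes do worse.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


# Hypothetical serial fractions: 0.0 scales perfectly, 0.05 stalls early.
for serial_fraction in (0.0, 0.001, 0.05):
    for processors in (16, 1024, 100_000):
        speedup = amdahl_speedup(serial_fraction, processors)
        print(f"serial={serial_fraction:<6} procs={processors:<7} "
              f"speedup={speedup:,.1f}")
```

    With even 5% of the work stuck in serial code, the speedup can never reach 20 no matter how many processors you throw at it, so most of a 100,000-core machine would sit idle.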

  19. Re:big iron on Cell-based "Roadrunner" Tops Elusive Petaflop Mark · · Score: 1

    Yes, that's the other thing about supercomputers: they have to compete with commodity clusters, so they are quite inexpensive compared to big servers. You'll notice that IBM sells many lines of technical computers, including Opteron clusters, Cell-based accelerators, Blue Gene, and also the POWER6 clusters. The POWER6 clusters are essentially a bunch of big-iron Unix servers ganged together on an interconnect. They work pretty well, but are very expensive.

    It's not really vector ops that make the difference. The big MPP systems like Blue Gene, Cray XT, and SGI Altix don't use vector processors either, but they scale to many thousands of processors, not a few dozen.

  20. big iron on Cell-based "Roadrunner" Tops Elusive Petaflop Mark · · Score: 1

    From the standpoint of supercomputers, the big SMP systems from Sun/IBM/HP are not big-iron. They are baby sized. They are also machines designed differently, to solve a different set of problems.

    The big Sun/HP servers are designed to host enterprise-sized databases and supply-chain / business-intelligence / operations server jobs. They are generally highly parallel transaction processors, not running parallel compute tasks. This doesn't make them easier to design or build, far from it, but it does mean that the requirements are different. You will notice that Linux is just now making inroads onto the really large business SMPs. These vendors, as well as the PeopleSoft/SAP/Oracle/Novell/WebSphere/etc. types, have spent decades and billions of dollars developing software that will efficiently use those monster machines. They're pretty amazing systems, but they are not supercomputers.

  21. difficult to program. on Cell-based "Roadrunner" Tops Elusive Petaflop Mark · · Score: 1

    Yes, this machine is difficult to program, but that is true of all capability-class supercomputers. This one will be a little bit more difficult than others, insofar as it is not a simple evolution of an existing design. However, even when you upgrade from 1000 nodes of POWER5 to 2000 nodes of POWER6, you still have to do a lot of tuning to your codes to get them to run well. That's the rule, not the exception, in large supercomputer installations.

    This machine is very expensive. I would suspect that more traditional designs could hit a petaflop for less money. They must believe that this programming model will perform well for several of their critical codes, or they would not have spent so much money.

    Furthermore, since the machine is so expensive, and will be available for open research for only a limited time, one can assume they are only going to run a handful of codes on it. They have probably already taken the time to optimise those codes for the weird architecture.

  22. what drives are for. on Sun Adding Flash Storage to Most of Its Servers · · Score: 2, Insightful

    You're confusing two very different sorts of storage. There is bulk data storage. This is a fileserver for home directories, video archives, piles of email, that sort of stuff. This is the market where the 1TB SAS drive thrives. Then there's the database backing store. Almost every customer I've sold to wants a huge number of very fast, very small drives for database backing store. The extra capacity is meaningless, as they have to use so many spindles to get decent IOPS performance. In this area, selling drives hasn't been about capacity for 10 years. IOPS, in particular read IOPS, is your throttle point for these. Now that flash drives are beginning to get traction for high-end laptops, and we have affordable SSD drives with industry-standard interfaces, there's no reason NOT to use them.
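
    The spindle arithmetic can be sketched like this. Every number below is hypothetical, picked only to show the shape of the problem: the per-drive IOPS, the drive capacity, the IOPS target, and the database size are not figures from any real deployment.

```python
import math


def drives_for_iops(target_iops: int, iops_per_drive: int) -> int:
    """Minimum number of drives needed to reach a random-IOPS target."""
    return math.ceil(target_iops / iops_per_drive)


# All values below are hypothetical, for illustration only.
TARGET_IOPS = 20_000       # what the database needs
IOPS_PER_DISK = 180        # one fast spinning drive
DISK_CAPACITY_GB = 73      # a "very fast, very small" drive
DATABASE_SIZE_GB = 500     # data actually stored

disks = drives_for_iops(TARGET_IOPS, IOPS_PER_DISK)
capacity_gb = disks * DISK_CAPACITY_GB

print(f"drives needed for IOPS: {disks}")
print(f"capacity bought: {capacity_gb} GB for a {DATABASE_SIZE_GB} GB database")
print(f"capacity overshoot: {capacity_gb / DATABASE_SIZE_GB:.1f}x")
```

    The drive count is set entirely by the IOPS target, so the capacity comes along whether you need it or not. Any device with far more IOPS per unit cuts the count directly, which is the case for flash.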

    Also, fibre channel drives already cost $1000, so paying this much is nothing new for enterprise customers. An enterprise server with LESS than $50,000 of storage would be the oddball case.

  23. Re:More likely NSA on Cray, Intel To Partner On Hybrid Supercomputer · · Score: 1

    Possibly, though I doubt it. The big supercomputer makers out there (IBM, NEC, SGI, Cray, HP, Sun) can't tell you that they are selling a computer to the intelligence community, but they do tell their shareholders how many systems they are selling. If you subtract all the non-secret systems, you're left with all the secret systems, or at least the revenue from those systems. You can't tell exactly how large the systems are, nor exactly what they look like, but you can tell approximately how much they are spending on big supercomputers. Of course, that doesn't include any specialized, custom computers they build for themselves.

    I don't know exactly what the NSA does with computers, but if it's anything like spying on all our phone-calls, I imagine they need data-mining more than floating-point number-crunching. I think they probably spend more on storage, and big database servers than they do on supercomputers. That is, however, purely speculation.

  24. Department of Energy on Cray, Intel To Partner On Hybrid Supercomputer · · Score: 2, Informative

    DOE, which does the US nuclear weapons simulations, is probably the largest single buyer of capability-class supercomputers, but still a small fraction of the total. Even within DOE, only a large minority of systems are dedicated to nuke simulation. Sandia, Livermore, and Los Alamos all have 2-3 large nuclear simulation machines each (or at least that's all they will admit to publicly). Large systems at Pacific Northwest, Oak Ridge, Lawrence Berkeley and Argonne are used for open science research.

    High-end supercomputers are used, in significant ways, for climate research, short-term weather forecasts, seismic modeling, cosmology, fusion research, protein folding, predicting the size of petroleum deposits, automotive and aircraft designs, and a host of other engineering codes. Even with that stated, the piece of the pie chart labelled "other" is 35% of the total.

    On the other hand, nuclear weapons simulation is a difficult enough problem, and requires a powerful enough machine, that it subsidizes the design of super-scalable machines that are then sold to other customers for other tasks.

  25. Re:Hmmm.. on Tech That Will Save Our Species - Solar Thermal Power · · Score: 1

    It seems very unlikely that the demand and price of oil will go down, even with billions of dollars of investment in alternative fuels. At best, I suspect we can decrease the rate at which the price increases.