flaming-opus · Slashdot Mirror

Re:Irony emulator on NASA To Get 10,240 Node Itanium 2 Linux Cluster · 2004-08-09 05:16 · Score: 1

If it ain't broke, don't fix it. The only way to have a device that's very, very reliable is to keep it very, very simple.

The shuttle's computer, btw, is a scaled-back version of the IBM S/370 mainframe processor.

speed vs. Correctness on Windows Accelerators - Do They Really Work? · 2004-08-02 13:27 · Score: 4, Insightful

Let's just pick filesystems and buffer caching as one area of an operating system that can be tweaked to show some phenomenal performance gains. If you remove all the synchronous I/O requests made by a filesystem, you can improve performance on the slowest operations by orders of magnitude. However, watch out if you lose power in the middle of extent allocation and end up writing binary file data over the top of the root directory.
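To make the trade-off concrete, here is a minimal sketch at the application level rather than inside a filesystem. The function names are made up for illustration; `os.fsync` is the standard call that asks the OS to push a file's data to stable storage. The first version is the "fast" one, the second is the one that survives a power cut.

```python
import os
import tempfile


def write_fast(path, data):
    # Buffered write: returns once the data is in the OS cache.
    # A power loss shortly afterwards may leave nothing on disk.
    with open(path, "wb") as f:
        f.write(data)


def write_durable(path, data):
    # Flush Python's own buffer, then ask the OS to force the data
    # to stable storage before returning. Slower, but safer.
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
```

Both calls produce the same file contents when nothing goes wrong, which is exactly why dropping the synchronous step looks like a free speedup in a benchmark.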

You can short circuit a lot of semaphores in the OS and speed up any operations that require concurrency. It'll work most of the time, and trash your data 2% of the time. If you don't need correct behavior, speed can be had more easily.
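A rough user-space analogy, not OS code: a shared counter updated by several threads. The helper name `run_counter` is invented for this sketch. With the lock the result is always correct; without it the code is faster and may or may not lose updates, depending on the interpreter and the timing.

```python
import threading


def run_counter(n_threads, n_increments, use_lock):
    counter = {"value": 0}
    lock = threading.Lock()

    def worker():
        for _ in range(n_increments):
            if use_lock:
                # Correct: the read-modify-write is serialized.
                with lock:
                    counter["value"] += 1
            else:
                # "Short-circuited": no locking, so concurrent
                # updates can overwrite each other.
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]
```

`run_counter(4, 10000, True)` always returns 40000. The unlocked variant gives no such guarantee, which is the "works most of the time" behavior described above.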

That said, Windows is built to run decently on some pretty odd hardware. If you strip out all the unnecessary drivers, and set up some better config defaults for your hardware, you can make some big gains. By setting memory zone preallocation, default filesystem allocation size, and maximum table lengths, I'm sure you could easily add 75% to your performance ON AVERAGE. I am, however, extremely skeptical of any claims about game frame-rates. Games interact with the OS minimally, and are mostly hardware bound.

-my $.02

Re:The classic supercomputer is the modern desktop on On the Supercomputer Technology Crisis · 2004-07-29 09:59 · Score: 1

The G5 is not really a vector processor. It allows you to use a 128-bit register as 4 32-bit registers, or 16 8-bit registers. This allows one to do more math with fewer instructions. This is cool for multimedia apps, but it's different from real vectors.
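As an illustration of the "one wide register, several narrow lanes" idea, here is a small software model using a Python integer as the 128-bit register. This is only a sketch of the concept, not how the G5's instructions are actually programmed; the names `pack`, `unpack`, and `simd_add` are made up for the example.

```python
LANE_BITS = 32
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def pack(lanes):
    # Pack four 32-bit values into one 128-bit integer,
    # with lane 0 in the lowest bits.
    reg = 0
    for i, value in enumerate(lanes):
        reg |= (value & LANE_MASK) << (i * LANE_BITS)
    return reg


def unpack(reg):
    # Split the 128-bit integer back into its four 32-bit lanes.
    return [(reg >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def simd_add(a, b):
    # One "instruction": add all four lanes at once. Each lane wraps
    # around on overflow and carries never cross into a neighbor.
    return pack([(x + y) & LANE_MASK for x, y in zip(unpack(a), unpack(b))])
```

For example, `unpack(simd_add(pack([1, 2, 3, 0xFFFFFFFF]), pack([10, 20, 30, 1])))` gives `[11, 22, 33, 0]`: four additions, one operation, and the overflow in the last lane stays in that lane.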

A real vector architecture is not done that way to do math more quickly (though it's pretty good at that), but rather to isolate the CPU from the ridiculous latency of memory. Since the op V1 = V2 + V3 takes many many cycles to complete, the load that's necessary for the following operation has time to get something from memory. The vector guys found that throwing more transistors and more wires adds to the bandwidth, but it's a lot harder to reduce latency, so they hid it instead.
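A back-of-the-envelope model of the latency-hiding argument. It is deliberately crude: the scalar case assumes every load waits out the full round trip with no cache hits or overlap, which real scalar CPUs avoid some of the time. The function names and the cycle counts in the example are assumptions for illustration.

```python
def scalar_load_cycles(n_elements, memory_latency):
    # Worst case for a scalar core: each load pays the full
    # round-trip latency before the next one is issued.
    return n_elements * memory_latency


def vector_load_cycles(n_elements, memory_latency):
    # One vector load: pay the latency once, then the elements
    # stream in at one per cycle.
    return memory_latency + n_elements
```

With a hypothetical 100-cycle memory latency and a 64-element vector, the scalar worst case is 6400 cycles against 164 for the vector load. The latency has not gone away; it has been paid once instead of 64 times.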

Re:How much did they pay for this thing? on SGI & NASA Plan 10240-Processor Altix Cluster · 2004-07-28 06:46 · Score: 2, Interesting

I doubt Intel is pumping a lot of cash into SGI, but they may have cut them a real deal on the chips. When I first read about this computer I thought "what a coup for SGI." Then I read the dollar mark and thought "what a coup for NASA." $45 million including storage and fibre channel? That's less than $2 million apiece for those 512-processor Altix boxes. They're not making much margin on those.

To counter all of their detractors, Itanium 2s are pretty hot processors, and SGI has done an amazing job getting Linux to run well on 500 processors. This will no doubt be one hell of a fast machine. I'm just amazed that SGI can stay in business selling at this price.

If anyone has been watching supercomputers lately, you might have noticed that Cray is downsizing by 20% after selling 0 computers in their most recent quarter. It's a tough market for supercomputer makers. The clusters are good-enough-and-cheap. This box is being sold at very little more than cluster cost. Ick.

Re:I hope he's right on Linux vs. Windows: What's The Difference? · 2004-07-02 02:29 · Score: 1

Well, that's largely independent of the article's point. He's talking about the features of the kernel. It turns out that the kernels are now very similar, at least in supported features. They both are monolithic, preemptible, reentrant code with loadable kernel modules. They both have kernel-mode networking interfaces, switchable and loadable filesystems, synchronous and asynchronous I/O, NUMA-aware memory allocation, and medium-grained kernel locking. They both will run well on 8-processor SMPs, and will hobble along on 32 processors.

Having written filesystems and device drivers on both systems, I would have to say that it's a lot more difficult on Windows, but the development and debugging tools are better. (Microsoft is not always forthcoming about how things work. If MS had a shared-source, or even read-only-open-source, license, a lot of the difficulty would go away.) All in all, why would the features of the OS be very different? If you start adding too much to the kernel, you probably should start asking if the kernel is the right place for all those features. The difference is now down to refining the implementations.

As for coming up with wizards and familiar interfaces: It's not hard, it's just a lot of work.

what temperature, what topography. on EPA Fuel Economy Myth: Too High, Too Low? · 2004-06-29 14:43 · Score: 1

If you don't ever have to go up and down hills, and always drive between 45 and 85 degrees Fahrenheit, that's going to make a big difference in your mileage. Tire pressure too.

Re:I don't follow the numbers on Army Contractor To Build A 1566 Xserve Cluster · 2004-06-22 02:30 · Score: 1

They will need more than just embarrassingly parallel problems. Those peak scores are generated with codes that can fit their data into the L2 cache, and can fit the entire executable into the L1 instruction cache. They also heavily rely on almost everything being pushed through the fused-multiply-add (counts as 2 fp ops in linpack) functionality of the 970's FP pipeline.

The 970 has very good memory bandwidth for a scalar CPU, but it's nothing compared to cache speed. Parallelization issues are not the only things that keep a CPU from running at peak speed.
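For reference, this is roughly how a peak number of that kind is put together. The 2 flops per fused multiply-add comes from the comment above and the 1566-node count from the thread title; the two FP units per CPU, the 2.0 GHz clock, and the two CPUs per node are assumptions made for this sketch, not figures from the post.

```python
def peak_gflops_per_cpu(clock_ghz, fp_units, flops_per_fma=2):
    # Peak assumes every FP unit retires one fused multiply-add
    # (counted as 2 floating-point ops) on every single cycle.
    return clock_ghz * fp_units * flops_per_fma


def peak_gflops_cluster(nodes, cpus_per_node, per_cpu_gflops):
    # Cluster peak is just the per-CPU peak multiplied out;
    # it says nothing about memory or interconnect.
    return nodes * cpus_per_node * per_cpu_gflops


# Assumed figures: 2.0 GHz clock, 2 FP units per CPU, 2 CPUs per node.
per_cpu = peak_gflops_per_cpu(2.0, 2)
cluster = peak_gflops_cluster(1566, 2, per_cpu)
```

Under those assumptions the per-CPU peak is 8 gigaflops and the cluster peak is about 25 teraflops. Any cycle where the data is not already in cache, or the operation is not a multiply-add, pulls the real figure below that.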

Re:IBM's Blue Gene on Top 500 Supercomputer List Released · 2004-06-21 02:03 · Score: 5, Informative

Did they mention why myrinet and infiniband are heat sensitive? I've used myrinet before, and did not encounter any problems with it, though I was not using 1U dual-CPU systems. (just a bad idea in general) A myrinet card includes a pretty high-clocked ASIC that runs warm for a network card, but is nothing compared to most graphics cards these days.

Blue Gene is an amazingly simple and crafty design, with efficiency at its heart. I'm not sure that it will be as successful as the IBM marketing machine claims it will, but it's exciting nonetheless.

The trend in CPUs, over the last ten years or so, has been to maximally fill long, wide super-scalar pipelines. The Power4 has half a dozen execution units and a 15-stage pipeline, running at 1.7 GHz. To keep that full, one has to have exceptional branch prediction, huge caches, superb compilers, and tons of memory bandwidth.

The Blue Gene approach is to have fewer, shallower, lower-clocked pipelines, but lots of CPUs. Their peak speed is a quarter of the top CPU designs, but their real speed is half of the big guns. Since they are using today's chip technology to implement yesterday's chip designs, they use little power, and are very inexpensive. Since IBM has cleverly integrated all the communications networks and memory controllers, you only need three components in the system: CPUs, RAM chips, and passive circuit boards - plastic and copper. (Yeah, I'm sure there is other stuff, but not much)

The design is not revolutionary; it's a fairly intuitive evolution of the Paragon, or the T3E. This sort of system may not be perfect for every task, but will excel at the sorts of tasks that already work well on big clusters. That, and it will likely be very cost effective.

Re:Linux clusters still rule on Top 500 Supercomputer List Released · 2004-06-21 01:41 · Score: 2, Interesting

Well, in a supercomputer OS, you really only have two choices. You can create a microkernel OS that runs on all the computation nodes, and does system calls to service nodes.

Or you cluster together a bunch of monolithic kernels. At 8000 processors you aren't going to be able to use 1 monolithic kernel, so the distinction between a medium scalable OS like Linux and a large scalable OS like Solaris/Irix is a bit of a moot point. 1000 OS images instead of 250? It's a nuisance either way.

Re:What I find interesting... on Top 500 Supercomputer List Released · 2004-06-21 01:35 · Score: 5, Insightful

I did some contract work at ILM several years ago, and know why this is. They don't use one big machine, but rather a bunch of medium sized clusters. This is for a very good reason. Weta has, thus far, worked on one big movie at a time, where all of their resources are dedicated to a single data set. ILM is constantly working on half a dozen movies all at once.

In essence, they lease some amount of resources to a particular movie studio for some number of months. At the time they were doing this with row upon row of 32 processor SGIs, but they are probably using something else these days. Thus no spot on the top500 list. However, since they are in the business of making movies, I bet they don't really care.

Re:How do they measure? on Top 500 Supercomputer List Released · 2004-06-21 01:28 · Score: 1, Interesting

They measure with linpack, which only measures processor computational performance, but ignores memory, interconnect, and I/O performance.

This is why the US government uses the HPC Challenge benchmark, in which Linpack is only one measure among eight.

Re:Disk Fragmentation on Measuring Fragmentation in HFS+ · 2004-05-19 08:28 · Score: 1

These are two forms of the same problem. Most people here have been talking about a fragmented file, which slows access to an existing file. You're talking about fragmented free space, which slows creation of a new file (or adding new data to an existing file). The problem with fragmentation avoidance schemes is that they tend to fix file fragmentation at the expense of free-space fragmentation.

For most users that's a good trade-off, as reads tend to dominate writes by a factor of 5.
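The two kinds of fragmentation can be measured the same way: count contiguous runs (extents) on the disk. Below is a toy model where the disk is a list with one entry per block, holding the owning file's name or `None` for a free block. The layout and the function name are invented for illustration.

```python
def count_extents(block_map, owner):
    # Count contiguous runs of blocks belonging to `owner`.
    # Passing owner=None counts runs of free space instead.
    extents = 0
    previous = object()  # sentinel that matches no owner, not even None
    for current in block_map:
        if current == owner and previous != owner:
            extents += 1
        previous = current
    return extents


# Hypothetical 9-block disk: file A is split in two, file B is contiguous.
disk = ["A", "A", None, "B", "B", None, None, "A", None]
```

Here `count_extents(disk, "A")` is 2 (a fragmented file), `count_extents(disk, "B")` is 1, and `count_extents(disk, None)` is 3 (fragmented free space). An allocator that moves A's last block next to its first two fixes the file but can leave even smaller holes behind.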

Re:cray and fast computing -- I don't think so on World's Fastest Supercomputer To Be Built At ORNL · 2004-05-12 07:23 · Score: 2, Informative

well, 0-4 are all true.

Comparing this to early Crays is a little difficult though. For the early Crays one advantage was vectors and the other was pipelines.

Vector processors are cool, because they tend to be much more tolerant of the latency. You issue a load command, and it does loads until the vector register is full. That is equivalent to dozens of loads (and dozens of round-trip latencies to memory) on a scalar architecture. The same thing applies to the execution units. You tell the CPU ADD R1 R2 R3, and it pumps the first elements of the R2 and R3 registers through the ALUs and into R1, and keeps working until it gets through all of the elements in the vector. Later models supported chaining, which allowed the output from one of these operations to feed into the input of another operation. Vector CPUs are very good at keeping the ALUs busy.
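A simplified cycle-count model of why chaining helps. It assumes each functional unit has a fixed startup (pipeline fill) cost and then produces one element per cycle; the startup values in the example are hypothetical, not taken from any Cray manual.

```python
def unchained_cycles(n_elements, startup_a, startup_b):
    # Without chaining, the second operation waits for the first
    # to finish the whole vector before it can begin.
    return (startup_a + n_elements) + (startup_b + n_elements)


def chained_cycles(n_elements, startup_a, startup_b):
    # With chaining, each result element is forwarded to the second
    # unit as soon as it appears, so the two operations overlap.
    return startup_a + startup_b + n_elements
```

For a 64-element vector with assumed startup costs of 6 and 7 cycles, the unchained pair takes 141 cycles and the chained pair takes 77, with both units busy for most of that time.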

The other advantage of the early Crays was pipelining. YMP designs, for example, had multiple integer, FP, load/store, and reciprocal divide units. All of these (and the dispatch unit) were pipelined, allowing a much higher clock rate than traditional designs. Multi-pipeline designs are now the norm (PowerPC, Pentium, MIPS, etc.) but were pretty amazing at the time.

The cooling, incidentally, was necessary at any clock rate. Early Crays (well, right on through to the T90) used bipolar transistors, rather than CMOS. In this sort of logic you switch current rather than switching voltage. The net result is that the early Crays used a TON of electricity and needed massive cooling systems.

Re:good stuff on World's Fastest Supercomputer To Be Built At ORNL · 2004-05-12 06:58 · Score: 1

Well, that may not be true. The $5 million price tag for the VT cluster is only for the hardware. A lot of their people costs were not exactly factored into a lot of the press coverage. Furthermore, ORNL can't pay grad students a $15K/year stipend to administer the machine. No matter what they buy, they need admins who are capable, well paid, and have security clearance. On big cluster systems, the lifetime labor costs often rival the initial hardware cost in total dollars.

Last I heard, the VT cluster had done little or no real computation, just benchmarks and tinkering. While other sites have put clusters to work on real problems, they tend to have a lot of system down-time for maintenance. Compare that to the Cray T3Es (predecessor to the X1), which average better than 90% utilization.

Clusters are cheap, but you get what you pay for.

Re:Neither do regular cars on Hybrid Cars Don't Live Up to Mileage Claims · 2004-05-12 03:45 · Score: 4, Interesting

Absolutely correct. If you accelerate very slowly, keep that engine running at low RPMs, only drive on flat surfaces, and coast whenever possible, then you might approach the published numbers. My car is rated 24/28 or something. Realistically I average about 23-24 with mostly highway driving. I think most consumers are aware of the extreme optimism of those numbers on any type of vehicle.

Re:Huh? on World's Fastest Supercomputer To Be Built At ORNL · 2004-05-12 03:16 · Score: 2, Informative

The important part of the statement is "Sustaining". There are a lot of computers out there on the top500 list that get peak numbers way ahead of their sustained numbers. An Army research center (www.arc.umn.edu) published a comparison of a Xeon cluster and the X1. For their codes (weather simulation, material sciences, air flow, etc.) the Xeons' sustained performance was 5% of peak. The Cray was about 30% of peak. (This is probably due to the really awesome memory bandwidth of the Cray.)

You're correct that these are just numbers, so let's talk about a real problem. The AHPCRC reported that a 32-processor Cray X1 (peak 400 gigaflops, 66 gigaflops realized) was able to simulate a weather model of the entire US with 33 vertical levels at 5 km resolution in just under 2 hours. Today these models are done at 10 km resolution with 20 levels. If you take this theoretical ORNL system and assume (peak 60-80 TF, 40 sustained on easy codes, 15 sustained on hard codes) then they might do a 2 km simulation with 45 layers in 1 hour.
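To get a feel for why finer grids need so much more machine, here is a rough scaling sketch. The cost model is an assumption made for illustration, not how MM5 or any real weather code is costed: work is taken to grow with the number of horizontal grid points, the number of vertical levels, and a time step that shrinks in proportion to the grid spacing.

```python
def relative_cost(dx_km, levels, ref_dx_km=10.0, ref_levels=20):
    # Work relative to the reference run (10 km, 20 levels).
    horizontal = (ref_dx_km / dx_km) ** 2   # grid points in x and y
    vertical = levels / ref_levels          # more vertical levels
    timestep = ref_dx_km / dx_km            # assumed: smaller cells need smaller steps
    return horizontal * vertical * timestep
```

Under this model `relative_cost(5, 33)` is about 13 times the work of the 10 km, 20-level run, and `relative_cost(2, 45)` is about 281 times. Doing the latter in half the wall-clock time would take a machine far more than ten times faster than the 32-processor X1, which is roughly the jump being described.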

Re:Cray X1 OS is.. on World's Fastest Supercomputer To Be Built At ORNL · 2004-05-12 02:57 · Score: 1

I guess. It's right in the user manual, which is published on the website. They don't run around with Irix pom-poms and wave little Irix flags, but they aren't shy about it.

Cray is a company that sells to huge research labs and Fortune 500 companies. Just because they don't appear on TomsHardware, or do interviews for /., doesn't mean they aren't saying anything about it. "Know your audience", and all that.

Re:Fighting the temptation ... on World's Fastest Supercomputer To Be Built At ORNL · 2004-05-12 02:49 · Score: 3, Informative

The SGI Altix runs a hacked up version of Linux that's part 2.4 with a lot of backported 2.6 stuff as well as the Irix SCSI layer. They are migrating to a pure 2.6 OS soon. The IBM system runs AIX 5.2. The Cray runs Unicos, which is a derivative of Irix 6.5, though they seem to be moving to Linux also. I'm gonna guess that they run TotalView as their debugger. They use DFS as their network filesystem. They have published plans to hook all these systems up to the StorNext filesystem, which does Hierarchical Storage Management. MPI and PVM are likely important libraries for a lot of their apps.

For these sorts of machines, one can buy utilities for data migration, backup, debugging, etc. However, the production code is written in-house, and that's the way they want it. Weather forecasting, for example, uses software called MM5, which has been evolving since the Cray-2 days, at least. A lot of this code is passed around between research facilities. It's not open source exactly, but the DOD plays nice with the DOE, etc.

The basic algorithms have been around for a long time. In the early 90's, when MPPs and then clusters came onto the scene, a lot of work was done in structuring the codes to run on a large number of processors. Sometimes this works better than other times. Most of the work isn't in writing the code, but rather in optimizing it. Trying to minimize the synchronous communication between nodes is of great importance.

Re:Cray X1.. What role do IBM and SGI have? on World's Fastest Supercomputer To Be Built At ORNL · 2004-05-12 02:39 · Score: 2, Informative

ORNL already has a 256-processor X1, a large IBM SP made of p690s, as well as a large SGI Altix. I imagine the 50 Tflops number will be a combined system with upgraded systems of all three types. They are obviously impressed with both the X1 and the Altix. The IBMs are no slouches though, and they are upgrading the interconnect, and IBM is just getting ready to launch a Power5 update.

It's probably just spin to call the project "A computer", rather than "several computers". Deep in one of those ORNL whitepapers you see that they are planning to cluster together these three machines with a cluster filesystem. You throw in a clustered batch control system and you can kinda call it "A" supercomputer. Really it's a cluster, except each of the nodes may have a thousand processors. We'll have to wait and see what it really looks like.

Re:They better hurry ... on World's Fastest Supercomputer To Be Built At ORNL · 2004-05-12 02:08 · Score: 4, Informative

Two radically different designs will probably solve very different sorts of problems. Linpack is extremely good at giving a computer an impressive number. It's the sort of problem that fills up execution pipelines to their maximum. Blue Gene was originally designed to do protein-folding calculations. While many other tasks will work well on that machine, others will work very poorly.

It's a mesh of a LOT of microcontroller-class processors. The theory being that these processors give you the best performance per transistor. Thus you can run them at a moderate clock, get decent performance out of them, and cram a whole hell of a lot of them into a cabinet. It's a cool design, I'm interested to see what it will be able to do, once deployed. However, for the problems they have at ORNL, I'm sure the X1 was a better machine. Otherwise they would have bought IBM. They already have a farm of p690s, so they have a working relationship.

Re:Talking out my ass here, but on World's Fastest Supercomputer To Be Built At ORNL · 2004-05-12 02:01 · Score: 4, Interesting

If you care to, read the pdf on their early impressions of the X1. The Army High Performance Computing Research Center (www.arc.umn.edu) did an analysis of their application and found that the X1 was actually MORE cost effective than a commodity cluster.

Firstly, the X1 has greater per-processor performance by a factor of 4. Then you add an interconnect that has half the latency, and 50 times the bandwidth, of Myrinet or InfiniBand. It also has memory and cache bandwidth enough to actually fill the pipelines, unlike a Xeon, which can do a ton of math on whatever will fit in the registers. Some problems just don't work real well on clustered PCs; they need this kind of big iron.

Secondly, some problems cannot tolerate a failure in a compute node. If you cluster together 10,000 PCs, the average failure rate means that one of those nodes will fail about every 4 hours. If your problem takes three days to complete, the cluster is worthless to you. A renderfarm can tolerate this sort of failure rate: just send those frames to another node. Some problems can't handle it.
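The arithmetic behind that claim, as a sketch. It assumes node failures are independent and arrive at a constant rate (an exponential model), which is a simplification; the 10,000 nodes, 4 hours, and three days come from the paragraph above, and the function names are made up.

```python
import math


def node_mtbf_hours(n_nodes, cluster_mtbf_hours):
    # If the whole cluster sees one failure every `cluster_mtbf_hours`,
    # each individual node fails n_nodes times less often.
    return n_nodes * cluster_mtbf_hours


def job_survival_probability(job_hours, cluster_mtbf_hours):
    # Chance that no node at all fails while the job runs,
    # under the constant-failure-rate assumption.
    return math.exp(-job_hours / cluster_mtbf_hours)
```

`node_mtbf_hours(10000, 4)` is 40,000 hours, so each PC is individually quite reliable (over four years between failures). Yet `job_survival_probability(72, 4)` is `exp(-18)`, on the order of one in tens of millions, so a three-day job that cannot survive a node failure effectively never finishes.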

Oak Ridge is very concerned with getting the most bang for the buck.

Re:Defragmenting filesystem? on Linux Filesystems Benchmarked · 2004-05-11 06:17 · Score: 1

You DON'T want to add that sort of complexity to a RAID device. The point of the RAID is to make sure that your bits never go away. Reliable is key, then fast. Playing tricks is just asking for trouble. From my experience RAIDs are flaky enough as it is.

If you look at the proposed SCSI extensions for OST (object storage target) that are being pushed by Lustre and Panasas (among others), there is something like this built into the storage devices. Basically you don't reference a file by inode / block list or inode / extent list. Instead the file is an object with a handle, and the disk is responsible for allocating and keeping track of the actual bytes that the file consumes. Basically you move some of the brains out of the filesystem and into the disk. I'm not sure it's a good idea, especially when you need to use lots of disks, but it's pretty cool.
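A toy model of the idea, to show where the block list ends up. This is not the proposed SCSI command set or anything from Lustre or Panasas; the class and method names are invented for illustration. The caller only ever holds a handle, and the "disk" keeps the allocation details to itself.

```python
class ToyObjectDisk:
    """Toy object storage device: callers see handles, never block numbers."""

    def __init__(self, n_blocks, block_size=4):
        self.block_size = block_size
        self.blocks = [b""] * n_blocks
        self.free = list(range(n_blocks))
        self.objects = {}      # handle -> list of block numbers (private to the disk)
        self.next_handle = 1

    def create(self):
        # Hand out an opaque handle for a new, empty object.
        handle = self.next_handle
        self.next_handle += 1
        self.objects[handle] = []
        return handle

    def append(self, handle, data):
        # The disk, not the filesystem, decides which blocks to use.
        for i in range(0, len(data), self.block_size):
            if not self.free:
                raise IOError("disk full")
            block = self.free.pop(0)
            self.blocks[block] = data[i:i + self.block_size]
            self.objects[handle].append(block)

    def read(self, handle):
        # Reassemble the object's bytes from the disk's own block list.
        return b"".join(self.blocks[b] for b in self.objects[handle])
```

A filesystem built on top of this would store only `handle` in its metadata, for example `h = disk.create(); disk.append(h, b"hello world"); disk.read(h)`. The awkward part hinted at above is what happens when one file has to span many such disks, each with its own private allocator.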

Re:HFS+ on Linux Filesystems Benchmarked · 2004-05-11 06:10 · Score: 1

I don't have benchmark numbers in front of me, but HFS+ is a 2nd generation filesystem with a journal hacked onto it. It's more analogous to EXT3 in that way, but not as fast. It falls into the same category: mature, stable, reasonable performance for most everything, but excels at nothing.

You can crank up the ingest of Final Cut Pro/HD to the point where the filesystem causes it to drop frames. To be fair, however, it still gets the job done.

Re:Speed means absolutely nothing on Linux Filesystems Benchmarked · 2004-05-11 03:50 · Score: 2, Interesting

While GFS is a fine filesystem for a cluster, I'm not sure what use it will be in a non-cluster environment. There are a lot of things you have to do in a cluster filesystem to ensure data consistency between nodes. It is true that a non-cluster GFS can replace a lot of these functions with a nop, but it still affects the way you structure the code.

GFS has a nice, relatively asynchronous journal implementation. However, I don't know that it will perform well on small file I/Os, particularly deletes. It's also somewhat complicated to configure and manage. Seems like a real bother if you're not going to use it in a cluster.

Re:So why does RedHat/Fedora continue to push EXT3 on Linux Filesystems Benchmarked · 2004-05-11 03:15 · Score: 5, Interesting

Well, the simplest answer is that Stephen Tweedie is their filesystem guru, so why not use his baby in their OS? However, that's not the real answer. SCT is a clever guy, and mature enough to not let pride get in the way of the best possible system. (A similar question: why does Sun still use UFS?) For Red Hat, EXT3 is probably the best general purpose filesystem, particularly for the root drive. Red Hat is interested in selling on servers, where the root filesystem is not the bottleneck. You install the OS onto EXT3, which has decent performance and is very mature. Then you install your database / exported directories / mail spool / whatever onto the filesystem that is best for that job.

Ext3 is a very close cousin to Ext2, which has been around for a very long time, and changes very slowly. Reiser has grown and changed a LOT in the last three years, including some metadata changes that affect on-disk structures. Though it has stabilized lately, Red Hat is correct to be cautious. XFS and JFS, though very mature filesystems on other OSes, have only recently become tightly integrated with the Linux kernel. Though technically controlled by the Linux kernel community, all three of these other filesystems are really controlled by little cabals of people within IBM, SGI, and then Hans Reiser. While these groups try to be transparent in their development process, Ext3 is very transparent in its development and direction.

One other tremendous advantage that Ext3 inherits from Ext2 is a fast, versatile, and effective fsck program. Journals are great in the event of power failures. However, they do not protect against Windows, or a faulty fibre channel driver, or an uninformed sysadmin who accidentally writes over the first 1 MB of the disk. Fsck.Ext2 is one of the best around.

Comments · 368