fgodfrey · Slashdot Mirror

Re:Can we see the evidence? on Power Outages Strike East Coast · 2003-08-14 11:51 · Score: 1

If you didn't understand "complex systems fail in complex ways" there's not a whole lot more I can say. I don't use Windows, but I do use very large computers. You would be amazed at how hard it is to figure out why something that complex fails.

As an aside, CNN just reported that the Canadian Prime Minister said that a lightning strike at a Niagara Falls power plant started this. That would be the answer to your previous question of "why today".

Re:Can we see the evidence? on Power Outages Strike East Coast · 2003-08-14 09:52 · Score: 1

Why does your computer crash randomly? It worked yesterday. The power grid is a very complex system. Complex systems fail in complex ways. As someone else pointed out, it takes a hot day coupled with a badly timed equipment failure. We share some power grids with Canada so their failures can spread here just as ours can spread there. If you want to talk about security between the US and Canada, I don't think the power grid is the biggest problem either...

Yeah, it'd be nice if they did some upgrading of the grid, but then, most of my software failures happen immediately after upgrades so that may not be the best idea either...

Re:explain on Time For A Cray Comeback? · 2003-08-06 03:49 · Score: 1

I hate to break it to you, but Intel and IBM/Apple/Motorola are, in fact, using vectors to increase the speed of their commodity processors. Just because it's not in the Intel x86 instruction set doesn't mean it's a bad idea.

Re:explain on Time For A Cray Comeback? · 2003-08-06 03:46 · Score: 1

I can't say where the numbers came from, but it wasn't from marketing. What percent of peak you get depends heavily on what problem you're trying to solve. I don't happen to know the problem(s) that were getting 10% of peak and calling it good, but I suspect they were high bandwidth problems.

As for the "it's worth it if it only costs 5% of the vector machine", for small clusters that may be true. Large clusters actually cost very close to what our vector machines cost. All those Myrinet/Quadrics switches aren't cheap either...

Re:explain on Time For A Cray Comeback? · 2003-08-04 10:09 · Score: 5, Informative

As other replies have posted, bandwidth is the big issue. And by bandwidth, we are talking bandwidth of the processor to memory. Cache is great and all, but if you are stepping through gigabytes of data (or in some cases terabytes of data), your problem isn't going to fit in cache. The speed of your processor will then be dominated by the speed at which it can get to main memory. On a PC, that's slow. What's even slower is when you have to exchange data to a remote node in the cluster. Current massively parallel supercomputers (which is pretty much all of them) have phenomenal bandwidth between processors and memory and between nodes.
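To put rough numbers on the bandwidth argument, here is a minimal sketch of the usual roofline-style estimate: attainable throughput is the smaller of the processor's peak rate and memory bandwidth times the flops done per byte moved. The function name and both machine configurations are made-up placeholders for illustration, not measurements of any PC or Cray system.

```python
def attainable_gflops(peak_gflops, mem_bw_gbytes_per_s, flops_per_byte):
    """Roofline estimate: throughput is capped by compute or by memory traffic."""
    return min(peak_gflops, mem_bw_gbytes_per_s * flops_per_byte)

# A streaming kernel such as a[i] = b[i] + s * c[i] on 8-byte floats does
# 2 flops for every 24 bytes it moves, so it cannot live in cache.
intensity = 2 / 24

# Hypothetical machines (illustrative numbers only).
low_bw = attainable_gflops(peak_gflops=6.0, mem_bw_gbytes_per_s=3.0,
                           flops_per_byte=intensity)
high_bw = attainable_gflops(peak_gflops=12.0, mem_bw_gbytes_per_s=36.0,
                            flops_per_byte=intensity)

print(f"low-bandwidth box:  {low_bw:.2f} GFLOP/s, {100 * low_bw / 6.0:.0f}% of peak")
print(f"high-bandwidth box: {high_bw:.2f} GFLOP/s, {100 * high_bw / 12.0:.0f}% of peak")
```

With these assumed figures both machines are memory bound on this kernel, which is the point: once the data doesn't fit in cache, the clock rate of the processor stops being the number that matters.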

Second, (yes, I work for Cray so now I'm going to put in a sales pitch :) our processors are vector processors. As such, you can hide a lot of the latency of getting to memory by queueing up 64 loads at once. Short length vectors are what is used by MMX and Altivec to accelerate graphics. With sufficient vector operation chains, you can keep the processor busy all the time. You can't do that on a PC. I've heard (no, I don't have actual links to articles) that 10% of peak performance on a cluster is considered really good. Our customers wouldn't consider that anywhere near "really good".
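A back-of-the-envelope way to see why queueing up many loads hides latency is Little's law: sustained bandwidth is roughly the data in flight divided by the memory latency. The 64 outstanding loads come from the post; the 8-byte word size and the 200 ns latency are assumptions picked for illustration only.

```python
def sustained_bandwidth_gb_per_s(loads_in_flight, bytes_per_load, latency_ns):
    """Little's law: throughput = data in flight / latency (bytes per ns == GB/s)."""
    return loads_in_flight * bytes_per_load / latency_ns

LATENCY_NS = 200.0   # assumed memory latency
WORD_BYTES = 8       # assumed 64-bit words

one_at_a_time = sustained_bandwidth_gb_per_s(1, WORD_BYTES, LATENCY_NS)
vector_of_64 = sustained_bandwidth_gb_per_s(64, WORD_BYTES, LATENCY_NS)

print(f"1 load in flight:   {one_at_a_time:.2f} GB/s")
print(f"64 loads in flight: {vector_of_64:.2f} GB/s")
```

The latency is the same in both cases; the only thing that changes is how much work is overlapped with it.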

Third, there's memory. Lots of it. A single system image supercomputer can have terabytes of memory in one kernel image. You're simply not going to get that in a single PC cabinet.

Finally, in case anyone doubts that vectors, big memory, and large bandwidth can make a good system, the fastest machine in the world right now is the Japanese "Earth Simulator" machine which is an NEC SX machine. That is somewhat similar in architecture to a Cray in that it has large bandwidth and vectors.

Re:unbelievable. on RMS Calls On Linux Developers To Replace BitKeeper · 2003-07-19 14:31 · Score: 2, Informative

BitKeeper does a *lot* of things that CVS can't. Probably the single biggest improvement is something that we call a "mod" and I think BitKeeper calls a "changeset". It's a collection of files that you check in all at once and can track as a single entity. You can create multiple lines of development and merge changesets between them. On the surface this doesn't sound very useful but when you have a large development group, trying to figure out "what else changed when this file changed" is very nice. We use a system with similar features to BitKeeper and those capabilities are invaluable debugging tools. Our system, and I'm pretty sure BitKeeper as well, can update local workareas much faster than CVS which can be glacial.
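As a rough illustration of why tracking a changeset as one entity helps with "what else changed when this file changed", here is a small sketch. The `Changeset` class, its fields, and the file names are hypothetical; this is not BitKeeper's data model or that of the in-house system mentioned above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Changeset:
    ident: str
    message: str
    files: frozenset   # paths checked in together as one unit

def changed_together(history, path):
    """Return every other file touched by a changeset that also touched `path`."""
    related = set()
    for cs in history:
        if path in cs.files:
            related |= cs.files - {path}
    return sorted(related)

# Hypothetical history for illustration.
history = [
    Changeset("1.1", "add hang detection", frozenset({"ras/hang.c", "ras/hang.h"})),
    Changeset("1.2", "fix timer init", frozenset({"kern/timer.c"})),
    Changeset("1.3", "report hangs to console", frozenset({"ras/hang.c", "cons/log.c"})),
]

print(changed_together(history, "ras/hang.c"))
```

With per-file history only, as in CVS, answering the same question means lining up timestamps and log messages across files by hand.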

Re:A very GOOD THING [TM] on SGI Releases New Workstations · 2003-07-15 09:03 · Score: 1

Uh, I'm not sure I know what "Chimera" and "Voyager" code names are, but a) aren't these SN1/MIPS based machines? and b) regardless, the memory controller ASICs in all the NUMA systems are designed for sci/tech as they're all the same ASIC within a class of systems. I.e., all the Origin 3000 based systems use the same ASIC and all SN2 based systems use the same ASIC. Both were designed with scientific computing in mind; these are just smaller configurations of the same thing.

Re:A very GOOD THING [TM] on SGI Releases New Workstations · 2003-07-14 17:11 · Score: 2, Interesting

See, this is what I get for not using "preview". Right after the "I digress", insert the following:

The link to local memory is even faster. When you are doing scientific computing, i.e., what these machines are sold for, odds are your problem isn't going to come close to fitting in cache, in which case your poor P4 is going to spend 50% or more of its time waiting for the results of loads from memory.

Re:A very GOOD THING [TM] on SGI Releases New Workstations · 2003-07-14 17:08 · Score: 4, Insightful

So I assume that your Pentium 4 comes with up to 1 Terabyte of RAM and 512 processors (well, ok, so you'd have to go to an Origin 3800 with the graphics pipe to get 512p) in a single system? 'Cause that's what the Onyx4 can be purchased with. Also, SGI hasn't used 400 MHz processors for a few years. I'm not up on their current CPUs but another reply to your post indicates that it's 700 MHz.

Also, this thing can move more bandwidth back and forth to memory than your PC can dream of. The link between nodes is 1.6GB/sec full duplex (of course, we over at Cray can do 16 times that, but I digress).

So the moral is, while you can sort of get away with doing a MHz-MHz comparison on two different processors, the overall architecture of the system is what counts if you really want to get work done. This is why SGI and Cray are still in business.

Re:There is no protection, one will be sunk on USS Ronald Reagan Commissioning Tomorrow · 2003-07-11 06:34 · Score: 1

Um, you only get to keep firing while you're still alive. I strongly suspect that a carrier group isn't going to just sit there and let you keep firing without responding, and probably responding *very* violently.... Plus, it'd take either a large number or a very large missile to take out a ship that big.

Re:IANARS but... on Linux Rocket Blasts Off This Fall · 2003-06-10 06:54 · Score: 1

When you buy a license to use a realtime OS, you almost always get the source code since it may or may not already run on whatever esoteric hardware you have. Certainly, anything that carries people, NASA would have the source code for. Having open source software probably helps reduce bugs, but for rockets, you need *no* bugs.

Case in point: The Ariane 5 used the same flight control software as the Ariane 4. Only it accelerated faster. And the number didn't fit in 16 bits anymore. *boom* Whoops.... And that was in software that had been checked, rechecked, etc. Who knows what bug you might find in any large codebase when you use it for a new purpose?
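For anyone who hasn't been bitten by this class of bug, here is a minimal sketch of a value outgrowing a 16-bit signed integer. It illustrates the general failure mode only and is not the actual flight code; the function names and the sample values are made up.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def to_int16_checked(value):
    """Convert to a 16-bit signed integer, raising if the value does not fit."""
    n = int(value)
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in 16 bits")
    return n

def to_int16_wrapping(value):
    """Convert the way unchecked two's-complement truncation would: silently wrap."""
    n = int(value) & 0xFFFF
    return n - 0x10000 if n >= 0x8000 else n

old_flight_value = 20000.0   # hypothetical reading that fits in 16 bits
new_flight_value = 40000.0   # hypothetical larger reading on the new vehicle

print(to_int16_checked(old_flight_value))
print(to_int16_wrapping(new_flight_value))   # wraps to a negative number
try:
    to_int16_checked(new_flight_value)
except OverflowError as err:
    print("conversion failed:", err)
```

Neither behaviour is safe on its own: the wrapping version hands back garbage silently, and the checked version raises an error that something still has to handle sensibly.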

Sound Techs (Off topic) on Today's SCO News · 2003-05-30 05:29 · Score: 1

Hey, we're just smart enough to know that if we count to 3, we have to lift something :)

You may find this amusing though :)

Re:Um, this can't be right on SGI Announces Restructuring, Cuts 400 Jobs · 2003-05-23 03:14 · Score: 1

You might wanna call and make sure that offer is still valid. In past layoffs at SGI, I knew of some people who were extended offers that were then revoked due to the position (or even the entire product) being axed...

Re:Big machines, big users on SGI Announces Restructuring, Cuts 400 Jobs · 2003-05-23 03:08 · Score: 1

Well, I see you bought the SGI marketing BS.... CrayLink has absolutely *nothing* to do with Cray and was actually basically designed before the purchase. Origin 3000, which was designed after the purchase, has far more Cray influence. The Altix 3000 (announced a few months ago) was basically designed by the same group that designed the Cray T3E.

So why *did* SGI buy Cray? That's a question a *lot* of people around here would like answered. They used very little of Cray's technology and then sold most of it to Tera, which is now Cray, Inc. (where I presently work).

As for CrayLink (currently called NUMAlink), it is a *very* good interconnect. It's modular, supports just about any front side bus protocol (the interconnect doesn't directly run the front-side-bus), scales to basically arbitrarily sized machines, and it's fast. Sun's interconnect technology has scaling issues and IBM's isn't a direct memory system level interconnect.

As for what SGI got out of Cray... Basically, if you look at who is designing their products these days, it's all ex-Cray Research people. So they got people. And a boatload of cash which they burned through. I'd say they got their money's worth. Sadly, they never really knew what to do with Cray.

Finally, for a little background, I worked on diagnostics for the Origin 2000, partitioning and RAS for the Origin 3000, a Merced ccNUMA box that was similar to the O3000 but never shipped, and a little on the Altix 3000. I now work for Cray on RAS on the X1.

Re:"Managerspeak"?! on Self-Repairing Computers · 2003-05-13 16:00 · Score: 1

Right, I have those also. The key there is "personal computers" which have 1 processor (maybe 2 or 4 if you have more spare cash than I do). These are quite small. There is no way that you are going to be able to build a machine of the size I usually work with (256 processors is on the low end of that) that "never" breaks. It *is* impossible. The statistics eventually catch up with you. Even if the software is perfect, the hardware *will* break, it's only a matter of time (and not as much as you might think). That's where all the error detection, correction, etc. is absolutely required.
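Here is a quick sketch of how the statistics catch up with you. It assumes parts fail independently and at a constant rate, and the 1% per year failure chance and the 100,000 hour MTBF are invented figures, not data for any real processor.

```python
def prob_any_failure(n_parts, p_fail_each):
    """Chance that at least one of n independent parts fails in the period."""
    return 1.0 - (1.0 - p_fail_each) ** n_parts

def system_mtbf_hours(part_mtbf_hours, n_parts):
    """With independent, constant failure rates the rates add, so MTBF divides by n."""
    return part_mtbf_hours / n_parts

P_FAIL_PER_YEAR = 0.01      # assumed per-processor failure chance in a year
PART_MTBF_HOURS = 100_000   # assumed per-processor MTBF

for n in (1, 4, 256, 1024):
    print(f"{n:5d} processors: "
          f"{100 * prob_any_failure(n, P_FAIL_PER_YEAR):5.1f}% chance of a failure per year, "
          f"system MTBF {system_mtbf_hours(PART_MTBF_HOURS, n):9.1f} hours")
```

A part that is very reliable on its own still adds up to a machine that sees hardware failures regularly once there are hundreds of them, and that is before counting memory, interconnect, and power supplies.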

Re:"Managerspeak"?! on Self-Repairing Computers · 2003-05-12 04:41 · Score: 2, Interesting

No, it's not (well, debugging software is definitely good, but writing "self healing" code is important too). An operating system is an incredibly complex piece of software. At Cray and SGI a *very* large amount of testing goes on before release, but software still gets released with bugs. Even if you were, by some miracle, to get a perfect OS, hardware still breaks. In a large system, hardware breaks quite often. Having an OS that can recover from a software or hardware failure on a large system is essential to keeping the system running.

The software that I'm responsible for, in fact, is specifically designed to detect, report, and try to work around errors. We have code to detect a processor hang (through software or hardware failure) and remove it from the running OS image, etc. The Cray T3E (which I didn't work on) can warm-reboot an individual processor on either a software or hardware panic/hang and reintegrate it into the running OS.
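The general shape of hang detection can be sketched with heartbeats and a timeout. This is a toy model under assumed names (`HangDetector`, `heartbeat`, `sweep`) with time passed in explicitly; it is not how the Cray or SGI software actually does it.

```python
class HangDetector:
    """Tracks per-CPU heartbeats and drops CPUs that stop reporting."""

    def __init__(self, cpu_ids, timeout_s):
        self.timeout_s = timeout_s
        self.last_beat = {cpu: 0.0 for cpu in cpu_ids}
        self.online = set(cpu_ids)

    def heartbeat(self, cpu, now):
        """Record that an online CPU was alive at time `now` (seconds)."""
        if cpu in self.online:
            self.last_beat[cpu] = now

    def sweep(self, now):
        """Remove and return CPUs whose last heartbeat is older than the timeout."""
        hung = sorted(c for c in self.online
                      if now - self.last_beat[c] > self.timeout_s)
        self.online.difference_update(hung)
        return hung

detector = HangDetector(cpu_ids=[0, 1, 2], timeout_s=5.0)
detector.heartbeat(0, now=10.0)
detector.heartbeat(1, now=10.0)
detector.heartbeat(2, now=3.0)    # CPU 2 goes quiet after this
print("removed:", detector.sweep(now=12.0))
print("still online:", sorted(detector.online))
```

The hard parts in a real system are everything this leaves out: reporting the error, migrating or killing whatever was running on the hung processor, and making sure the detector itself is not what hung.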

Re:No on Apple Introduces iTunes Music Store, iTunes 4, new iPod · 2003-04-28 17:46 · Score: 1

A few points here (I am not a recording engineer, but I have been around and involved with pro audio gear for about 15 years now, and by pro audio, I'm talking stuff that would go out on national tours of big name acts, not stuff you buy at Radio Shack):
a) The recording studio foam is, in fact, rather expensive. I'm not sure why, but it is.
b) You can't replace an entire recording studio with a computer. You still need a microphone. Probably more than one. I haven't checked prices recently, but I'll bet high-end recording studios probably buy mics that cost as much as $20k and they'll have more than one. Assume you can't actually hear the difference between that and a "cheap" condenser mic, and you're still looking at $500 per mic. You'll also want a high quality A to D converter and a set of serious studio monitors so you can hear what you just mixed. This all assumes that you can find a free multitrack digital editor that has good effects (if you find one, let me know - I could really use one :) 'cause ProTools isn't cheap either.
c) Finally, you claim that songs on p2p move without advertising. While this is true for indie songs, the RIAA wouldn't care if their songs, which *they* are advertising, weren't moving. So a lot of the songs *are* in fact advertised (of course meant to be purchased in another medium, but still).

Re:....what the hell..... on The Rutan SpaceShipOne Revealed · 2003-04-18 07:08 · Score: 4, Informative

Err, escape *velocity* is always high regardless of what kind of flight you are using. You need to reach a certain speed to achieve orbit. What I think you were trying to say is that the forces the craft absorbs (i.e., the acceleration) are only massive if you have to blast the thing into orbit. Once you've used the aerodynamic lift to get into the upper atmosphere, there's less wind drag and you're already moving at some amount of speed so you need less fuel to accelerate to orbital velocity and there's less stress put on the craft by air moving over it.

Your example of going 1mph all the way to "orbit" doesn't work 'cause you won't *be* in orbit at 1mph. Being in space and being in orbit are two very different things.
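For a sense of scale, the speed needed for a circular orbit follows from v = sqrt(GM / r). A small sketch, assuming a circular orbit around a spherical Earth with no atmosphere; the 300 km altitude is an arbitrary example and the function names are made up.

```python
import math

MU_EARTH = 3.986004418e14   # Earth's gravitational parameter GM, m^3/s^2
R_EARTH = 6.371e6           # mean Earth radius, m
MPH_TO_MS = 0.44704         # metres per second in one mile per hour

def circular_orbit_speed(altitude_m):
    """Speed of a circular orbit at the given altitude, in m/s."""
    return math.sqrt(MU_EARTH / (R_EARTH + altitude_m))

def escape_speed(altitude_m):
    """Escape speed at the given altitude: sqrt(2) times the circular speed."""
    return math.sqrt(2.0) * circular_orbit_speed(altitude_m)

altitude = 300e3   # example altitude, m
v_orbit = circular_orbit_speed(altitude)
print(f"circular orbit: {v_orbit:8.0f} m/s ({v_orbit / MPH_TO_MS:,.0f} mph)")
print(f"escape:         {escape_speed(altitude):8.0f} m/s")
print(f"1 mph:          {MPH_TO_MS:8.2f} m/s")
```

That works out to roughly 7.7 km/s for low Earth orbit, which is the gap between getting up to space and staying there.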

Re:Congratulations to the Linux Developers on 2.5.65 On 32-way NUMA-Q with Preempt Enabled · 2003-04-09 20:36 · Score: 1

Irix comes to mind as an immediate example, and probably every single hard real time OS out there like VxWorks. You can't support hard real time without preemption of kernel threads as your user service may be more important than said kernel thread (think about a program that decides when to lower the flaps on an airplane vs. the kernel thread that flushes dirty buffers to disk - clearly, you'd want the "lower flaps" thing to have priority!)
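The flaps-versus-buffer-flush example can be sketched as a tiny tick-by-tick simulation of preemptive priority scheduling. This is a toy model with made-up task names and priorities, not how Irix, VxWorks, or Linux implement it; it assumes every task needs at least one tick.

```python
import heapq

def simulate(tasks):
    """Preemptive priority scheduling, one tick at a time.

    tasks: list of (arrival_tick, priority, name, ticks_needed);
    a lower priority number is more urgent. Returns the name run at each tick.
    """
    pending = sorted(tasks)   # ordered by arrival time
    ready = []                # heap of [priority, arrival, name, remaining]
    timeline = []
    tick = 0
    while pending or ready:
        while pending and pending[0][0] <= tick:
            arrival, prio, name, need = pending.pop(0)
            heapq.heappush(ready, [prio, arrival, name, need])
        if not ready:
            tick = pending[0][0]   # idle until the next arrival
            continue
        job = ready[0]             # most urgent ready task, re-chosen every tick
        timeline.append(job[2])
        job[3] -= 1
        if job[3] == 0:
            heapq.heappop(ready)
        tick += 1
    return timeline

# Hypothetical workload: a kernel flush is running when the flaps request arrives.
tasks = [(0, 5, "flush_buffers", 4), (2, 1, "lower_flaps", 2)]
print(simulate(tasks))
```

Because the most urgent ready task is re-chosen on every tick, the flush gets interrupted as soon as the flaps request arrives and only resumes once it is done. Without preemption of the kernel thread, the flaps would wait for the flush to finish.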

As an aside, Linux has run on large NUMA systems before. The SGI Altix 3000 and a predecessor that was never released (that I worked on) have run Linux on at least 64 processors in a ccNUMA configuration.

Re:The ASCII Play on Linux Enhances Shakespeare · 2003-03-24 04:41 · Score: 2, Insightful

The superstition allows you to use the title as long as you're performing it. However, I was a little bit nervous when I stage managed Macbeth on Friday the 13th... Fortunately, no actors required major medical attention after the sword fights :)

Now, Slashdot may have some trouble since they have now used the name without the performance so if you read that a fire destroyed all the computers in the /. cage and no other equipment, don't be surprised :)

Re:Too bad on Apple Terminates Safari Seed Program · 2003-03-22 18:39 · Score: 4, Insightful

Every time there's a /. article on "so and so released a beta of product X", someone comes along and makes this "Oh, they're just offloading testing" argument. The truth is, they have to have tested the thing in house beforehand, but users somehow manage to find bugs that your testers never do no matter how much testing is done. Releasing a beta gives the company a chance to get the product into the hands of people who a) will "test" it in ways nobody at the company ever thought of and b) realize that there may be some problems.

I'll bet if you did a "study" of version 1.0 of products with public betas and without, you'd find that the ones with public betas have fewer bugs.

As to whether they are doing anyone any favors, I suspect that corporate IT departments like public betas because it gives them the chance to test the product before some bozo in management demands it be installed immediately the day it's released or the world will come to an end.

Re:al gore _did_ invent the internet on Al Gore Joins Apple's Board Of Directors · 2003-03-21 10:29 · Score: 1

No, but he certainly was in the Senate in the '80's when the government allowed corporations to get onto the Internet. So, no, he didn't sponsor a bill that resulted in TCP/IP being created, but it did allow for the 'Net to become the way it is today instead of a "small" group of universities and government sites. The statement he made somewhat overstated his role, but it wasn't a lie.

Re:al gore _did_ invent the internet on Al Gore Joins Apple's Board Of Directors · 2003-03-19 11:20 · Score: 1

Yeah, sorry, apparently my memory wasn't quite as good as I thought it was :)

Re:al gore _did_ invent the internet on Al Gore Joins Apple's Board Of Directors · 2003-03-19 11:11 · Score: 1

I hate replying to my own posts, but I screwed up the quote. Someone else got it right, further down the thread. The basic idea, though, is right - he never actually claimed to have invented the Net.

Re:al gore _did_ invent the internet on Al Gore Joins Apple's Board Of Directors · 2003-03-19 11:07 · Score: 3, Informative

Actually, the comment he made was, in fact, correct. The media misquoted it and for inexplicable reasons, Gore never challenged it. The direct quote was "As a member of the Senate I introduced the legislation that created the Internet" which, while maybe a bit self promoting, was what happened. He was one of the sponsors of the bill that opened ARPAnet to the public which created the internet as we know it. So, really, he never claimed to have invented anything...

User: fgodfrey

Comments · 356