Greg+Lindahl · Slashdot Mirror

Re:Someone did on Cray for Sale - Cheap - Some Assembly Required · 2000-09-02 09:28 · Score: 2

There were clusters of Crays before War Games came out.

Re:this is turning into WTO all over again. on 2600 Staffer Arrested During Republican Convention · 2000-08-06 04:17 · Score: 2

In the 1980s, police successfully handled protests despite all the examples of violent protests in the 1960s. So no, I don't think a race riot in 1992 had a significant effect on why the police are beating the crap out of non-violent protesters in 2000.

Re:A perfect machine for render land and other use on SGI And /Massive/ Linux Machine · 2000-08-06 04:15 · Score: 2

Huh? This system can do genetic pattern matching, but it's far less cost-effective than a pile of small machines. Fortunately, the people who actually spend millions of dollars on machines to solve problems like gene matching investigate the problem more carefully than your friend.

Two companies doing this problem are Celera Genomics and Incyte. Incyte has a cluster of 1,200 x86 machines (3,000 CPUs) running Linux. Celera Genomics has a cluster of 1,000 Alpha CPUs in 250 nodes; Celera purchased their machines before it had been shown that Linux could handle that kind of task.

And a company that specializes in getting fast storage for the movie industry is MountainGate.
I'm not so sure that even the rendering example is really valid. Much rendering work is treated as an embarrassingly parallel problem: each individual frame is slow, but the entire movie finishes fast. That's much more cost-effective.
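
To make the "embarrassingly parallel" point concrete, here's a rough sketch, not anyone's actual render pipeline. The names `render_frame` and `render_movie` are made up for illustration, and threads stand in for what would really be separate processes or separate machines in a render farm. The point is only that each frame is independent, so frames can be handed out to workers with no communication between them.

```python
from concurrent.futures import ThreadPoolExecutor


def render_frame(frame_number):
    """Hypothetical stand-in for a slow, single-frame render."""
    # A real renderer would spend minutes or hours here; this just
    # builds a placeholder "image" name from the frame number.
    return f"frame-{frame_number:04d}"


def render_movie(frame_count, workers=4):
    """Render every frame independently and return them in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so frames come back in sequence
        # even if they finish out of order.
        return list(pool.map(render_frame, range(frame_count)))


if __name__ == "__main__":
    movie = render_movie(8)
    print(movie[0], movie[-1])
```

No frame waits on any other, which is why a pile of cheap machines does this job as well as one big one.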

Re:this is turning into WTO all over again. on 2600 Staffer Arrested During Republican Convention · 2000-08-06 00:05 · Score: 3

And once again, we meet that fine line in the sand. Let's look at WTO. There were days of peaceful protest. There were also groups who showed up, promising to cause a disruption and destroy property. So how are you expected to deal with the protests as a whole?

I expect the police to enforce the law. In the 1980s, there were MANY protests which were largely non-violent, with a few violent people. The police dealt with most of them quite well. It's only now that "police riots" are happening repeatedly.

So much for the lessons of the past. And so much for police professionalism. Ready, aim, lawsuit!

Re:No Way on EU To Take Legal Action Against Microsoft · 2000-08-02 20:40 · Score: 2

Wouldn't any company that wants to be successful? No, successful companies obey the law, including the anti-trust law. Dominant companies don't get to play the same games that non-dominant ones do.

Re:Several Options... on Distributed Operating Systems? · 2000-07-31 03:29 · Score: 2

Mach is the granddaddy of distributed OS work? Heck, Mach wasn't even the first distributed OS developed at CMU. Hydra pre-dates it by more than a decade. Bill Wulf did quite a bit of work on it. The successor to Hydra is Legion, at the University of Virginia.

Distributed OSes are here on Distributed Operating Systems? · 2000-07-31 03:20 · Score: 4

There are several real, full-featured distributed operating systems out there. One good example is Legion. It gives you the illusion of running programs on your desktop, while they are actually running lord-knows-where. Yes, you often need a lot of network bandwidth to get good results. Depending on the exact details, you can run programs on other machines with either no or small modifications.

Lest you think this has nothing to do with today's operating systems, the Linux desktop folks have started using CORBA quite a bit to link things together. Well, Legion provides much more powerful, secure, and reliable ways to do the same thing, in a much more consistent fashion.

Re:Its not intel's fault on Are Buffer Overflow Sploits Intel's Fault? · 2000-07-29 01:30 · Score: 3

Almost every 386+ OS has not used segments the way Intel intended. So yes, they've had quite a few years (more than a decade) to add an execute bit, if they actually cared.

Re:why a museum at NSA at all? on Ask The NSA About Certain Things · 2000-07-27 05:47 · Score: 2

What makes you think that the Smithsonian wants a huge NSA exhibit, as big as the NSA museum? The Smithsonian has limited funds, just like everyone else.

The Smithsonian dropped by the University of Virginia astronomy department and looked at the 5 generations of astronomical photographic plate measuring devices we have in the basement of our observatory, gathering dust. "Hey, you should build a museum for this. It's important stuff and should be preserved." Well, they didn't have the money to do it, and neither does UVa, but UVa hasn't junked the equipment; they're keeping it in a climate-controlled building until someone decides they care.

Re:Wardrobe predictions on Answers About The New NOAA Massive Linux Cluster · 2000-06-01 21:51 · Score: 2

Yes and no. That instrument is a barrel piano or barrel organ, which only plays preprogrammed tunes. It was played by turning a crank, so it got named after the stringed instrument which preceded it.

I play the original, not the modern kind. In fact, the original stringed instrument survives to the modern era in French folk tradition.

Re:CentraVision's license? on Answers About The New NOAA Massive Linux Cluster · 2000-06-01 01:33 · Score: 2

If you have the budget for Fibre Channel fabrics at some point, at least look at the Global File System.

Our storage is Fibre Channel, and we did evaluate GFS. We found that CentraVision was superior for this customer, mainly because GFS didn't have journaling at the time. GFS may yet become quite superior.

And there are much larger SPs around and coming, like San Diego's and the second phase of NERSC's.

Myrinet has superior scaling when compared to the SP switch, or the T3E switch for that matter. The T3E switch did have higher bandwidths and lower latencies, but for many real supercomputing problems, Myrinet does the job for far less money.

The biggest way that this system doesn't compare well with the T3E is in programming models -- the T3E also supports the SALC model, shared address local consistency. I hope to support that in around 12 months.

The IBM SP doesn't support the SALC model, and has inferior per-processor bandwidth and latencies.

Re:CentraVision's license? on Answers About The New NOAA Massive Linux Cluster · 2000-06-01 00:31 · Score: 3

CentraVision is a traditional proprietary product.

All they've released for Linux so far is a client, which is a kernel module. I'm not sure if they're going to release the metadata server for Linux.

Re:Checkpoint/restart on Answers About The New NOAA Massive Linux Cluster · 2000-06-01 00:28 · Score: 3

The first answer to your question is that we never have scheduled maintenance. Since the machine isn't monolithic, we can repair most parts while it's live.

This machine is nothing like the SGI SN-IA architecture. SN-IA is still shared memory, and has a significantly faster network (which is far less scalable and far more expensive). Whenever you share memory, you share failures.

We did provide a user-level checkpoint feature to FSL, but it requires the user to modify their program. Kernel-level checkpoint is on our list of things to do. It's not that hard for single processes -- Condor does it, for example -- but it's fairly tough for programs that use MPI and run in parallel.
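
For anyone wondering what "requires the user to modify their program" means in practice, here's a minimal sketch of the general idea. It is not the facility we gave FSL; the function name `run_with_checkpoints`, the pickle file format, and the toy summing loop are all assumptions for illustration. The program itself decides what its state is, saves it every so often, and reloads it on restart.

```python
import os
import pickle


def run_with_checkpoints(total_steps, path, every=10):
    """Sum 0..total_steps-1, saving progress so a restart can resume."""
    state = {"step": 0, "total": 0}
    if os.path.exists(path):
        # Resume from the last saved state instead of starting over.
        with open(path, "rb") as f:
            state = pickle.load(f)
    while state["step"] < total_steps:
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % every == 0:
            # Write to a temporary file, then rename, so a crash in the
            # middle of a save never leaves a half-written checkpoint.
            tmp = path + ".tmp"
            with open(tmp, "wb") as f:
                pickle.dump(state, f)
            os.replace(tmp, path)
    return state["total"]
```

This works for one process because one process knows its own state. With MPI you also have to get every rank to checkpoint at a consistent point, with no messages in flight, which is where the difficulty comes from.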

Re:Cluster communication question on Answers About The New NOAA Massive Linux Cluster · 2000-06-01 00:21 · Score: 3

A PCI bridge would itself pretty much be a network. Myrinet is a great interconnect and is probably much better than any big PCI bridge that you could come out with. The new InfiniBand specification allows bridging of the successor to PCI, but I suspect Myrinet's successor is going to be a better interconnect by the time InfiniBand machines are available.

This system does run regular software without recompiling. It just doesn't use a lot of CPUs for simultaneous compute unless you change the code to use MPI. But they can access the shared storage at high speeds without any change, and they can get farmed out to separate CPUs without any change.

Re:Wardrobe predictions on Answers About The New NOAA Massive Linux Cluster · 2000-06-01 00:14 · Score: 3

A yellow dress?! Those are pluderhosen, not a dress. Pants. With pockets. Worn by manly Elizabethan men, who carry sharp pointy sticks to poke people who accuse them of wearing dresses.

Hurdy gurdys aren't the same as organ grinders. I don't think monkeys were a part of the act in the 16th century.

Re:Mmmmm ... weather on Answers About The New NOAA Massive Linux Cluster · 2000-05-31 23:42 · Score: 3

They tell me that the first external users (20% of the machine) are going to be ocean modelers. But I think the FSL guys would disagree that it's far more interesting... all of these guys are pretty fanatical about what they do!

Re:Strange answer to distributed.net question. on Answers About The New NOAA Massive Linux Cluster · 2000-05-31 23:37 · Score: 5

I think Greg's answer to this question, i.e. not understanding that the question was about running simulations outside of his cluster, is indicative of the "we've got to run our jobs on something that sits in a big air-conditioned room on our site" mentality.

You must be a great mind-reader.

No, I don't have a "big air-conditioned room" mentality. In fact, Legion is capable of harvesting unused processor cycles in a much more sophisticated fashion than distributed.net. However, weather forecasting needs too much bandwidth. You have to consider problems on a case-by-case basis for such a low-bandwidth system; most traditional supercomputer problems aren't appropriate.

This doesn't mean I think distributed.net isn't cool -- it's very cool, light-weight, and it gets its job done. It shouldn't be a surprise that it can't solve every problem.

Re:That last question... on Answers About The New NOAA Massive Linux Cluster · 2000-05-31 23:27 · Score: 3

No. As I pointed out, weather codes require a fair amount of bandwidth, much more than is available in a distributed.net situation. In addition, most weather codes assume that they're running on a uniform machine, so they'd have load-balancing problems if run on a distributed.net type system.
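
A toy illustration of the load-balancing point, with made-up numbers and a hypothetical `step_time` helper rather than anything from a real weather code: if the work is split evenly and every node has to finish a timestep before the next one starts, the step takes as long as the slowest node.

```python
def step_time(work_units, node_speeds):
    """Time for one timestep when work is split evenly across nodes.

    Every node gets the same share, and the step ends only when the
    slowest node has finished its share.
    """
    share = work_units / len(node_speeds)
    return max(share / speed for speed in node_speeds)


# Four identical nodes versus three identical nodes plus one at half speed.
uniform = step_time(1000, [1.0, 1.0, 1.0, 1.0])
mixed = step_time(1000, [1.0, 1.0, 1.0, 0.5])
print(uniform, mixed)  # 250.0 500.0
```

One half-speed node doubles the time for the whole step, while the other three sit idle half the time.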

first? on Ask the Man Behind the NOAA's New Beowulf Cluster · 2000-05-22 23:02 · Score: 4

first post?

Re:Defeating Trade Secrets 101: on Kerberos, PACs And Microsoft's Dirty Tricks · 2000-05-02 04:25 · Score: 1

The original poster meant that the document is copyrighted, not the concepts in it.

Right. Copyright is for published material. Trade secrets can't be published. As I said, you shouldn't play lawyer on /. if you don't know what you're talking about. And don't trust me, I'm not a lawyer either. But I paid attention back when AT&T was suing Berkeley over BSD. At the time, AT&T was asserting that the Unix source code was a trade secret, and wasn't copyrighted.

Re:Defeating Trade Secrets 101: on Kerberos, PACs And Microsoft's Dirty Tricks · 2000-05-02 03:22 · Score: 2

I wouldn't do that. It's still copyrighted,

Trade secrets can't be copyrighted. Consult a lawyer instead of playing one on /.

Re:Why it cost $15 million on New Linux Supercomputer Forecasts Rain · 2000-04-25 19:15 · Score: 1

First, on what statistics did you beat SGI's machines on

We beat SGI on performance on the customer's actual codes. If you have 1/10 the MPI latency and your machine costs 3 times as much, and the customer's codes don't get much of a benefit from reduced latency...

The biggest nit I'm going to pick is your assertion of running a single system image. That is true only if you can migrate processes between nodes in the cluster or transparently change your interconnect fabric to keep nodes running the same job physically close.

You're pretty confused about what a "single system image" can be to different people. Try reading Greg Pfister's book. By the way, Myrinet's Clos topology is good enough that it doesn't matter where in the machine a job's processors are. That's an important factor simplifying the software that the FSL machine needs to get high performance. FSL tested for inter-job contention, and I suspect SGI flunked. The machine they bought had near-zero inter-job contention.

Further, it's not exactly an SSI if the sysadmin has to install the OS on every node or has a separate console connection to every node.

We provide tools that give the sysadmin a single system image, too. There's nothing new there; people administering large clusters have had that for years.

Now, I'm not trying to say that clusters suck for all applications. They just aren't the solution to *every* problem, as a lot of people claim they are.

I never said that clusters were the solution to every problem. But a cluster was a solution to FSL's problem.

Re:Why it cost $15 million on New Linux Supercomputer Forecasts Rain · 2000-04-25 10:59 · Score: 1

I'll end with my sales pitch for traditional supercomputers

Please don't. We beat SGI's machines in the bid. This machine provides higher bandwidth than any SGI Origin machine (300 gigabits of bisection bandwidth), and it does provide a single system image for this customer, who only runs MPI programs. So numerous parts of your comment are wrong.

Re:Why it cost $15 million on New Linux Supercomputer Forecasts Rain · 2000-04-25 10:54 · Score: 1

The storage area network hardware is the usual DDN Fibre Channel RAID combined with Brocade FC switches. That's not that exciting.

The software is the interesting part. It's the "CVFS" filesystem, which is from ADIC. They ported this filesystem to Linux for the FSL bid.

Re:Can I phone it and ask other questions? on New Linux Supercomputer Forecasts Rain · 2000-04-24 23:45 · Score: 1

That's what the on-site engineer does, answers questions like yours.


User: Greg+Lindahl

Comments · 213