Maintaining Large Linux Clusters

I am a cluster of one by Java+Geeeek · 2003-06-14 09:06 · Score: 3, Funny

My book on maintaining a cluster of 0-1 nodes will be out next month.

Re:I am a cluster of one by JDWTopGuy · 2003-06-14 12:28 · Score: 1

I am going to sue you with a beowulf cluster of lawyers because you stole my book idea!

--
Ron Paul 2012

but... by freedommatters · 2003-06-14 09:09 · Score: 0

bottom line should be...

"try doing it with a windows cluster"

john
are you a Weapon of Male Destruction? then you need one of our sassy t-shirts

--
All I Want For Christmas Is My Constitutional Rights

Re:Obligatory Posts... by nano2nd · 2003-06-14 09:17 · Score: 0, Offtopic

I'm not interested till it can do Ogg Vorbis!!

--
G4 Hackintosh

Re:"But why?" asked Little Johnny. by toft · 2003-06-14 09:19 · Score: 2, Insightful

"Why on earth would someone need a 1000+ node cluster?"

Look at google? :-)

Re:"But why?" asked Little Johnny. by heli0 · 2003-06-14 09:23 · Score: 4, Funny

Why on earth would someone need a 1000+ node cluster?

Maybe for a Large Hadron Collider-class computation.

--
Whenever the offence inspires less horror than the punishment, the rigour of penal law is obliged to give way...

Lucky bastards by Professor+D · 2003-06-14 09:25 · Score: 5, Interesting

#include "back-in-my-day-rant"

Damn. Back when I was on a high-energy experiment located in the middle-of-nowhere in Japan (subject of at least two Slashdot articles), our Japanese colleagues used to lease gaggles of Sun workstations at a yearly maintenance cost that exceeded the retail value of the machines themselves!!

A few of us Linux fans used to grumble that we'd be better off buying dozens of cheap Linux boxes, but we weren't making the buying decisions. It seemed to us that the higher-ups didn't think cheap boxes with a free OS could compete on a performance basis with the Suns.

As for me? I just installed CERNlib on my laptop and just laughed as it blew the Suns away on a price/performance(+portability) basis.

Re:"But why?" asked Little Johnny. by Bob+Wehadababyitsabo · 2003-06-14 09:28 · Score: 5, Interesting

Where I work, there is a 500-node Linux cluster for cladistic tree generation, which takes a lot of brute force and specialized tools to make happen. It is arguably much more complex than launching a rocket.

Just because you don't need it, or can't envision needing it, doesn't mean nobody else needs that kind of power.

Bob

--
fsck -u

Re:"But why?" asked Little Johnny. by StealthSock · 2003-06-14 09:28 · Score: 1

Maybe if you wanted to watch Finding Nemo rendered in realtime instead of frame by frame? I don't know.

Autoassimilating Diskless Linux Clusters by Anonymous Coward · 2003-06-14 09:29 · Score: 3, Interesting

So yeah, I basically designed my own system for a professor in the Political Science Dept at my university, Washington University in St. Louis, that boots completely over the network and is completely diskless on every node. That was about a year before Knoppix ever started doing it. I did it with openMosix and it's fully LAM/MPI functional. Bruce of the openMosix list was on me for quite a while to get the docs done, but some really not cool domestic issues came up and I never got them done. If anyone is really interested, send an email to drtdiggers_DONT_SPAM_ME_BASTARDS_@_SUCKYMICROSOFT_ hotmail.com and let me know, I'll finish them up.

Re:Autoassimilating Diskless Linux Clusters by Anonymous Coward · 2003-06-14 11:03 · Score: 0

Wow, I'm shocked by the sheer maturity of your comment.

Man, what an ass.

I just saw it as an opportunity to give my little project some airtime. It's fully on-topic and fully relevant. Perhaps some more constructive criticism?
Re:Autoassimilating Diskless Linux Clusters by mprinkey · 2003-06-14 11:36 · Score: 1

I built a 60-node cluster about two years ago that originally had boot drives in each node. Since the gubment bought the boxes from the lowest bidder, we got very poor quality hardware to work with and about 75% of the hard drives died within a few months. To get things working again, I had to do the diskless boot thing. It was not at all fun. I can't see any advantage to that approach at all and I certainly wouldn't do it again by choice.
Re:Autoassimilating Diskless Linux Clusters by Anonymous Coward · 2003-06-14 11:49 · Score: 0

Really? What problems did you run into? The largest I ran into was getting a mature kernel on the boxes; I had to use PXE to do it. I was thinking of porting it to SYSLINUX. I just make ram drives for /dev, /var, /tmp, and /etc so it has a bit of a footprint, but in dual proc boxes with .5 G --> 1 G and higher RAM, the footprint was irrelevant.
Re:Autoassimilating Diskless Linux Clusters by Anonymous Coward · 2003-06-14 11:51 · Score: 0

OH! And the point that you may have missed! It's AUTOMATIC, it's plug and play clustering, you just have to tell the BIOS to do Netboot. So there is no trouble on the node end...
Re:Autoassimilating Diskless Linux Clusters by mprinkey · 2003-06-14 12:41 · Score: 1

These boxes had one of the extra-buggy versions of PXE that just would not work at all. Ended up using an Etherboot floppy in each system to get it up. I built shared, read-only /usr, /sbin, /bin, etc., and /var, /etc, and /tmp came off of independent NFS shares. Maybe using ram drives for these would have helped.

The tftpboot thing was a bit of a mess to set up. Then Redhat simply did not want to let the normal boot process work, so I had to rewrite rc.sysinit basically from scratch.

Plus, I had to hack a sleep command into the Etherboot floppy to keep all of the systems from booting simultaneously. Doing so thrashed the NFS server and deadlocked the whole startup.
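
The sleep hack boils down to giving every node a different start delay so the NFS server never sees all of them at once. The actual Etherboot change isn't shown in this thread, so what follows is only a rough sketch of the idea in Python: it assumes each node knows its own MAC address and derives a stable delay from it. The function name, the 120-second window, and the example MACs are all made up for illustration.

```python
import hashlib

def boot_delay_seconds(mac: str, window: int = 120) -> int:
    """Map a MAC address to a stable delay in [0, window) seconds."""
    digest = hashlib.sha256(mac.lower().encode("ascii")).digest()
    # Interpret the first 4 bytes of the hash as an unsigned integer.
    return int.from_bytes(digest[:4], "big") % window

if __name__ == "__main__":
    # Hypothetical MAC addresses, one per node.
    for mac in ("00:30:48:aa:bb:01", "00:30:48:aa:bb:02", "00:30:48:aa:bb:03"):
        print(mac, boot_delay_seconds(mac))
```

Hashing the MAC (rather than picking a random number at boot) means a node gets the same delay every time, which makes a stuck boot easier to reproduce. Two nodes can still land on the same second, so the window has to be wide enough for the server.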

All together, I probably spent two full weeks getting that working correctly. Compare that to a netcat-based disk-cloning installation on an 80-node cluster that took me about 3 hours.

There are no doubt canned boot installs now, but when I did this two and a half years ago, there were none that would work with the software we use on the cluster. I got to see the "inside" of how all of this works, and with 500+ nodes set up over 5 years, I can say with some certainty that I never want to do it again.
Re:Autoassimilating Diskless Linux Clusters by mprinkey · 2003-06-14 12:50 · Score: 1

It is automatic? Try it sometime with four or five dozen systems and a looming deadline. Try flashing a new BIOS onto 60 motherboards because there is a bug in the PXE code that shipped. The list of potential problems is pretty long.

Boot hard drives are cheap (~$50) and use the boot mechanism that is most stable and well validated. People building ~1000 CPU clusters may well justify the cost savings and the additional setup work to get all of the kinks out of setup. People building "normal" clusters (10 < CPU_COUNT < 100) probably can't.
Re:Autoassimilating Diskless Linux Clusters by Anonymous Coward · 2003-06-14 13:41 · Score: 0

Good point. I didn't really have the resources to try it out on a bunch of different systems. I guess a better way would be to describe it as a Homogeneous Autoassimilating Cluster setup :) Since the nodes were all the exact same hardware cfg, it was relatively easy. I had to rewrite rc.sysinit too, though I enjoyed learning how a *nix system actually boots up; I never knew (needed to know) before. But I can see the difference when you have a deadline looming over you. My timetable was pretty low stress, though interestingly enough, it took me about 2 weeks too. I'm curious, what were the specs on these machines you got in bulk?
Re:Autoassimilating Diskless Linux Clusters by mprinkey · 2003-06-14 14:49 · Score: 1

Dual P3 motherboards with onboard eepro100. Serverworks chipset. I think they were Supermicro 370DLE. Those were the only dual chipsets with decent memory bandwidth for 933 MHz chips. The BIOS that came on them was exceptionally buggy. I wasted almost an entire week trying to get PXE to work...it makes me angry just thinking about it. But that is good...it just reminds me of important lessons learned: No More Diskless Clusters and Tyan instead of Supermicro.
Re:Autoassimilating Diskless Linux Clusters by Daengbo · 2003-06-15 00:08 · Score: 2, Interesting

I don't claim to know more about your situation than you do, but several distros, including k12ltsp.org, support openMosix straight from the install, and work with either PXE (which you couldn't have used) or Etherboot. I'm not trying to change your mind. I'm just pointing out that there are a lot of folks who prefer and even swear by diskless clusters.

--
Put identity in the browser.
Re:Autoassimilating Diskless Linux Clusters by Anonymous Coward · 2003-06-15 04:02 · Score: 0

drtdiggers_DONT_SPAM_ME_BASTARDS_@_SUCKYMICROSOFT_ hotmail.com

I tried that address but it didn't work. :-(

Re:"But why?" asked Little Johnny. by hak+hak · 2003-06-14 09:30 · Score: 4, Informative

Because of the computations required to analyze the enormous amount of data a particle collider outputs. The scattered particles go through all sorts of detectors which measure their energy and direction and send them to the cluster, which has to search for particles significantly smaller than a needle in a haystack of measurements.

(Disclaimer: IANAPP (Particle Physicist))

Re:"But why?" asked Little Johnny. by Tumbleweed · 2003-06-14 09:33 · Score: 3, Funny

Hey, *somebody* has to back up the Internet from time to time!

Either that, or all the pr0n encoding. :)

Best...Tivo...*ever*!

Host this thing at an Internap location, and you're the Ultimate LPB.

Searching for "First Posters" for the Homeland Security people to "visit." :)

SETI client!

"Every room has every movie ever made in any language." Who do you think hosts *that*?

ILM, seeing the second LOTR movie, decides an 'upgrade' is in order for the SW:EP3 render farm.

It takes this much computing power to find WMD in Iraq. ...or to find Cheney.

MS compiling Longhorn builds.

Calculating the question for the answer 42.

BitTorrent!

pdf -- text by CowBovNeal · 2003-06-14 09:36 · Score: 1

Installing, Running and Maintaining Large Linux Clusters at CERN

Vladimir Bahyl, Benjamin Chardi, Jan van Eldik, Ulrich Fuchs, Thorsten Kleinwort, Martin Murth, Tim Smith
CERN, European Laboratory for Particle Physics, Geneva, Switzerland

Having built up Linux clusters to more than 1000 nodes over the past five years, we already have practical experience confronting some of the LHC scale computing challenges: scalability, automation, hardware diversity, security, and rolling OS upgrades. This paper describes the tools and processes we have implemented, working in close collaboration with the EDG project [1], especially with the WP4 subtask, to improve the manageability of our clusters, in particular in the areas of system installation, configuration, and monitoring.
In addition to the purely technical issues, providing shared interactive and batch services which can adapt to meet the diverse and changing requirements of our users is a significant challenge. We describe the developments and tuning that we have introduced on our LSF based systems to maximise both responsiveness to users and overall system utilisation.
Finally, this paper will describe the problems we are facing in enlarging our heterogeneous Linux clusters, the progress we have made in dealing with the current issues and the steps we are taking to 'gridify' the clusters.
1. INTRODUCTION
The LHC era is getting closer, and with it the challenge of installing, running and maintaining thousands of computers in the CERN Computer Centre. In preparation, we have streamlined our facilities by decommissioning most of the RISC hardware, and by merging the dedicated and slightly different experiment Linux clusters into two general purpose ones (one interactive, one batch), as reported at the last CHEP [2].
Quite some progress has been made since then in the automation and management of clusters. The EU DataGrid Project (EDG), and in particular the WP4 subtask [3], has entered its third and final year and we can already benefit from the software for farm management being delivered by them. See [4] for further details. In addition, the LHC Computing Grid project (LCG) [5] has been launched at CERN to build a practical Grid to address the computing needs of the LHC experiments, and to build up the combined LHC Tier 0/Tier 1 center at CERN.
In preparing for the LHC, we are already managing more than 1000 Linux nodes of diverse hardware types, the differences arising due to the iterative acquisition cycles. In dealing with this high number of nodes, and especially when upgrading from one release version of Linux to another, we have reached the limits of our old tools for installation and maintenance. Development of these tools started more than ten years ago with an initial focus on unifying the environment presented to both users and administrators across small scale RISC workstation clusters from different vendors, each of which used a different flavour of Unix [6]. These tools have now been replaced by new tools, taken either from Linux itself, like the installation tool Kickstart from RedHat Linux or the RPM package format, or rewritten using the perspective of the EDG and LCG, to address large scale farms using just one operating system: Linux.
This paper will describe these tools in more detail and their contribution to the progress in improving the installation and manageability of our clusters. In addition, we will describe improvements in the batch sharing and scheduling we have made through configuration of our batch scheduler, LSF from Platform Computing [7].
2. CURRENT STATE
In May last year, the Linux support Team at CERN certified RedHat Linux 7. This certification involved the porting of experiment, commercial and administration software to the new version and verifying their correct operation. After the certification, we set up test clusters for interactive and batch computing with this new OS. This certification process took quite some consid

--
Bush is on fire and its not good for my lungs.

Re:pdf -- text by Anonymous Coward · 2003-06-14 09:43 · Score: 0

OMG, if the picture at that site is really cowboy neal then we should all chip in and get him a gym membership.

Re:"But why?" asked Little Johnny. by Anonymous Coward · 2003-06-14 09:37 · Score: 0

Shouldn't that be a Large Hard-on Collider-class computation?

Re:Episode 11.1: Blindsided by JrTcoNrd · 2003-06-14 09:40 · Score: 0

I write better stuff than that... and I am a middle schooler, going into high school. Work on, I dunno, a plot?

--
Do you ever find yourself humming the MacGuyver theme song? Then you my friend, are a true nerd.

Re:"But why?" asked Little Johnny. by digidave · 2003-06-14 09:45 · Score: 1, Funny

They must be trying to play Doom 3.

--
The global economy is a great thing until you feel it locally.

Too little, too late. by fuchsiawonder · 2003-06-14 09:46 · Score: 2, Funny

Just a little too late for the SETI@home project. Kind of a shame, really. If only we had those computers sooner...

Re:Too little, too late. by AndroidCat · 2003-06-14 10:11 · Score: 1

There are other projects that could use a lot of spare CPU time. If the humanitarian ones don't excite, how about pissing off Bill Gates? (Click on my sig.)

--
One line blog. I hear that they're called Twitters now.
Re:Too little, too late. by samhalliday · 2003-06-14 10:26 · Score: 1

i assume by "those computers" you mean the 1000 at CERN (not that they'd actually waste their time processing SETI data with the shit load of aliroot stuff going on these days...) well, CERN have been running GNU/Linux clusters for a looong time now, so this is no new thing. In fact, my friend actually had one of the older dual intel 500Mhz machines as his desktop machine, ripped out from the last generation cluster. they basically led him into the buzzing cluster room and said "grab one and follow me"... :-D
Re:Too little, too late. by Anonymous Coward · 2003-06-14 10:32 · Score: 0

Why is it a shame? I heard that SETI had too little data for too many machines anyway, so they were sending out duplicate blocks. How would more machines have helped anything?
Re:Too little, too late. by aled · 2003-06-14 13:08 · Score: 1

if we had lots of these cheap processors before... when they were expensive...

--

"I think this line is mostly filler"

Re:"But why?" asked Little Johnny. by Anonymous Coward · 2003-06-14 09:46 · Score: 0

[[ Just because you don't need it, or can't envision needing it, doesn't mean nobody else needs that kind of power. ]]

This is exactly why he asked the question, genius. He wanted to know who would need this type of computational power and if it was more cost/performance effective than just buying a supercomputer.

These typical Slashdot dumbshits need to get off their "I'm smarter than you" pedestals and realize that they're no better than anyone else. It would also be nice bonus if they learned how to read.

Re:"But why?" asked Little Johnny. by Anonymous Coward · 2003-06-14 09:49 · Score: 0

The rule of Marc: Whenever commenting on someone else's stupidity, you will always indirectly comment on your own.

Damn you, grammer!

ClusterKnoppix - OpenMosix by Anonymous Coward · 2003-06-14 09:49 · Score: 5, Interesting

I've been looking at ClusterKnoppix mentioned recently on slashdot. It has built in openmosix and also supports thin clients via a terminal service. Just pop it in, and instant cluster. In case you missed the article:

ClusterKnoppix

Re:ClusterKnoppix - OpenMosix by paradxum · 2003-06-14 12:40 · Score: 1

I've been working on a redistro of ClusterKnoppix designed for video encoding... and it's coming along.... Just a few more deps to rebuild. It makes DivX encoding bearable.

I was building my own setup similar to Knoppix until I discovered ClusterKnoppix.... I love when someone else does my work for me :)

Single system image by Tester · 2003-06-14 09:56 · Score: 5, Informative

Where I work, we are developing a clustering system using single system images, where the whole OS is stored on a server and is NFS-mounted by each node. Our current tests show that we can easily run 100 nodes on 100 Mbit Ethernet from a single server. And the coolest thing is that the nodes mount the / of the server, so for "small clusters" (under 100 nodes), we have to do a software upgrade only once and all nodes and the server are upgraded. Btw, this whole thing can be done using an almost unmodified Gentoo Linux distribution.

I'm hoping to convince my boss to let us publish detailed docs. He thinks that if we do, everyone will be able to use it and he will lose sales (we are in the hardware business). Details at our homepage and about an older version (but with more details) at the place where we used to work.

Re:Single system image by Anonymous Coward · 2003-06-14 10:40 · Score: 0

We run over 200 diskless Linux boxes from a single server (dual 2.8 Xeons) with a 100 Mbit NFS/admin network. (The computation network is fully bisectional Myrinet.)
Re:Single system image by gregsv · 2003-06-14 11:05 · Score: 1

Another way (which happens to be the way we do it where I work) is to make a master OS image, store it on a central server, and rsync it down to / on every node. Updates are made to the master OS image and then get automatically propagated down to every node. When new or replacement nodes are deployed, we use RedHat's KickStart system to install a base OS on them, then rsync down the master image. We maintain over 700 nodes this way.
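
For anyone who wants to picture the rsync step, here is a minimal sketch in Python that only builds the command lines. It is not the actual setup described above: it assumes the server pushes to each node over SSH as root (the post could just as well mean the nodes pull), and the exclude list, image path, and node names are hypothetical. The rsync options used (`-a`, `--delete`, `--numeric-ids`, `--dry-run`, `--exclude`) are standard ones.

```python
import shlex

# Paths that are virtual or differ per node; a hypothetical selection.
EXCLUDES = ["/proc/*", "/tmp/*", "/var/log/*", "/etc/sysconfig/network"]

def rsync_command(image_dir: str, node: str, dry_run: bool = True) -> list[str]:
    """Build an rsync invocation that pushes the master image to one node."""
    cmd = ["rsync", "-a", "--delete", "--numeric-ids"]
    if dry_run:
        cmd.append("--dry-run")  # report what would change without applying it
    cmd += [f"--exclude={pattern}" for pattern in EXCLUDES]
    # Trailing slash: copy the contents of image_dir, not the directory itself.
    cmd += [image_dir.rstrip("/") + "/", f"root@{node}:/"]
    return cmd

if __name__ == "__main__":
    for node in ("node001", "node002"):
        print(shlex.join(rsync_command("/srv/master-image", node)))
```

The sketch defaults to a dry run because `--delete` against / on a live node removes anything that is not in the master image, so the exclude list deserves a careful look before the flag comes off.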
Re:Single system image by Anonymous Coward · 2003-06-14 11:35 · Score: 0

If you're interested in this sort of thing, take a look at SystemImager. It can multicast images out to hundreds (maybe thousands?) of nodes, which means the installation has a complexity of approximately O(1) in the number of nodes (sometimes data gets corrupted and individual nodes require the image to be retransmitted). If the software setup on a node gets screwed up, it only takes a few minutes to reinstall the node. Of course, producing the images isn't quite so easy.
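
To put the O(1) claim in numbers, here is a back-of-the-envelope model in Python. It is an idealised sketch, not a measurement of SystemImager: it assumes the server's link is the only bottleneck, ignores retransmissions, and uses made-up figures (a 2000 MB image, a 12.5 MB/s link, i.e. roughly 100 Mbit/s).

```python
def unicast_seconds(image_mb: float, link_mb_s: float, nodes: int) -> float:
    """Server sends one full copy per node over the same link: O(nodes)."""
    return nodes * image_mb / link_mb_s

def multicast_seconds(image_mb: float, link_mb_s: float) -> float:
    """Server sends the image once and every node listens: O(1) in nodes."""
    return image_mb / link_mb_s

if __name__ == "__main__":
    image_mb, link_mb_s = 2000.0, 12.5  # hypothetical image size and link speed
    for nodes in (10, 100, 1000):
        print(nodes,
              unicast_seconds(image_mb, link_mb_s, nodes),
              multicast_seconds(image_mb, link_mb_s))
```

In this model the multicast time stays flat while the unicast time grows linearly with the node count. Real installs land somewhere above the flat line once corrupted blocks have to be resent.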
Re:Single system image by Anonymous Coward · 2003-06-14 12:07 · Score: 0

Yeah, this sounds very close to what I set up! (See the post on Autoassimilating Diskless Linux Clusters.) Cool! I didn't have full distros on the nodes though, nowhere near. Since they were just openMosixing at first, it was not necessary. LAM/MPI usage might require it, but then I was just thinking of /usr over NFS. Well, I can release my info and will be. I hope you can too, it'll be neat to see what you did.

theodiggers

Re:"But why?" asked Little Johnny. by CowBovNeal · 2003-06-14 09:56 · Score: 2, Funny

So that they can survive a slashdotting? ;)

--
Bush is on fire and its not good for my lungs.

why such a huge cluster? by Anonymous Coward · 2003-06-14 09:57 · Score: 5, Interesting

well, i recently interviewed at nvidia, and they have a 3,000+ cluster just for emulating the new graphics/io chips they're working on... they don't manufacture anything, the turn around time to manufacture a prototype for testing would take too long... so all they do is simulate the actual chips and then send the data off for fabrication once they're done. on a cluster of 3,000 machines, some jobs take all weekend, from what i understand.

imagine if they just used one machine.

Re:why such a huge cluster? by Ziviyr · 2003-06-14 14:04 · Score: 2, Funny

Why imagine? I got a calculator...

16 years, 156 days, 3 hours
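
For the curious, that figure can be reproduced with a few lines of Python. This is a guess at the arithmetic, not the poster's actual working: it assumes a 48-hour weekend, perfectly linear scaling from 3,000 machines down to one, and a 365.2422-day year.

```python
TROPICAL_YEAR_DAYS = 365.2422  # assumption: mean tropical year

def single_machine_time(machines: int, hours_each: float) -> tuple[int, int, int]:
    """Convert machines * hours into (years, days, hours) on one machine."""
    total_days = machines * hours_each / 24
    years = int(total_days // TROPICAL_YEAR_DAYS)
    remaining_days = total_days - years * TROPICAL_YEAR_DAYS
    days = int(remaining_days)
    # Rounded to the nearest hour; no carry handling, which is fine here.
    hours = round((remaining_days - days) * 24)
    return years, days, hours

if __name__ == "__main__":
    print(single_machine_time(3000, 48))  # (16, 156, 3)
```

3,000 machines times 48 hours is 144,000 machine-hours, or 6,000 days, which is where the 16-odd years comes from.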

Athlons would be putting out better graphics on their own that far into the future. :-)

--

Someone set us up the bomb, so shine we are!
Re:why such a huge cluster? by FLoWCTRL · 2003-06-14 19:02 · Score: 1

Oooh.. all weekend. I recently attended a talk in applied math where a researcher presented results of a simulation that ran on 47 CPUs for two years to reveal many hitherto unknown facts about hydrogen bonding...

--
http://oss.netmojo.ca

Related project: Loading disk images for clusters by angio · 2003-06-14 09:57 · Score: 4, Informative

This reminds me of a paper that was just presented at USENIX:
Fast, Scalable Disk Imaging with Frisbee. Fun talk.

Pretty cool tricks - they use multicast and filesystem specific compression techniques to parallel load the disks on a subset of the disks in the cluster. Very very very fast. (I use the disk imaging part of their software to load images on my test machines at MIT, and I'm quite impressed).

Anyway, just a bit of related cool stuff.

Re:Episode 11.1: Blindsided by serrasalmus · 2003-06-14 10:00 · Score: 1

Have they not taught you the meaning of the word "satire" in middle school yet? That is unfortunate, as I was taught what "satire" was at the tender age of five. Allow me to initiate your uninitiated minds: Main Entry: sat·ire Pronunciation: 'sa-"tIr Function: noun Etymology: Middle French or Latin; Middle French, from Latin satura, satira, perhaps from (lanx) satura dish of mixed ingredients, from feminine of satur well-fed; akin to Latin satis enough -- more at SAD Date: 1501 1 : a literary work holding up human vices and follies to ridicule or scorn 2 : trenchant wit, irony, or sarcasm used to expose and discredit vice or folly

Re:"But why?" asked Little Johnny. by Alinabi · 2003-06-14 10:04 · Score: 1

Pretty much everything that has to do with solving evolution equations for complex systems. Even weather forecasts require way more computing power than NASA's 96-node cluster can provide. Rocket science is not at all "rocket science".

--
"You can't allow somebody to commit the crime before you detain them." [Condoleezza Rice]

Red Hat 7.3 by Spoticus · 2003-06-14 10:05 · Score: 2, Informative

RH 7.3 reaches its end of life in December of this year. One can only assume (and hope) that they have the in-house people to support it, or it's going to cost them beaucoup $$ for continued RHN support.

Re:Red Hat 7.3 by vondo · 2003-06-14 10:22 · Score: 2, Informative

I'm sure they are firewalled/NATed off, so why would they need (or even want) to upgrade that often?
Re:Red Hat 7.3 by samhalliday · 2003-06-14 10:58 · Score: 1

youve gotta be kiddin me... most of the GNU/Linux operating system is written in-house at CERN. the only reason they use redhat is so they can tell other institutions which distro to install in order to be binary compatible and sure of sources compiling successfully. im actually surprised they haven't made their own distro... i remember hearing the arguments against it once, but the memory has faded.
Re:Red Hat 7.3 by Dave114 · 2003-06-14 12:57 · Score: 1

Actually they do have their own distro... CERN Linux. It's essentially Redhat with a few modifications.
Re:Red Hat 7.3 by samhalliday · 2003-06-15 01:50 · Score: 1

interesting, i know a few people at CERN and none of them use this
maybe its so close to redhat that they burn the CD's with "redhat" written on it.
anyway, thanks for the link, if i had mod points i'd give you +1 informative

IN SOVIET RUSSIA....... by Anonymous Coward · 2003-06-14 10:09 · Score: 0

Large linux clusters maintain YOU!

Re:"But why?" asked Little Johnny. by bpfinn · 2003-06-14 10:12 · Score: 1

Why on earth would someone need a 1000+ node cluster?

The Atlas Project at CERN, when it comes online, is supposed to produce a petabyte of data every year. I doubt one 1000 node cluster would be enough to process that data quickly.

Re:"But why?" asked Little Johnny. by vondo · 2003-06-14 10:15 · Score: 5, Interesting

Disclaimer: IAAPP (I am a particle physicist).

First, as another poster pointed out, these detectors produce a LOT of data. I'm on an experiment slated to take data at about the same time as the LHC experiments, with similar rate requirements.

We plan to use a 2500 node cluster (of year 2007 CPUs) to filter our data in real time. The input rate into this cluster will be about 10 GB/s, output rate about 200 MB/s.

But, each interaction is analyzed (usually) by just one computer. There are so many interactions, though, that you need massive clusters, but not much communication between nodes of the cluster.
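
Because each interaction is handled by a single node, the load just divides across the farm. A quick sketch of that arithmetic in Python, assuming the input is spread evenly over the nodes and decimal units (1 GB = 1000 MB); the function names are made up for illustration.

```python
def per_node_input_mb_s(input_gb_s: float, nodes: int) -> float:
    """Average input rate each node must absorb, in MB/s."""
    return input_gb_s * 1000 / nodes

def reduction_factor(input_gb_s: float, output_mb_s: float) -> float:
    """How much the filter shrinks the data stream overall."""
    return input_gb_s * 1000 / output_mb_s

if __name__ == "__main__":
    print(per_node_input_mb_s(10, 2500))  # 4.0 MB/s per node
    print(reduction_factor(10, 200))      # 50.0, i.e. 1 byte in 50 is kept
```

So each node only has to keep up with about 4 MB/s on average, which is why the network between nodes matters so little here; the hard part is the CPU time spent deciding which 2% of the data to keep.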

That's just for the data filter. You need even larger amounts of computing to analyze what comes out in that 200 MB/s and to simulate what happens in the experiment. Much larger amounts.

Our experiment will ultimately require clusters this size at the laboratory and at something like a dozen other institutions.

Only on Slashdot... by pi_rules · 2003-06-14 10:19 · Score: 1

(Disclaimer: IANAPP (Particle Physicist))

Gotta love Slashdot... the only place where such a disclaimer isn't taken for granted.

Large _Hardon_ Collider? by Anonymous Coward · 2003-06-14 10:20 · Score: 0

kind of misread the title... oops :-o

What the hell are they studying???

Re:Large _Hardon_ Collider? by Second_Derivative · 2003-06-14 10:45 · Score: 1

Oh my god...

OK I just laughed so hard at that the people around me gave me weird looks. Rare that you see something funny on slashdot these days as opposed to "rofl tacos spelling sux"
Re:Large _Hardon_ Collider? by Anonymous Coward · 2003-06-14 21:37 · Score: 0

go fuck a donkey

question from a psuedo-geek... by joebeone · 2003-06-14 10:24 · Score: 1

So, to all those who are in the know out there... when they have what they want, how many nodes and individual machines could they maintain? What are the constraints? What about data back-ups? Is ephemeral data recorded on a few machines in separate nodes to make sure that one getting knocked out doesn't zap something for good?

Re:question from a psuedo-geek... by vondo · 2003-06-14 10:48 · Score: 1

Well, in particle physics, the typical use is that data isn't stored on these systems longer than it takes to analyse it (and since data is constantly being accumulated, you don't worry about small losses).
But, there are people looking into parallel, redundant filesystems and the like so that you can keep more on disk. For instance 1000 x 60 GB = 60 TB is a sizable amount of free space on these clusters, but the output data rate from these experiments is a petabyte/year or so.
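To see how those two numbers compare, here is a small Python sketch. It assumes decimal units (1 PB = 1000 TB), a steady data rate over a 365-day year, and that every byte of the 60 TB is usable; the function names are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def cluster_capacity_tb(nodes: int, disk_gb: float) -> float:
    """Total local disk across the cluster, in TB."""
    return nodes * disk_gb / 1000

def days_to_fill(capacity_tb: float, rate_pb_per_year: float) -> float:
    """Days until the given capacity is full at a steady data rate."""
    return capacity_tb / (rate_pb_per_year * 1000) * 365

def rate_mb_per_s(rate_pb_per_year: float) -> float:
    """The same yearly rate expressed in MB/s."""
    return rate_pb_per_year * 1e9 / SECONDS_PER_YEAR

if __name__ == "__main__":
    capacity = cluster_capacity_tb(1000, 60)
    print(capacity, days_to_fill(capacity, 1), rate_mb_per_s(1))
```

Under those assumptions the 60 TB of spare disk would hold a bit over three weeks of output, and a petabyte per year works out to roughly 32 MB/s sustained.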
Re:question from a psuedo-geek... by Anonymous Coward · 2003-06-14 11:38 · Score: 0

In many clusters you don't use local disks to store data but remote NFS (in smaller systems) or data on a cluster file system like lustre or global FS. To do a backup you only have to backup the data stored on your storage nodes.

"securely installing over the network" by ameoba · 2003-06-14 10:25 · Score: 4, Interesting

Who in their right mind would have a cluster this size, for this sort of work, on any network where "securely installing over the network" is an issue? I mean, I'd want this as far off of a public network as possible, unless I really want to explain to whoever authorized my grant why my experimental data indicates that:

e = mc^31337

--
my sig's at the bottom of the page.

Re:"securely installing over the network" by samhalliday · 2003-06-14 10:36 · Score: 3, Interesting

if you read the paper (which OK is not as bad as not reading the article), you would realise that this is not a project which is being performed only at CERN; when LHC (and others, eg ALICE) become active in a few years, the data is going to be piped to literally hundreds of participating institutions (this is the current list for one of the smaller experiments) for data analysis. so, no, this is not enough processing power, and yes they need it to be publicly available. i also know people who are (or were?) working on the security implementations. believe me, at CERN, they think it through; it's run by lots of really smart people who know what they are at, not politicians. the distributed processing that comes out of these projects will hopefully pave the way forward for the next generation of the internet (the grid).
Re:"securely installing over the network" by Anonymous Coward · 2003-06-14 10:36 · Score: 0

You need to think in terms of security in depth.

System installation over the network is a very useful mechanism for supporting survivable systems practices. System installation, of whatever kind, is also an attractive target for exploits, if steps are not taken to secure the installation process. Therefore, competent system architects think about secure installation as a matter of course.

None of this implies that the systems in question would be exposed on a public network. The installation mechanisms may be secure, but there's no value in asking for trouble!
Re:"securely installing over the network" by FLoWCTRL · 2003-06-14 18:57 · Score: 2, Insightful

Perhaps the nodes are not all physically located in the same building, or are otherwise vulnerable to physical man-in-the-middle intrusions. If one adopts secure practices as a matter of principle, it saves having to go back and implement security as an afterthought someday when the situation changes in an unanticipated way.

--
http://oss.netmojo.ca

Re:"But why?" asked Little Johnny. by Anonymous Coward · 2003-06-14 10:28 · Score: 0

You might be able to analyze your data on a 0-node cluster if the Tevatron doesn't start working better soon...

-a mildly disgruntled CDF postdoc

Re:"But why?" asked Little Johnny. by huraxprax · 2003-06-14 10:30 · Score: 1

Good explanation. The main reason LHC needs so much processing power is that the higher the energy, the more scattered particles ("Jets") you have, and they all arrive instantly at the detectors, and LHC will have higher energies than its predecessors. The "size" of particles is meaningless, but the interesting events where a new particle can be detected are very rare. I can't remember the numbers anymore and would have to look it up. They are also working on custom hardware which will do some calculations before sending the data to the cluster.

Re:Episode 11.1: Blindsided by JrTcoNrd · 2003-06-14 10:37 · Score: 0

well, whatever that was supposed to be... i got it, but it lacked effectiveness. Sorry, but when you post a rebuttal, make sure that it's partially coherent, at least!

--
Do you ever find yourself humming the MacGuyver theme song? Then you my friend, are a true nerd.

Re:Why? by Anonymous Coward · 2003-06-14 10:42 · Score: 0

Is it that hard for you to admit that the US is falling behind the curve?

Re:Episode 11.1: Blindsided by serrasalmus · 2003-06-14 10:45 · Score: 1

Apparently you didn't, because you didn't grasp any of my points whatsoever. A dense mind will never pick up on any cues...... no matter what..... and you sir, appear to be quite dense.

Re:Why? by samhalliday · 2003-06-14 10:54 · Score: 0, Flamebait

CERN is only half in France, the other half being in Switzerland (not even in the European Union). But, being American it must be hard for you to understand geography beyond your own backyard; my deepest regrets :-/

Re:Why? by hasse · 2003-06-14 11:02 · Score: 1

Well, Frenchie La Frencherson, last time I checked (right now as a matter of fact), Switzerland was located smack in the damn middle of Europe and the EU. How dumb do you think us americans are?

But does it... by arose · 2003-06-14 11:05 · Score: 4, Funny

...run Windows?

--
Analogies don't equal equalities, they are merely somewhat analogous.

Re:But does it... by blibbleblobble · 2003-06-15 00:28 · Score: 1

"But does it run windows?"

I can just see the purchase-request now... 1000 copies of Windows at $250 each. ... and a lot more keyboards, mice, and monitors.

Re:But does it... by arose · 2003-06-15 02:25 · Score: 2, Funny

You forgot the trained monkeys.

--
Analogies don't equal equalities, they are merely somewhat analogous.

Re:Why? by samhalliday · 2003-06-14 11:11 · Score: 1

Europe is a way of talking about a geographical region... the EU is a political collection of countries, which (ever so neutral) Switzerland is not a member of. so, Switzerland is in Europe, but NOT in the EU. i never said otherwise :-P

you actually checked? hehe...

Re:Why? by Anonymous Coward · 2003-06-14 11:12 · Score: 0

Pretty dumb, since you cannot grasp the fact that geographical location has nothing to do with membership in a political organisation. To make it less abstract, and hopefully easier to understand for you, think of how West Berlin was not part of the Soviet bloc, in spite of the fact that it was located smack in the middle of East Germany. But maybe I'm asking too much of you.

Re:Why? by Anonymous Coward · 2003-06-14 11:16 · Score: 0

How dumb do you think us americans are?

Two more answers like that and you'll make me believe that the US is inhabited by amoebas that can type on a keyboard.

1000+ cluster? by jabbadabbadoo · 2003-06-14 11:19 · Score: 1

Running rpm --rebuilddb must be a real drag.

Re:Why? by Hank+Chinaski · 2003-06-14 11:23 · Score: 1

obvious troll. and not funny. cern is in switzerland.

--
IAAL

Another approach... by Junta · 2003-06-14 11:39 · Score: 2, Informative

If you want to scale more, and your nodes have tons of ram, you could likely stuff the whole os into ramdisk and then use the local disk for the scratch space. Once booted, the network impact of nfs goes away.

Of course, you could use System Installer Suite (http://www.sisuite.org/), which is *similar* to the rsync method mentioned by the other poster, but you get to skip the Red Hat install step in favor of SIS's tools.

--
XML is like violence. If it doesn't solve the problem, use more.

Re:"But why?" asked Little Johnny. by Anonymous Coward · 2003-06-14 11:45 · Score: 0

NASA have the machine at number 18 in the top500 list; it's got 384 nodes (1392 cpus, 4 cpus per node).

Re:"But why?" asked Little Johnny. by C32 · 2003-06-14 11:48 · Score: 1

Funny coincidence, I was just reading an article about the planned CERN Large Hadron Collider which will be ready in 2007; it'll put out 1250 Mbyte/sec.
This is stored to tape though (~50 StorageTek 9940B drives at 30 Mbyte/sec each, in parallel), not realtime.
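
For anyone who wants to sanity-check those figures, here is a quick back-of-the-envelope sketch in Python. It uses only the numbers quoted above (1250 Mbyte/sec output, 30 Mbyte/sec per drive, roughly 50 drives); treating "~50" as exactly 50 is an assumption made for the arithmetic.

```python
import math

DATA_RATE_MB_S = 1250    # detector output quoted in the comment
DRIVE_RATE_MB_S = 30     # per-drive streaming rate quoted in the comment
DRIVES_PLANNED = 50      # "~50" taken as exactly 50 (assumption)

# Fewest drives that can keep up with the stream if every drive runs flat out.
min_drives = math.ceil(DATA_RATE_MB_S / DRIVE_RATE_MB_S)

# Aggregate bandwidth of the planned drive pool and the spare capacity it leaves.
aggregate_mb_s = DRIVES_PLANNED * DRIVE_RATE_MB_S
headroom_mb_s = aggregate_mb_s - DATA_RATE_MB_S

print(min_drives, aggregate_mb_s, headroom_mb_s)
```

So about 42 drives would be the bare minimum, and 50 leaves some margin for drives that are rewinding, swapping tapes, or down for repair.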

Re:Episode 11.1: Blindsided by JrTcoNrd · 2003-06-14 11:54 · Score: 0

I picked up on it, buts being a snob doesn't make you classy. Der!

--
Do you ever find yourself humming the MacGuyver theme song? Then you my friend, are a true nerd.

i'd just like everyone to know... by Anonymous Coward · 2003-06-14 11:55 · Score: 0

I LOVE LINUX!!!!!!!!!!!!

SystemImager-like update mechanism for non-Linux? by pschmied · 2003-06-14 12:28 · Score: 5, Interesting

I'm surprised that nobody has mentioned SystemImager. If you haven't looked at it for maintaining large numbers of Linux boxes, scamper off and take a look now. It is worth your time.

Now, that being said, I recently had the opportunity to evaluate using a number of OpenBSD boxes, but I couldn't find a utility for maintaining a bunch of the boxes in the same manner as SystemImager (i.e. Incrementally update servers from a golden master via rsync).
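
To make the "golden master via rsync" idea concrete, here is a minimal sketch in Python that only builds the rsync command line a client might run to pull itself into sync with a master image. This is not how SystemImager actually invokes rsync; the host name, image path, and exclude list are hypothetical, and the flags are ordinary rsync options.

```python
import shlex

def rsync_pull_command(master, image_path, excludes=("/proc/", "/sys/", "/tmp/")):
    """Build an rsync command that syncs this machine's root from a master image.

    master and image_path are hypothetical placeholders, not SystemImager names.
    """
    cmd = ["rsync", "-a", "--delete", "--numeric-ids"]
    for pattern in excludes:
        # Skip pseudo-filesystems and scratch space that must stay per-machine.
        cmd.append(f"--exclude={pattern}")
    # Trailing slash on the source means "copy the contents of the image directory".
    cmd += ["-e", "ssh", f"{master}:{image_path.rstrip('/')}/", "/"]
    return cmd

if __name__ == "__main__":
    command = rsync_pull_command("master.example", "/images/node")
    # Print instead of executing: running this for real rewrites the local root.
    print(" ".join(shlex.quote(arg) for arg in command))
```

The incremental part comes from rsync itself: only files that differ from the master get transferred, and `--delete` removes local files that no longer exist in the image.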

So, has anyone found anything that does what SystemImager does, but that is cross-platform? Do any SystemImager developers out there want to comment on the potential difficulty in supporting other-than-Linux operating systems in SystemImager?

SystemImager is one of the most useful tools I've ever seen, however, I believe that it would be an enterprise "killer app" if it could do MacOS X, *BSD, Windows etc.

-Peter

--
. Penguins Surely Ca

Re:Why? by Anonymous Coward · 2003-06-14 12:44 · Score: 0

It's understandable why a person would have to check. It's akin to not being able to provide exact coordinates for a specific planet/asteroid orbiting a planet a million light years away: nobody cares, and by nobody, I mean those of us who matter, namely Americans.

Huh? by soloport · 2003-06-14 13:04 · Score: 1

Anyone else read it as "Large Hardon Collider"? I blew coffee threw my nose. Damn disexlia...

Re:Huh? by pompousjerk · 2003-06-14 13:20 · Score: 1

Let me say this:

Holy crap, that would have been embarrassing...

Re:Huh? by Anonymous Coward · 2003-06-14 15:17 · Score: 0

In Soviet Russia...

Large hardons collide YOU!!

Re:"But why?" asked Little Johnny. by ch-chuck · 2003-06-14 13:26 · Score: 1

I need a cluster to do the rendering of my Blender animation experiments. My 100 frame movie at 640x480 takes several minutes to finish on a single 1.3 GHz box, especially when you like environment maps at high res (for mirrored surfaces).

Next Q: Why would anyone want to make 3d animations in Blender? A. Because I want to!

--
try { do() || do_not(); } catch (JediException err) { yoda(err); }

Re:"But why?" asked Little Johnny. by Anonymous Coward · 2003-06-14 13:55 · Score: 1, Interesting

I am a high energy physicist.
You will need this much computing power if you are trying to filter and analyze on the order of a petabyte of data yearly. Some collisions at the LHC will produce 1000s of particles, a large fraction of which will be detected in multiple detectors as they fly away from the collision point (nucleus on nucleus collisions). Thousands of these collisions will happen every second. The information in the various detectors then must be collected back so that all the signals a particular particle made can be associated with each other. Then many graduate students must write code to search through all these particles for exciting physics. A lot of computing power is essential for exploiting the potential of the collider and detector.

Re:Why? by Anonymous Coward · 2003-06-14 14:35 · Score: 0

Actually no. Part of CERN is in France, part is in Switzerland.

Re:"But why?" asked Little Johnny. by pdp11e · 2003-06-14 15:10 · Score: 3, Insightful

One application that benefits from adding nodes (with almost linear scaling in performance) is Monte Carlo radiation transport. For example, in medical physics people try to calculate a dose distribution in a human body for various configurations of treatment accelerators. Monte Carlo simulation software "generates" random initial particles (with appropriate probabilities for a given accelerator) and then tracks each particle as it propagates and interacts with surrounding tissue. Interactions are randomly generated (hence: Monte Carlo) but again the randomness is biased according to the appropriate physics. Each such "history" can be independently generated by a different node, thus making parallelization trivial.
In my lab I have assembled a 24-node cluster and it takes about 4-8 hr to calculate dose distributions for most cases. With a 1000 node cluster it would be possible to do this sort of calculation routinely in clinics during treatment planning and actual treatment. This will mean that cancer patients will have improved survivability odds due to more precise targeting of the tumors.
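
For readers who have not seen why independent histories parallelize so easily, here is a toy sketch in Python. It is not a real transport code: the geometry (ten 1 cm depth bins), the mean free path, and the "one unit of dose at the first interaction" rule are all made-up assumptions. Each call to the hypothetical `simulate_batch` stands in for one node, and the only communication needed is summing the tallies at the end.

```python
import random

N_BINS = 10            # depth bins of 1 cm each (made-up geometry)
MEAN_FREE_PATH = 3.0   # cm, illustrative value only

def simulate_batch(seed, n_histories):
    """Run n_histories independent particle histories and return a depth tally."""
    rng = random.Random(seed)      # every "node" gets its own seed
    tally = [0] * N_BINS
    for _ in range(n_histories):
        # Distance to the first interaction, sampled from an exponential law.
        depth = rng.expovariate(1.0 / MEAN_FREE_PATH)
        bin_index = int(depth)
        if bin_index < N_BINS:
            tally[bin_index] += 1  # deposit one unit of "dose" in that bin
    return tally

def merge(tallies):
    """Sum the per-node tallies bin by bin."""
    return [sum(column) for column in zip(*tallies)]

def run(n_nodes, histories_per_node):
    # On a real cluster each batch would run on a separate node; here they
    # run one after another, which gives the same merged result.
    return merge([simulate_batch(seed, histories_per_node) for seed in range(n_nodes)])

if __name__ == "__main__":
    print(run(n_nodes=4, histories_per_node=1000))
```

Because no batch ever needs data from another, doubling the nodes roughly halves the wall-clock time. Under ideal linear scaling, the 4-8 hr quoted for 24 nodes would come down to something like 6-12 minutes on 1000 nodes.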

Cheers,
Beowulf's root

FreeBSD cluster by Anonymous Coward · 2003-06-14 17:55 · Score: 0

How applicable is this to FreeBSD? Now that linux is under this legal cloud of doom, I'm switching all my clusters over to FreeBSD.

Wow... by Anonymous Coward · 2003-06-14 18:06 · Score: 0

Imagine a beow... oh wait...

NFS is a bad choice. by Anonymous Coward · 2003-06-14 22:15 · Score: 0

From my experience, NFS is by far the worst choice in networked filesystems.

Since all the boxens are linux, I strongly suggest Samba or NCP.

Think of it this way: In the old days we had
- NFS to share files with other old Unices (like SCO, Slowlaris, HPUX, AIX).
- SMB to share files with windowz
- NCP to share files with Novell network fs.

IMHO, NCP is the best. SMB is pretty good too.

linuxbios, anyone? by nafrikhi · 2003-06-14 23:56 · Score: 2, Informative

has anyone tried linuxbios http://www.linuxbios.org/ to replace standard bios. results in a diskless, faster boot. used in this cluster architecture: http://www.clustermatic.org/

Re:linuxbios, anyone? by pe1chl · 2003-06-15 00:34 · Score: 1

In a network, this seems to be largely redundant.
Use PXE when you want a diskless boot. May take more than 3 seconds, but is supported on many, many more systems!

1'000'000 node cluster by Anonymous Coward · 2003-06-15 01:03 · Score: 0

i have a 1'000'000 node cluster!
it crawls around, eats flies and likes light ...
and sometimes it replicates in my
fruit-loops!
it can accurately predict ( >95% ) the weather
two days ahead!

too bad it doesn't have any interface
that is compatible with me ...

it's what we scientists call a "passive-cluster"!

NFS gets the job done well by Anonymous Coward · 2003-06-15 08:09 · Score: 0

There's always an anti-NFS troll out there just waiting to spout "the truth". Get over it. NFS works great, particularly when all you are running is Linux.

Re:SystemImager-like update mechanism for non-Linu by More+Trouble · 2003-06-15 16:10 · Score: 1

SystemImager is one of the most useful tools I've ever seen, however, I believe that it would be an enterprise "killer app" if it could do MacOS X, *BSD, Windows etc.

You should check out radmind. It does in fact "do" Mac OS X, *BSD, and Linux.

:w

Re:SystemImager-like update mechanism for non-Linu by pschmied · 2003-06-16 03:35 · Score: 1

Hmm... Not quite there yet. The collection of command line tools could probably be rolled into something that automates system management the way SystemImager does. But even then, radmind rather unintelligently seems to recopy entire files.

Also, how is partitioning taken care of?

No, I'm still looking for something like SystemImager that handles multiple Operating Systems. Perhaps extending SystemImager to support others will be the easiest way.

As a side note, Frisbee, which was mentioned in a previous thread, is the killer app for LAN-based system imaging. Wow!

-Peter

--
. Penguins Surely Ca

Re:SystemImager-like update mechanism for non-Linu by More+Trouble · 2003-06-16 04:16 · Score: 1

Sorry, not a big SystemImager expert. I see that it just uses rsync, hence your comment about recopying entire files. I'd point out that for binary files, rsync tends to copy the entire file anyway, on a version change. radmind's nice in this case because it can tell that a file needs to be updated with no network traffic.

how is partitioning taken care of

Depends on the system. For Mac OS X, we pretty much need to use Apple's tools. For Solaris, we use Jumpstart. Kickstart on Linux. Partitioning is very OS specific. radmind is very portable.

:w

ar98sarf s87aeh87aw4h by jamie · 2003-07-13 12:43 · Score: 1

dsriugadniaw34r sareh98fase fasef

Re:Episode 11.1: Blindsided by Uber+Banker · 2003-11-09 08:52 · Score: 1

"I picked up on it, buts being a snob doesn't make you classy. Der!"

Hmmmmm

134 comments