How Well Does Windows Cluster?
cascadefx asks: "I work for a mid-sized mid-western university. One of our departments has started up a small Beowulf cluster research project that he hopes to grow over time. At the moment, the thing is incredibly weak... but it is running on old hardware and is basically used for dog and pony shows to get more funding and hopefully donations of higher-end systems. It runs Linux and works, it is just not anything to write home about. Here's the problem: my understanding is that an MS rep asked what it would take to get them to switch to a Microsoft cluster. Is this possible? Are there MS clusters that do what Beowulf clusters are capable of? I thought MS clusters were for load balancing, not computation... which is the hoped-for goal of this project. Can the Slashdot crowd offer some advice? If there are MS clusters, comparisons of the capabilities would be welcome." One only has to go as far as Microsoft's site to see its current attempt at clustering, but what is the real story? Have any of you had a chance to pit a Linux Beowulf cluster against one from Microsoft? How did they compare?
From what I understand from reading Win 2k Advanced Server's help section on Windows clustering, it is mostly for stability. Kind of like a massive mirrored RAID system. I really don't see any performance advantage if you're looking for supercomputer speeds, unless you measure performance by uptime. As a side note, what were you using for clustering? I'm currently doing a cluster using MOSIX for my school and it seems to be going nicely. I'm just curious as to what gives the best speed performance on the Linux end.
can't sleep slashdot will eat me
Well, my company clusters Exchange on Win2K Advanced Server.
We run in what's called an active/active cluster. But for the most part, the machines are just sharing responsibility. In most Windows clusters the other server is just sitting there waiting for the first to fail. They share two drives (on separate hardware, either Fibre Channel or SCSI attached) and when the first server fails, the other server picks up those drives. Windows writes its data to those drives, so when the other server picks them up it can import that data and pick up where the first server left off.
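For anyone who hasn't seen failover in action, here is a rough sketch of the idea in Python. It is only a toy model of what is described above, not how Windows actually implements it: the `SharedDisk`, `Node`, and `fail_over` names are made up for illustration, and a real cluster detects failure with heartbeats rather than a flag.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SharedDisk:
    """Hypothetical stand-in for the shared SCSI / Fibre Channel drive."""
    owner: Optional[str] = None
    journal: List[str] = field(default_factory=list)


@dataclass
class Node:
    """Hypothetical cluster node; `alive` replaces a real heartbeat check."""
    name: str
    alive: bool = True

    def write(self, disk: SharedDisk, record: str) -> None:
        # Only the current owner may write to the shared disk.
        if disk.owner != self.name:
            raise PermissionError(f"{self.name} does not own the disk")
        disk.journal.append(record)


def fail_over(disk: SharedDisk, active: Node, standby: Node) -> Node:
    """Return the node that should be serving after a health check."""
    if active.alive:
        return active
    # The standby takes ownership and inherits everything already written.
    disk.owner = standby.name
    return standby


disk = SharedDisk(owner="node-a")
node_a, node_b = Node("node-a"), Node("node-b")
node_a.write(disk, "mail queued: msg-1")

node_a.alive = False                      # simulate the active node crashing
current = fail_over(disk, node_a, node_b)
current.write(disk, "mail queued: msg-2")
print(current.name, disk.journal)
```

Note that nothing here runs faster with two nodes: the second machine only matters once the first one is gone, which is the whole point about fault tolerance versus computation.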
Clustering is PERFECT for fault tolerance. It is useless for most intents and purposes for load balancing. If you want load balancing you can use NLBS, but it just plain sucks, and never works right.
Mike @ The Geek Pub. Let's Make Stuff!
A Beowulf cluster is not limited to Linux; it could run on top of any OS. I believe NASA did the original design work to be OS agnostic.
http://www.windowsclusters.org/projects.htm gives a list of current Windows clusters.
Finally, are you out of your tiny little mind? I wonder why M$ is so keen to help. There is no such thing as a free lunch, especially from M$.
What do you call a cluster of Windows machines
when they Blue Screen?????
A Cluster Fuck?
(if you didn't know what it meant, then you wouldn't be offended)
Moneyed corporations, non-working 'poor' and criminal prisoners are turning productive citizens into tax-slaves.
Chances are the MS rep didn't understand MS clustering. He just knew that you had a Beowulf cluster and he wanted to sell you MS software so he figured he'd sell you a MS cluster, regardless of whether or not it would do what a Beowulf cluster could do.
However there is a server solution I saw demoed at a MS DPS I attended called Application Center. It allows you to manage your cluster and distributes workloads throughout the cluster.
Now, I'm not sure if you NEED this to take advantage of Windows 2000 clustering. The last time I worked with a MS cluster was under NT 4 and it was failover only. The load balancing was "faked" by a router that would just alternate which server the request was sent to.
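The "faked" load balancing amounts to plain round-robin. A minimal sketch in Python, assuming two made-up server names (`web1`, `web2`); the point is that the rotation never looks at how busy either server is:

```python
from itertools import cycle


def round_robin(servers):
    """Return an endless iterator that hands out servers in fixed rotation."""
    return cycle(servers)


servers = ["web1", "web2"]            # hypothetical server names
picker = round_robin(servers)

# Each incoming request simply gets the next server in the rotation.
assignments = [(f"request-{i}", next(picker)) for i in range(1, 6)]
for request, server in assignments:
    print(request, "->", server)
```

If one request happens to be far heavier than the others, this scheme will still send the next one to the overloaded box whenever its turn comes up.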
(insert "yeah but MS is evil" comment here)
(insert "yeah but Linux Beowulf clusters cost less" comment here)
(insert "yeah but who wants to have to reboot your cluster all the time" comment here)
(insert "I wish the sigs were longer because that's a really good quote by Richard Feynman" comment here)
"For a successful technology, honesty must take precedence over public relations for nature cannot be fooled." -Feynman
My managers will only buy Windows products because they have a site license with MS. They are looking into Linux a little bit because Terminal Server with load balancing does not load balance and the clustered computers do not talk to each other. The profiles on the 3 clustered servers do not update each other at all. This was still much better than my boss's last attempt, which used an IBM preconfigured box: the whole cluster got a BSOD and corrupted a drive, losing data for 3 days. People were not happy.
I can only hope MS's poor performance will make them switch.
Vote early. Vote often. Vote CowboyNeal.
Windows clustering works as advertised for the most part, but is expensive. Some exceptions include heavily loaded machines pulling from Fibre Channel arrays and NAS. Both of the network attached devices seem to have some problems. Driver issues? Don't know.
Haven't seen the reported "bsod round table" where one machine crashes, shortly followed by another and another. The problem we have seen is that a single machine BSODs, and the other machines in the cluster don't realize it's down.
If you're already in the MS camp, it will work, but look at other solutions. I think they will be more cost effective.
"Science is about ego as much as it is about discovery and truth " - I said it, so sue me.
...with M$'s "Computational Clustering Technical Preview":
* PLAPACK package (open source software)
heh.
-JT
You can do a windows cluster thing, but it's still not as good even as Condor for Unix. All in all, I'd say to tell them to go screw themselves unless they want to give you money for a LOT more hardware as well as software, to make up for the fact that you're not going to be able to do as much with it. If MS wants to be taken seriously as a hardcore number-crunching OS, the bastards can EARN it instead of trying to bribe academics.
I've been looking at this a lot myself now, as I'm also building a cluster for use in a computational bio lab at Florida State. It certainly seems that Linux is the only way to go right now. In case anyone cares, my cluster right now is 16 nodes of:
Tyan S2460 with 2 Athlon MP1800+ processors per node
1 gig PC2100 RAM per node
20 gig 7200 RPM Maxtor HD
3Com Gigabit over copper Ethernet
low-end cheapass video and floppy, etc.
All in these really nice rack cases, with a big black 2001 monolith-esque rolling rack to shove it all around in. It cost just about $26,000 to build so far, but the plans are to expand it to as many as 512 nodes within the next year or so. Whee!
That said, with the three computers I have at my place (a p3 desktop, a celeron I use as a low grade server, and my p3 notebook) I'd love to be able to set up a cluster for encoding. Such operations will be the killer app for clustered systems IMHO.
www.lonseidman.com
While I haven't been near a Microsoft Cluster in a while, I do remember a couple of things that really stand out about them:
The number of systems able to be part of the cluster is severely limited. At the time, it was limited to 2, but I'm pretty sure that has increased to a somewhat larger single digit number.
The number of applications available to run on the cluster is just as severely limited. Again at the time, there were exactly zero applications, but I know that there is at least one (Exchange) now.
Given the limitations of what uses you can put an MS cluster to, I wouldn't bother with it in the first place.
"Suppose you were an idiot..... And suppose you were a member of Congress... But I repeate myself."
there are some (as far as i understand) very good macintosh clusters that are very easy to use and very fast. especially if nothing (significant) has been done yet, a macintosh cluster computing G4-optimized code would blow away anything else in its price range. I can't say I have ever used one of these, or any other cluster for that matter, but the genuine power and versatility of the mac tells me its gotta be good.
Check out ACME at Purdue University. It was set up by a couple of grad students on the cheap and really is a model of inexpensive high-performance computing. I think they only spent a couple grand on the whole thing, with help from the school scrap yard. Some good lessons in there. Oh, and they run FreeBSD which, as its name suggests, is FREE!!
Have a Happy.
XP barely functions on a Tualatin Pentium III. I wouldn't bring it anywhere near my P2....
/Brian
Years ago, I worked at an ISP that ran partly on Solaris, mostly on Linux. A few MS reps came in to try and get us to switch to NT. We let them go through their routine, then walked them around the operations room, telling them the capabilities of what we had, and asking if NT would match them. The response was repeatedly "no". When we pressed them on a few issues, they gave in rather easily. When we asked them why you couldn't bind another IP to an ethernet card under NT without a reboot, they admitted "lazy programming."
So, take the MS reps through the operation, tell them the capabilities. Ask them if they can meet or exceed them. If they say "Yes", you're either not using the real capabilities of your Linux machines, or they're lying.
steve
Oh, you're not stuck, you're just unable to let go of the onion rings.
It's a .gov so they have to use some contract we already have with IBM hardware and the MS site license. The director of my department is a big MS fan (even after he upgraded to XP on his laptop and corrupted the drive). I hope to be moving to a different department tho, where I can possibly run Linux on my desktop PC.
Vote early. Vote often. Vote CowboyNeal.
You should check out the AC3 project at Cornell University's Theory Center, which is "home to the largest Windows-based high-performance cluster complex in the world".
There are numerous machines, such as the 256 CPU Velocity 1, that run MPI-Pro over MyraNet(?); Velocity 1 was one of the 500 fastest computers.
Windows is a very viable and high performance solution for running scientific parallel applications, and you should order the $8.00 evaluation kit from MS and check it out for yourself.
I've developed for some of these systems, and have been very impressed. I've worked with Linux clusters too, but only on older, weaker machines, so it would not be fair to compare the two.
(Btw. all opinions here are my own, and in no way should be construed as those of Cornell or the TC).
One thing you might want to consider is administration time: scientists, who are already annoyed that programming distracts them from their real work, might not want to devote the time and effort to learning to administer all those Linux boxes.
Anyway, if the MS rep is very eager, he might offer you some great deals. MS is very eager to be taken seriously as an HPC option.
You're forgetting the client connection license. You buy a license for an individual client computer, and it has the "right" to connect to however many servers you want. However, I don't think that a cluster would fall under the CALs anyway. The last time I did any reading on MS's server products, you didn't need licenses for servers to talk to each other. Just a CAL for each client that connected to the server. (Including clients that connect to a proxy that then connects to the server.)
"Tax preparation software eliminates errors your[SIC] may make...." From IRS home page.
What about sourcing issues as well? As I understand it, when building a cluster, the software that you run has to either be custom or custom modified in order to take advantage of the cluster. With Linux/OSS this seems like it wouldn't be too much of a problem, as (providing the requisite skill is available) you could 'simply' modify existing applications.
However, with Windows and Windows software you often do not have that option. Is the management of processes entirely handled at the OS level? It seems like that might be somewhat inefficient, as opposed to having the program handle at least part of the management. If not, are there ANY applications that are designed for a Microsoft clustering environment?
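To make the "custom modified" part concrete, here is a small Python sketch of the kind of restructuring a program usually needs before a compute cluster helps: the work is cut into independent chunks, each chunk is computed separately, and the partial results are combined. All function names here are made up for illustration, and the "nodes" are just ordinary function calls on one machine; on a real cluster each chunk would be shipped out through something like MPI.

```python
def split_range(start, stop, parts):
    """Split [start, stop) into `parts` contiguous chunks of near-equal size."""
    base, extra = divmod(stop - start, parts)
    chunks = []
    lo = start
    for i in range(parts):
        # The first `extra` chunks each take one leftover element.
        hi = lo + base + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def partial_sum_of_squares(chunk):
    """The work one node would do on its own slice, with no shared state."""
    lo, hi = chunk
    return sum(n * n for n in range(lo, hi))


def cluster_sum_of_squares(limit, nodes):
    """Scatter the range, compute each piece, then reduce the partial sums."""
    chunks = split_range(0, limit, nodes)
    partials = [partial_sum_of_squares(chunk) for chunk in chunks]
    return sum(partials)


print(split_range(0, 10, 3))
print(cluster_sum_of_squares(1000, 16))
```

A program that can't be cut up like this (because every step depends on the one before it) gains nothing from extra nodes, whatever OS they run, which is why having the source to restructure matters.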
Then, point out the scads of Beowulf clusters and Linux/Unix based systems.
Finally, inform the rep and your management that you've chosen to use the more cost effective, higher performance and standardized choice...Unix.
If management resists further, do a cost analysis. That'll convince them.
299,792,458 m/s...not just a good idea, it's the law!
Galileo: "The Earth revolves around the Sun!"
Score: -1 100% Flamebait
...to ask a question that I wanted to ask as well. Granted, this topic seems a little strange, considering the Linux cluster is in place, and it seems like the kind of question which encourages a Microsoft vs. Linux world domination showdown for grandmaster of the universe. It also shows a limited business sense on the part of the poster (why change something that works well when you can't afford a replacement?).
Right now a coworker and I are looking at pricing and configuring a fault-tolerant cluster for a client who runs Windows 2000 and Exchange 2000. They're a bit paranoid, so they've decided they want a cluster. We've tried to educate them on exactly what a Microsoft cluster can and can't do, so it's difficult to understand exactly what they want (basically an entire network exactly like Microsoft's own, but for $1000).
Pricing on a two system cluster is around $50,000. Buying two copies of Exchange and Windows Advanced Server will total $20,000. Then there's the hardware costs. For our client, they've specifically requested this, so they're ready to pay.
My question to Whamo is: are they really taking the Microsoft rep seriously? If they have to pay software costs for their new cluster, that's going to mean one of two things: either buying fewer CPUs to add to the cluster, or not doing the project at all, because just the software will put them over budget. With Advanced Server running somewhere around $4000, that's a lot per machine when Linux costs at most $5 to burn a CD after downloading it via the university's T1/T3/etc. Whamo says "it is running on old hardware and is basically used for dog and pony shows to get more funding and hopefully donations of higher-end systems" and to me that is your answer. If you can't afford the hardware you can't afford to buy Microsoft's software...
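A back-of-the-envelope version of that argument in Python. The $4000 and $5 figures are the ones quoted above; the budget and the per-node hardware price are made-up numbers purely for illustration, and the $5 is charged per node even though one burned CD would really cover the whole cluster.

```python
def nodes_affordable(budget, hardware_per_node, software_per_node):
    """Whole nodes a fixed budget buys when every node needs hardware plus software."""
    return budget // (hardware_per_node + software_per_node)


BUDGET = 20_000             # hypothetical project budget, in dollars
HARDWARE_PER_NODE = 1_000   # hypothetical price of one cheap node
WINDOWS_LICENSE = 4_000     # "somewhere around $4000" per machine
LINUX_MEDIA = 5             # "at most $5 to burn a CD" (worst case: per node)

windows_nodes = nodes_affordable(BUDGET, HARDWARE_PER_NODE, WINDOWS_LICENSE)
linux_nodes = nodes_affordable(BUDGET, HARDWARE_PER_NODE, LINUX_MEDIA)

print("Windows nodes:", windows_nodes)
print("Linux nodes:  ", linux_nodes)
```

Plug in your own budget and hardware price; academic or donated licensing would change the Windows side of the sum considerably.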
Also, there's MOSIX as well, but I don't have much experience with MOSIX and thus cannot comment on it.
From the link to MS, it appears that although the versions of Win2k are evaluation versions, the version of Visual C++ isn't (I don't give a stuff about the rest).
If you can cope without the optimising compiler of the professional edition (which appears to make little difference which optimisation method you choose), $7.95 (+$1.95 to get it to the UK) for Visual C++ seems like a bargain.
Steve.
Clustering can be accomplished in a couple of ways under Windows.
The first scenario is to use the built-in clustering technology in Windows Advanced Server/Datacenter. You must license each machine in the cluster, and it's not meant for distributed computing, just to provide a hot standby. Academic pricing is pretty aggressive, but the clustering only works so-so in the environments I have seen it deployed in.
The second scenario requires that you still buy Windows Server licenses (but not Datacenter, which is much more expensive than plain Server), and then use a third party clustering app like Veritas Cluster Server, or Stonesoft's products.
I still don't think you're getting the same type of functionality as you get from Beowulf, but these might be alternatives. The original poster didn't describe what kinds of apps he wants to eventually use.
Microsoft usually will give you the OS for free if you're a decent sized business. Especially if you're considering going Linux instead. (And yes, it has happened to me unfortunately). It's kind of like that crack dealer telling you the first hit is for free.
Ugh. I am putting together a win2k cluster at my job, and I have their Computational Clustering Technical Preview. For the most part it's a MS marketing scam (Here, build a cluster on these trial versions of win2k, and check out our awesome Visual C++. Oh, and here are some old versions of the stuff you really need to build a cluster.) It's not really that great IMHO. All you really need is MPI and a bunch of Windows boxes. MS likes to push the proprietary MPI Pro from MPI Software Technology.
The AC3 folks at Cornell have done quite a bit with these Windows clusters. I guess the parallel Matlab is pretty nifty, but there's no reason any of this stuff couldn't be done on a more mature platform.
Personally, my biggest turnoff is the fact that you need KVM switches wired up to each node...well that and the overhead of running the bloatware that is win2k. Compared to a 256 node headless linux cluster we built this just sucks. Hard.
1. Universities already get this from Microsoft all the time.
2. This is a given, except when closed source would be revealed publicly (which is also a given).
3. The very idea is ridiculous.
4. Pretty much a given as well (free, that is).
VMS clustering *IS* the best implementation. But Windows clustering is nothing like the VMS version. Microsoft got Dave Cutler to reimplement the core VMS internals, but they failed to hire the cluster and file system people from DEC.
My experience is with Windows NT 4 Server Enterprise Edition. MS chose to use a "shared nothing" implementation - which, IMHO, means they don't do clustering. There is no cluster-wide locking, software runs on one node at a time, there was a limit of two nodes, and it required a shared disk.
Microsoft can FUD anything into reality ... but ask them to give you the names of three other contacts at universities that have successfully implemented a Microsoft clustering solution who are willing to talk to you about this.
... hopefully you can get more than one person at a given university and cross-check their stories.
If someone else has 'successfully' implemented one, you can find out for how large a cluster, what classes of problems it's solving, and how long it stays up (oh, and how many times they've had someone break in). Presumably administrators of an existing installation at a university won't egregiously lie to you.
Then find three success stories at universities of Beowulf clusters and similar information for them. Side-by-siding these for your management should make the point verifiably clear. Ask the Microsoft sales rep for these contacts (don't settle for grossly postprocessed 'success stories') and maybe he'll disappear.
I think the point must be made to management that there is no better proof of concept than a working implementation that matches or comes close to your needs.
I was actually involved in presales on some hardware for a clustering project, which eventually ended up running on UNIX, but Microsoft was involved in the early meetings. They sent along the usual contingent of sell-em-anything sales droids, but there was an honest engineer with the group who said:
"It's not really accurate to refer to this arrangement of servers as a wolf "pack". A pack is an organized group with a leader working towards a defined goal using a plan with a visible, known structure. These servers just sort of hang out together. It would be more accurate to call them "wolf buddies"."
Microsoft didn't get to put their software on the solution, but I did tend to put more credence in what that particular engineer would tell me about the capabilities of Microsoft products.