SGI and SuSE Team Up on FailSafe for Linux
Syn Ack writes, "SGI and SuSE announced at CEBIT that they are going to team up to bring Iris FailSafe to Linux. Linus is quoted as saying that this is a "piece of the puzzle" that Linux is missing. Here is SGI's press release." The press release says FailSafe for Linux will be open source, but doesn't say under what license.
This kind of redundancy and task distribution could help Linux/Unix-type systems break further into the upper-level corporate server market, where Solaris currently seems to be the trend due to its robustness.
- learn mathematics - shoot dope -
This is what all those nay-sayers of Linux have been waiting for. "But who needs FailSafe for Linux? I thought it was FAILPROOF! Isn't that why we should switch to it? I don't see Windows needing FailSafe..."
:P
BTW, for those of you who couldn't tell, that was a joke
My plan is to pimp before they realize I'm a jackass. Hit 'em hard and fast.
Red Hat's Piranha tools in 6.1 already allow clustering. SGI's FailSafe thing needs to work with a RAID drive array and does roughly the same thing as the Linux Virtual Server project: http://www.linuxvirtualserver.org/
SGI's stuff is available for download, if anyone is interested, at http://oss.sgi.com/projects/sgilinux11/download/1.2-latest/ISO/
ok. that is a dumb question..:) to be brief :
The high availability of the virtual server can be provided by using a tool to monitor network service availability and server nodes. The "heartbeat" code currently provides heartbeats between two node computers through a serial line and UDP heartbeats. IP take-over software is provided by using ARP spoofing. In other words,
[a] a daemon sits monitoring heartbeat packets coming from the servers [master and slaves]
[b] one master goes down, the heartbeats stop from the master.
[c] one of the other slave(s) takes over with ARP spoofing of the master's IP address.
more info :
http://www.linuxvirtualserver.org/HighAvailability.html
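For anyone who wants to see steps [a] to [c] spelled out, here is a minimal Python sketch. It is only a toy model under stated assumptions: the class name, the node names and the 3-second timeout are made up for illustration, timestamps are passed in by hand, and no real heartbeat packets or ARP traffic are involved. It is not the actual heartbeat or LVS code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class HeartbeatMonitor:
    """Toy model of steps [a]-[c]; hypothetical, does no real networking."""
    master: str
    slaves: List[str]
    timeout: float = 3.0  # seconds of silence tolerated (assumed value)
    last_seen: Dict[str, float] = field(default_factory=dict)

    def beat(self, node: str, now: float) -> None:
        # [a] record a heartbeat packet received from a node
        self.last_seen[node] = now

    def alive(self, node: str, now: float) -> bool:
        seen = self.last_seen.get(node)
        return seen is not None and now - seen <= self.timeout

    def check(self, now: float) -> Optional[str]:
        """Return the node that took over, or None if nothing changed."""
        if self.alive(self.master, now):
            return None
        # [b] heartbeats from the master have stopped
        for candidate in self.slaves:
            if self.alive(candidate, now):
                # [c] a real setup would now claim the master's IP address
                # via ARP spoofing; this sketch only swaps the roles.
                self.slaves.remove(candidate)
                self.master = candidate
                return candidate
        return None


mon = HeartbeatMonitor(master="node-a", slaves=["node-b", "node-c"])
for name in ("node-a", "node-b", "node-c"):
    mon.beat(name, now=0.0)
mon.beat("node-b", now=5.0)   # node-a and node-c have gone quiet
print(mon.check(now=6.0))     # node-b
```

The first slave that is still sending heartbeats wins, which is the simplest possible policy; a production tool would also have to guard against both nodes believing they are master at once.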
Not really. The classic DDoS attack ( AKA what took down Yahoo ) simply has no defense. Apart from perhaps having separate sites under different names with the same data onboard.
What most people miss about those DDoS attacks is that they didn't actually overload the servers. On the contrary, they loaded down the pipes so much that the servers sat idle for hours.
See Cringely's latest rant for more details ( not from him but a letter writer ). The only practical protection is to secure machines to prevent them becoming zombies in someone else's DDoS army.
--= Isn't it surprising how badly I spell ?
This will do wonders for Linux availability and scalability.
That however is just touching on the obvious part. Less obvious is that this will let stuff written for Linux scale to the upper limits of business computing in very short order.
How is that you ask ?
Linux has been ported to the IBM Mainframe in such a way that a single s390 Box can run thousands of copies of Linux, each doing dedicated tasks. The resources available to each can be adjusted to whatever the Kernel supports ( i.e. 64 GB of RAM, 2 TB of Storage etc... ) or what's needed for that particular operation ( 1 mips on the image server, 3 on the Web servers and 20 on the Database server ).
Add this in and you start to see a really terrifying scenario where Linux is able to scale to tomorrow's web service tasks and very little else can. By that I mean when Computing takes off in the 3rd world the way TV has. When Bandwidth becomes cheaper and more abundant. You are talking about 20 Million 800x600 two way Videophone conversations at the same time.
Nobody has the horsepower to play traffic cop in that situation now. But a Linux mainframe scaled to beyond today's limit will. Being Linux will simplify the development process since the developers can all have it on the desktop too.
As for the Licensing. SGI isn't completely clueless. They put XFS under the GPL to get it into the Kernel and avoid a fuss. This ccNUMA stuff will be at least partially in Kernel space so you can once again expect it to be GPLed.
--= Isn't it surprising how badly I spell ?
Perhaps I've missed something, but the High-Availability Linux Project (http://www.linux-ha.org) already has similar goals for clustering and failover.
Wouldn't it be better to put more community effort into a "real" OpenSource (GPL'ed) solution instead of trying to port Irix's existing product and possibly getting a half-baked license?
A staff member of the SuSE team told me that the source for IRIS FailSafe will be GPL'ed. And if you take a look at http://www.heise.de/newsticker/data/odi-26.02.00-001/ you will notice that the c't magazine writes the same, so this info has a high probability of being right...
You found a sword: +4 damage, +5 moderator points
There is such a thing. It's called FSV. Check out http://fox.mit.edu/skunk/soft/fsv/.
the good ground has been paved over by suicidal maniacs
SGI is currently working on a port. Check out the information at SGI's oss site.
Most SGI engineers are willing to "lend" you an SGI 5.3 IDO CD if you ask them in the right way and make the point that it's for a non-commercial and hobby system. If you spent less time bashing SGI and approached them in the right way you might get somewhere!
Although to some degree you have a point: since IRIX 5.3 is now obsolete and only useful on older (MIPS R3K) hardware, they should just release it for free download.
The capabilities of SGI's stuff aren't in any of the current Linux offerings. It has complete N-node cluster quorum, application monitoring and failover-restart capabilities. It also has the nice GUI that is necessary to make it look real. This is completely comparable to the Win NT/2k Microsoft Cluster Services (MSCS).
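In case "N-node cluster quorum" sounds like marketing, the usual idea fits in a few lines of Python. This is the generic strict-majority rule, which I am assuming for illustration only; I have not seen FailSafe's actual quorum algorithm, and the function name is made up.

```python
def has_quorum(configured_nodes: int, reachable_nodes: int) -> bool:
    """True if this partition can see a strict majority of the cluster.

    Generic majority rule, assumed here for illustration; not taken
    from FailSafe itself.
    """
    if configured_nodes <= 0:
        raise ValueError("cluster must have at least one node")
    return reachable_nodes > configured_nodes // 2


# A 4-node cluster split 2/2: neither half may run the services.
print(has_quorum(4, 2))   # False
# A 5-node cluster split 3/2: only the 3-node side keeps going.
print(has_quorum(5, 3))   # True
print(has_quorum(5, 2))   # False
```

The point of requiring a strict majority is that two halves of a split cluster can never both decide they own the shared disks.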
The people who have made whiny comments here really don't get it. It would have taken a year or 18 months for the community to come up with something flaky that would approach the capabilities that have just been dropped in our laps by the grace of [deity of your choice]. Adoption and exploitation of Linux/FailSafe, and getting it all going on IA64 this year, is critical to smacking Redmond around while they fumble with Win2k.
I would hope more and more companies with locked-in proprietary software would release it like SGI, making it useful and acceptable to people who won't go down the proprietary road. We could still use some better storage solutions.
I was just visiting the SGI booth here at CeBIT and I must say that I am very impressed with those guys. I was talking to one of the engineers who worked on the XFS port to Linux, and it was interesting to hear the "Engineers" point of view on the entire release scenario of XFS into the GPL.
They previewed for me XFS actually working on one of their Linux boxes running at the show.. (I must say, the new rack mount cases they have are SOOO sexy!!)
But most importantly, I spent a bit of time talking to the engineers and I was very impressed with how they want to help the community. I felt like they were members of the community themselves, just getting paid for it.
Has anyone had a chance to see the new Octane product they have under an NDA? (I am going to sign it just to get into the "Closed doors" and play with it...)
a) SGI can do nothing right: so I guess switching to Linux is wrong? Making a very high percentage of the machines on the Top500 list is wrong? Um, ok...
b) They have crappy unscalable hardware: So I guess Onyx II Infinite Reality Graphics are crappy? Hate to break it to you, but, while they may be a bit pricey, there ain't nothin' much faster. As for unscalable, 512 processors isn't scalable? Please. I run Irix on a machine with 512 processors and 196 gigs of RAM. Can Linux do that? Other than Cray, and Intel's one-off for ASCI, does anyone make anything bigger? Granted we (I work for SGI, in case you couldn't tell) are selling Cray, but the T3E has been sold in configurations of 1800 processors and the architecture scales even further. I think that qualifies as "scalable".
c) Inferior OS with no features Linux doesn't have: Pass me whatever *you're* smoking please. How about a journaling file system that is production ready? Scalable to 512 processors? ccNUMA support? Runs Alias|Wavefront applications that produce probably at least half the special effects you see on TV/movies? I'm sure there are more, but I don't feel like coming up with them. Now don't get me wrong - Linux is a great OS, but that doesn't mean that Linux lacks no feature found in Irix.
d) Public commercial company: Red Hat. VA Research. Need I go on?
e) Secret motives to steal the genius from Linux: And just how would we do that even if we wanted to??? All we would succeed in doing is getting everyone upset with us and ending up with a proprietary version of Linux. Where, exactly, would that get us besides bankruptcy court? If it were up to us, we'd probably insert massive scalability features into Linux like, say, support for 512 processor SSI's. But, the Linux community would never accept those changes so we simply won't make them until the community will. Trust me, SGI is far more interested in playing by the rules than I'll bet most "Linux companies" out there.
If you want a company that keeps mumbling about contributions to Linux/Open Source and doesn't deliver, think of Sun, not SGI.
Go Badgers! -- #include "std/disclaimer.h"
"IRIS FailSafe runs in a cluster environment"
.....
OK, this I can appreciate. But the next sentence makes me wonder how useful it will be :
"In the event of a failure IRIS FailSafe automatically fails over applications from one system in the cluster to the other."
so if i understand correctly, if an application fails, Iris makes sure that the failure is spread out over the whole cluster. Distributed failing ? Interesting approach
CJM
Remember, public companies have a legal obligation to make money. This means that they will act with (enlightened?) self-interest. This means most companies will at different times act in ways which are good or bad from our point of view. It's not like with humans, where personality comes into it. All companies have the same selfish personality, just reacting differently because they are in different situations.
perl -e 'fork||print for split//,"hahahaha"'
SGI wants you to think they actually care about opensource but really they don't.
Hmm. That'll explain why they contributed a journaling filesystem to the Linux kernel under the GPL, then.
I personally think that the Linux community is aiming too low here. High Availability failover services are just about to become yesterday's technology. Take a look at where Compaq are taking their Tru64 Unix clustering.
... A cluster "system" disk, containing a common /usr for all systems (each cluster member has its own root and swap, also on shared storage).
... Cluster Common Filesystem. All filesystems mounted on any cluster member appear in the mount tables on all systems. Even filesystems on private buses (eg: CD-ROMs)
... Context Dependent Symbolic links, eg: /etc/{memb}/blah/... where {memb} is mapped to the cluster member ID. From a member's perspective the filesystem structure adheres to tradition, when in reality system-specific parts of the filesystem are held independently.
... Install the OS once and the Cluster software once. Adding new cluster members (out of the box, with no installed OS) takes only 10 minutes.
... Install an application only once and all members can run it.
... Cluster member numbers factored into PID numbers (init is no longer PID 1) creating unique cluster wide PID's. Helps in cluster process management, but more importantly, paves the way for future advances in "process" failover between cluster members. IMHO this is the holy grail for future cluster technology.
... DLM (distributed lock manager) out of the box. Applications like Oracle Parallel Server should be a lot easier to build, run and maintain in future.
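To make the {memb} symbolic link idea from the list above concrete, here is a rough Python sketch of how such a path could be resolved per cluster member. I am assuming that {memb} simply expands to "member" plus the member ID; the function name and the example path are illustrative only, not Compaq's implementation.

```python
def resolve_cdsl(link_target: str, member_id: int) -> str:
    """Expand the {memb} token for one cluster member.

    Assumption for illustration: {memb} becomes 'member<ID>'.
    """
    if member_id < 1:
        raise ValueError("member IDs are assumed to start at 1")
    return link_target.replace("{memb}", f"member{member_id}")


# The same link target resolves to a different directory on each member.
for member in (1, 2):
    print(resolve_cdsl("/etc/{memb}/blah", member))
# /etc/member1/blah
# /etc/member2/blah
```

Every member sees the same path name, but each one lands in its own private copy, which is how a shared filesystem can still carry per-system config.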
There are a good number of other features, but this is enough to get the point across. There is a big difference between what is "called" clustering in the UNIX world right now (which is not much more than fast hot standby failover) and what clustering was meant to be. VMS has had it for years. Compaq's Tru64 UNIX is on the cusp of getting it (first production quality release is TruCluster v5.0a, due I believe within a month or two).
THIS is what Linux Clustering needs to be aiming for. Not playing catch up with existing failover technology, because that will soon go the way of the dinosaurs.
Macka
No.
This is more of a failover technology, i.e. it's not really a "cluster" in the sense you're thinking. It's more than 1 machine, yes, but they're there to provide high availability. Basically, if one machine goes down, another will take over for it.
You can get something similar by going here:
http://linuxvirtualserver.org
They have patches and instructions for setting up a nifty webserver HA cluster, which makes use of apps like mon, heartbeat, and fake (at least 2 of which are Debian packages, which makes my life easier :)
Mosix is a clustering technology which is more similar to--yes, your favorite--Beowulf. Except that Mosix is basically, as my friend puts it, "SMP Writ Large" :) The people who maintain Mosix call it a "fork and forget" cluster, because basically what it does is to distribute processes between nodes. It's not as special-purpose as Beowulf, and doesn't need to have things specially coded/compiled for it to work (of course, Beowulf will likely get better performance, IF you take the time to tailor your app to it, and if your app was "embarrassingly parallel" to begin with).
I'm now building a cluster out of low-end machines, and I'm going to try to run both Mosix AND VirtualServer :) Maybe I'll try this SGI thing when it comes out, too; can't look bad on a resume...
WMBC freeform/independent online radio.
There are several ways to do this:
:-)
Hire those who are responsible for other alternatives (that's what SuSE did with me and a few others)
Produce a superior product sooner, and put it out under the right license terms. Work on including the important industry players. We're working on this strategy now...
Now, this is not to say that alternatives are bad, because the Next Great Breakthrough couldn't happen without alternatives.
Linux should acquire HA/FT, no doubt, but Linux should have ONE VERY ROBUST HA/FT and not three or four or five not-very-much-useful HA/FT.
Kinda silly suggestion, really. No offense intended, but this SGI solution is not a "not-very-much-useful" solution, it's a tried and proven solution.
There are many routes to take to a HA system, and merging them all into one is going to a) stifle individual development (since a lot of open-source projects are for the developers to develop as well as the code), b) limit our choices and c) I don't really have time to come up with a "c)", but just an "A)" and a "B)" would look silly.
"THREE" different implementations is a) not an outrageous number, b) not even beginning to reveal the real number of options when you call into play the hardware and other software solutions for a HA system and c) I've got that "c)" problem again.
Linux is no longer an infant, but it's still too early to start cutting off its options as it works its way into adolescence. Give it time to experiment. There's room for a lot of projects.