Slashdot Mirror


SGI and SuSE Team Up on FailSafe for Linux

Syn Ack writes, "SGI and SuSE announced at CeBIT that they are going to team up to bring IRIS FailSafe to Linux. Linus is quoted as saying that this is a 'piece of the puzzle' that Linux is missing. Here is SGI's press release." The press release says FailSafe for Linux will be open source, but doesn't say under what license.

27 of 111 comments

  1. Linux/Solaris by fabjep · · Score: 2

    This kind of redundancy and task distribution could help Linux/Unix-type systems break further into the upper-level corporate server market, where Solaris currently seems to be the trend due to its robustness.

    --
    - learn mathematics - shoot dope -
  2. Uh Oh... by cronio · · Score: 2

    This is what all those nay-sayers of Linux have been waiting for. "But who needs FailSafe for Linux? I thought it was FAILPROOF! Isn't that why we should switch to it? I don't see Windows needing FailSafe..."

    BTW, for those of you who couldn't tell, that was a joke :P

    --


    My plan is to pimp before they realize I'm a jackass. Hit 'em hard and fast.
  3. bah. by Zurk · · Score: 2

    Red Hat's Piranha tools in 6.1 already allow clustering. SGI's FailSafe needs to work with a RAID drive array and does roughly the same thing as the Linux Virtual Server project: http://www.linuxvirtualserver.org/
    SGI's stuff is available for download if anyone is interested at http://oss.sgi.com/projects/sgilinux11/download/1.2-latest/ISO/

    1. Re:bah. by Syn+Ack · · Score: 3

      Piranha, aka LVS, is NOT the same thing as FailSafe. LVS is more like a Cisco LocalDirector. FailSafe or MC/ServiceGuard (HP-UX) is for protecting applications like Oracle, whereas LVS is more for network services like Web and SMTP/POP servers.

      I specialize in High Availability for a consulting firm here in Toronto so I am as close to an expert on these topics as you can get. I use MC/ServiceGuard when protecting databases, backup programs or anything that isn't network based. I use hardware load balancers like ArrowPoint, Big/IP, or Cisco LocalDirector when I have to cluster and load balance Web servers or mail servers.

      If you had read the information about failsafe you would have figured this out.

      It pays to inform yourself before opening your mouth.

      :)

      Paul
      ---
      Syn Ack.

  4. Re:A dumb question by Zurk · · Score: 2

    ok. that is a dumb question..:) to be brief :
    The high availability of a virtual server can be provided by using a tool to monitor network service availability and server nodes. The "heartbeat" code currently provides heartbeats between two node computers over a serial line and UDP. IP takeover is done by ARP spoofing. In other words,
    [a] a daemon sits monitoring heartbeat packets coming from the servers [master and slaves]
    [b] when the master goes down, the heartbeats from the master stop.
    [c] one of the slaves takes over by ARP spoofing the master's IP address.
    more info :
    http://www.linuxvirtualserver.org/HighAvailability.html
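
    For anyone who wants steps [a] to [c] in concrete form, here is a minimal Python sketch of the monitoring logic only. It is not the actual "heartbeat" code: the class name, the timeout, and the rule of promoting the alphabetically first live slave are all assumptions made for illustration, and the ARP takeover itself is left as a comment.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class HeartbeatMonitor:
    """Hypothetical monitor: tracks last heartbeat per node, picks a takeover node."""
    master: str
    timeout: float  # seconds of silence before a node is presumed dead (assumed)
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record(self, node: str, now: float) -> None:
        """Step [a]: note that a heartbeat packet from `node` arrived at `now`."""
        self.last_seen[node] = now

    def is_alive(self, node: str, now: float) -> bool:
        """A node is alive if it has been heard from within the timeout."""
        seen = self.last_seen.get(node)
        return seen is not None and (now - seen) <= self.timeout

    def check(self, now: float) -> Optional[str]:
        """Steps [b] and [c]: if the master is silent, promote a live slave.

        Returns the new master's name if a failover happened, else None.
        """
        if self.is_alive(self.master, now):
            return None
        candidates = sorted(
            n for n in self.last_seen
            if n != self.master and self.is_alive(n, now)
        )
        if not candidates:
            return None  # nobody healthy to take over
        self.master = candidates[0]
        # A real system would now claim the master's IP address (the ARP
        # spoofing step); this sketch only records the decision.
        return self.master
```

    Passing the clock in as `now` instead of calling the system clock keeps the sketch deterministic, so the failover rule can be exercised without waiting for real timeouts.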

  5. NO. (Re:DoS Protection?) by Forge · · Score: 2
    "Could this sort of thing be used as protection against Denial of Service attacks?"

    Not really. The classic DDoS attack (AKA what took down Yahoo) simply has no defense, apart from perhaps having separate sites under different names with the same data on board.

    What most people miss about those DDoS attacks is that they didn't actually overload the servers. On the contrary, they loaded down the pipes so much that the servers sat idle for hours.

    See Cringely's latest rant for more details (not from him but from a letter writer). The only practical protection is to secure machines to prevent them from becoming zombies in someone else's DDoS army.

    --
    --= Isn't it surprising how badly I spell ?
  6. This will do wonders by Forge · · Score: 3

    This will do wonders for Linux availability and scalability.

    That however is just touching on the obvious part. Less obvious is that this will let stuff written for Linux scale to the upper limits of business computing in very short order.

    How is that you ask ?

    Linux has been ported to the IBM mainframe in such a way that a single S/390 box can run thousands of copies of Linux, each doing dedicated tasks. The resources available to each can be adjusted to whatever the kernel supports (i.e. 64 GB of RAM, 2 TB of storage, etc.) or what's needed for that particular operation (1 MIPS on the image server, 3 on the Web servers and 20 on the database server).

    Add this in and you start to see a really terrifying scenario where Linux is able to scale to tomorrow's web service tasks and very little else can. By that I mean when Computing takes off in the 3rd world the way TV has. When Bandwidth becomes cheaper and more abundant. You are talking about 20 Million 800x600 two way Videophone conversations at the same time.

    Nobody has the horsepower to play traffic cop in that situation now. But a Linux mainframe scaled to beyond today's limit will. Being Linux will simplify the development process since the developers can all have it on the desktop too.

    As for the licensing: SGI isn't completely clueless. They put XFS under the GPL to get it into the kernel and avoid a fuss. This ccNUMA stuff will be at least partially in kernel space, so you can once again expect it to be GPLed.

    --
    --= Isn't it surprising how badly I spell ?
  7. High-Availability Linux Project by LanMan · · Score: 3

    Perhaps I've missed something, but the High-Availability Linux Project (http://www.linux-ha.org) already has similar goals for clustering and failover.

    Wouldn't it be better to put more community effort into a "real" OpenSource (GPL'ed) solution instead of trying to port Irix's existing product and possibly getting a half-baked license?

    1. Re:High-Availability Linux Project by HeUnique · · Score: 2

      Well, since this is a PORTING project and not a re-inventing one, SGI brings their experience with FailSafe.

      Of course, since Linux-HA and FailSafe are both open source, the HA guys can grab the source, look at it, and make HA better...

      --
      Hetz (Heunique)
    2. Re:High-Availability Linux Project by divec · · Score: 2
      Wouldn't it be better to put more community effort into a [...] GPL'ed solution instead of trying to port Irix's existing product and possibly getting a half-baked license?

      I think SGI have been quite good about licenses generally. Their journaling file system, XFS, is released under the GPL, as is NFS 3 and probably more of their stuff. So let's wait to see what license they use here before assuming it will be the sort that Sun try to fob us off with.
      --

      perl -e 'fork||print for split//,"hahahaha"'

    3. Re:High-Availability Linux Project by Niomosy · · Score: 2

      Any companies looking at SERIOUS HA/failover clusters are looking for one major thing: is it supported? They probably couldn't care less about the source code so long as they know that at 3am on Christmas morning they can call tech support and get someone on the phone who might have a clue as to what's going on. As long as companies can fork out $$ for "premier platinum gold ultra unlimited enterprise emergency super-dooper-tech-guy-living-at-your-data-center" levels of support they'll be happy. As long as it's supported well, companies will jump on it. Otherwise, they'll be less inclined (though I'm sure plenty of techies will push for it regardless).

      Slightly off topic... I recall Veritas announcing they were porting software to Linux and I'm hoping their HA software goes too. It was pretty good stuff. I'd be happy running VxVM, VxFS and FirstWatch (or whatever they're calling it now) on my Linux hardware.

    4. Re:High-Availability Linux Project by alanr · · Score: 2

      I'm one of the key contributors to the Linux-HA project, and the owner of the domain name linux-ha.org, etc. I've recently joined SuSE to help them bring this to market. It's way ahead of where we could be on our own.

      I think we'll do well, and expect to get this out much faster than we possibly could starting from scratch. We expect to use something like the Apache development model.

  8. FailSafe will be GPL'ed by DevTopics · · Score: 4

    A staff member of the SuSE team told me that the source for IRIS FailSafe will be GPL'ed. And if you take a look at http://www.heise.de/newsticker/data/odi-26.02.00-001/ you will notice that c't magazine writes the same, so this info is very probably accurate...

    --
    You found a sword: +4 damage, +5 moderator points
  9. Re:When will SGI open-source FSN? by ksheff · · Score: 2

    There is such a thing. It's called FSV. Check out http://fox.mit.edu/skunk/soft/fsv/.

    --
    the good ground has been paved over by suicidal maniacs
  10. Re:NUMA + Linux? by e · · Score: 2

    SGI is currently working on a port. Check out the information at SGI's OSS site. e;

  11. Re:The cold hard facts about SGI by rogerbo · · Score: 2

    Most SGI engineers are willing to "lend" you an SGI 5.3 IDO CD if you ask them in the right way and make the point that it's for a non-commercial hobby system. If you spent less time bashing SGI and approached them in the right way you might get somewhere!

    Although to some degree you have a point, since IRIX 5.3 is now obsolete and only useful on older (MIPS R3K) hardware they should just release it for free download.

  12. Great news; the community can't do it fast enough. by Anonymous Coward · · Score: 3
    I hope Redhat, Caldera and Turbo leap on this bandwagon. It's consistent with the goals of the linux-ha project, and gives it a tremendous kick-start.

    The capabilities of SGI's stuff aren't in any of the current Linux offerings. It has complete N-node cluster quorum, application monitoring and failover-restart capabilities. It also has the nice GUI that is necessary to make it look real. This is completely comparable to the Win NT/2k Microsoft Cluster Services (MSCS).
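
    The thread doesn't spell out how FailSafe decides quorum, so take this as a rough Python sketch of the general idea behind N-node quorum, assuming a plain strict-majority rule. The function name and the rule itself are assumptions, not SGI's implementation.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """Return True if this partition holds a strict majority of the cluster.

    Assumed rule: a partition may keep running services only if it can see
    more than half of the configured nodes, so two halves of a split cluster
    can never both believe they are in charge.
    """
    if cluster_size <= 0:
        raise ValueError("cluster_size must be positive")
    if not 0 <= reachable_nodes <= cluster_size:
        raise ValueError("reachable_nodes must be between 0 and cluster_size")
    return reachable_nodes > cluster_size // 2
```

    With this rule an even split of a four-node cluster leaves neither half with quorum, which is why real products usually add some kind of tie-breaker.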

    The people who have made whiny comments here really don't get it. It would have taken a year or 18 months for the community to come up with something flaky that would approach the capabilities that have just been dropped in our laps by the grace of [deity of your choice]. Adoption and exploitation of Linux/FailSafe, and getting it all going on IA64 this year, is critical to smacking Redmond around while they fumble with Win2k.

    I would hope more and more companies with locked-in proprietary software would release it like SGI, making it useful and acceptable to people who won't go down the proprietary road. We could still use some better storage solutions.

  13. I just came out of the SGI booth... by Pengo · · Score: 3


    I was just visiting the SGI booth here at CeBIT and I must say that I am very impressed with those guys. I was talking to one of the engineers who worked on the XFS port to Linux, and it was interesting to hear the "Engineers'" point of view on the entire release scenario of XFS into the GPL/Linux world. Apparently SGI is working very hard right now to get all of the copyrighted code out of the XFS source. To me it sounded like it started as a great marketing decision and the engineers had to kinda clean up after them a bit. :) (Sound familiar!?) :)

    They previewed for me XFS actually working on one of their Linux boxes running at the show.. (I must say, the new rack mount cases they have are SOOO sexy!!) :)

    But most importantly, I spent a bit of time talking to the engineers and I was very impressed with how they want to help the community. I felt like they were members of the community themselves, just getting paid for it. :) I must say that any mixed feelings I had about SGI previous to now have been turned around. (Who knows, maybe that's just the power of a 15 million dollar booth!) :)

    Has anyone had a chance to see the new Octane product they have under an NDA? (I am going to sign it just to get into the "Closed doors" and play with it...)

  14. Re:bogus, SGI can shove it linux doesnt need them by fgodfrey · · Score: 4
    *sigh* I don't even know why I bother to respond to crap like this. But I will anyway...

    a) SGI can do nothing right: so I guess switching to Linux is wrong? Making a very high percentage of the machines on the Top500 list is wrong? Um, ok...

    b) They have crappy unscalable hardware: So I guess Onyx II Infinite Reality Graphics are crappy? Hate to break it to you, but, while they may be a bit pricey, there ain't nothin' much faster. As for unscalable, 512 processors isn't scalable? Please. I run Irix on a machine with 512 processors and 196 gigs of RAM. Can Linux do that? Other than Cray, and Intel's one-off for ASCI, does anyone make anything bigger? Granted we (I work for SGI, in case you couldn't tell) are selling Cray, but the T3E has been sold in configurations of 1800 processors and the architecture scales even further. I think that qualifies as "scalable".

    c) Inferior OS with no features Linux doesn't have: Pass me whatever *you're* smoking please. How about a journaling file system that is production ready? Scalable to 512 processors? ccNUMA support? Runs Alias|Wavefront applications that produce probably at least half the special effects you see on TV/movies? I'm sure there are more, but I don't feel like coming up with them. Now don't get me wrong - Linux is a great OS, but that doesn't mean that Linux lacks no feature found in Irix.

    d) Public commercial company: RedHat. VA Research. Need I go on?

    e) Secret motives to steal the genius from Linux: And just how would we do that even if we wanted to??? All we would succeed in doing is getting everyone upset with us and ending up with a proprietary version of Linux. Where, exactly, would that get us besides bankruptcy court? If it were up to us, we'd probably insert massive scalability features into Linux like, say, support for 512 processor SSIs. But the Linux community would never accept those changes, so we simply won't make them until the community will. Trust me, SGI is far more interested in playing by the rules than I'll bet most "Linux companies" out there.

    If you want a company that keeps mumbling about contributions to Linux/Open Source and doesn't deliver, think of Sun, not SGI.

    --
    Go Badgers! -- #include "std/disclaimer.h"
  15. has anyone actually read the Iris page ? by cjmilne · · Score: 2

    "IRIS FailSafe runs in a cluster environment"

    OK, this I can appreciate. But the next sentence makes me wonder how useful it will be :

    "In the event of a failure IRIS FailSafe automatically fails over applications from one system in the cluster to the other."

    so if i understand correctly, if an application fails, Iris makes sure that the failure is spread out over the whole cluster. Distributed failing ? Interesting approach .....

    CJM

  16. Support policies, not companies by divec · · Score: 2
    SGI wants you to think that they care about opensource, but really they don't.
    What you say about SGI does sound disturbing. However, I would urge people to "support policies, not companies". SGI are releasing some good GPLed software which will help the free software community. We should praise this and use their GPLed software. From what you've said, it sounds like they are also ripping off some of their customers. If it's true, then this behaviour should be condemned. It's not hypocritical to give both praise and condemnation to a single company for different actions. What is hypocritical is to support the bad actions of a company just because you like something else which they are doing.
    Remember, public companies have a legal obligation to make money. This means that they will act with (enlightened?) self-interest. This means most companies will at different times act in ways which are good or bad from our point of view. It's not like with humans, where personality comes into it. All companies have the same selfish personality, just reacting differently because they are in different situations.
    --

    perl -e 'fork||print for split//,"hahahaha"'

  17. Re:The cold hard facts about SGI by rodgerd · · Score: 2

    SGI wants you to think they actually care about opensource but really they don't.

    Hmm. That'll explain why they contributed a journaling filesystem to the Linux kernel under the GPL, then.

  18. Linux is aiming too low! by Macka · · Score: 2

    I personally think that the Linux community is aiming too low here. High Availability failover services are just about to become yesterday's technology. Take a look at where Compaq are taking their Tru64 Unix clustering.

    ... A cluster "system" disk, containing a common /usr for all systems (each cluster member has its own root and swap, also on shared storage).

    ... Cluster Common Filesystem. All filesystems mounted on any cluster member appear in the mount tables on all systems. Even filesystems on private buses (eg: CD-ROMs)

    ... Context Dependent Symbolic Links, eg: /etc/{memb}/blah/... where {memb} is mapped to the cluster member ID. From a member's perspective the filesystem structure adheres to tradition, when in reality system-specific parts of the filesystem are held independently.

    ... Install the OS once and the Cluster software once. Adding new cluster members (out of the box, with no installed OS) takes only 10 minutes.

    ... Install an application only once and all members can run it.

    ... Cluster member numbers factored into PID numbers (init is no longer PID 1), creating unique cluster-wide PIDs. Helps in cluster process management, but more importantly, paves the way for future advances in "process" failover between cluster members. IMHO this is the holy grail for future cluster technology.

    ... DLM (distributed lock manager) out of the box. Applications like Oracle Parallel Server should be a lot easier to build, run and maintain in future.
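
    To make the context-dependent symbolic link item above concrete, here is a tiny Python sketch of the path expansion it describes. Only the {memb} token comes from the post; the function name and the "member<N>" directory naming are assumptions for illustration and may not match what Tru64 really does.

```python
def resolve_cdsl(path: str, member_id: int, token: str = "{memb}") -> str:
    """Expand each `token` in `path` into a per-member directory name.

    Hypothetical naming: member 2 sees "{memb}" as "member2". Every cluster
    member can then open the same logical path and land in its own private
    copy of the system-specific files.
    """
    if member_id < 0:
        raise ValueError("member_id must be non-negative")
    return path.replace(token, f"member{member_id}")
```

    For example, under these assumptions `resolve_cdsl("/etc/{memb}/blah", 2)` gives `/etc/member2/blah`, while a path without the token comes back unchanged.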

    There are a good number of other features, but this is enough to get the point across. There is a big difference between what is "called" clustering in the UNIX world right now (which is not much more than fast hot standby failover) and what clustering was meant to be. VMS has had it for years. Compaq's Tru64 UNIX is on the cusp of getting it (first production quality release is TruCluster v5.0a, due I believe within a month or two).

    THIS is what Linux Clustering needs to be aiming for. Not playing catch up with existing failover technology, because that will soon go the way of the dinosaurs.

    Macka

    1. Re:Linux is aiming too low! by autechre · · Score: 2

      > a cluster "system" disk...common /usr

      You can do this with Coda.

      > all filesystems mounted on any node appear to all others...

      Now that's cool :) Got me there.

      > Context Dependent Symbolic Links

      OK, don't think we have that now, but it doesn't sound incredibly hard to do...

      > Install the OS once and the cluster software once...

      Put an NFS server on one of the nodes, serving "/". When you get a new client, fire it up with Tom's rootboot, fdisk the new disk, mount the local drive and the NFS share, and cp -afr. Adjust /etc/init.d/network, chroot, LILO, reboot.

      > Install any application once, all members can run it

      As long as you have a shared /usr or /opt or whatever, that's pretty much implied (so long as all nodes are running the same kernel, C libraries, etc...which they really would be).

      > Cluster member numbers...

      It sounds like Mosix may be doing something along these lines, but I admit that I'm not entirely certain (yet). I'm also not certain about that last thing you mentioned (DLM), I just wanted to point out that some of these are doable today with Linux (some, like Coda, are not "finished"...but what ever is? :)
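
      On the cluster member numbers point, the idea of folding a member ID into the PID can be sketched in a few lines of Python. The bit layout here (member ID in the high bits, local PID in the low 19 bits) is invented for illustration and is not taken from Tru64 or Mosix.

```python
MEMBER_SHIFT = 19  # hypothetical: the low 19 bits hold the node-local PID
LOCAL_PID_MASK = (1 << MEMBER_SHIFT) - 1


def make_cluster_pid(member_id: int, local_pid: int) -> int:
    """Combine a member ID and a local PID into one cluster-wide PID."""
    if member_id < 0:
        raise ValueError("member_id must be non-negative")
    if not 0 <= local_pid <= LOCAL_PID_MASK:
        raise ValueError("local_pid does not fit in the low bits")
    return (member_id << MEMBER_SHIFT) | local_pid


def split_cluster_pid(cluster_pid: int) -> tuple:
    """Recover (member_id, local_pid) from a cluster-wide PID."""
    return cluster_pid >> MEMBER_SHIFT, cluster_pid & LOCAL_PID_MASK
```

      With member IDs starting at 1, init on each node gets a different cluster-wide PID and none of them is 1, which matches the "init is no longer PID 1" remark in the parent post.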

      --
      WMBC freeform/independent online radio.
  19. Re:Isnt this mosix? by autechre · · Score: 2

    No.

    Mosix is a clustering technology which is more similar to--yes, your favorite--Beowulf. Except that Mosix is basically, as my friend puts it, "SMP Writ Large" :) The people who maintain Mosix call it a "fork and forget" cluster, because basically what it does is to distribute processes between nodes. It's not as special-purpose as Beowulf, and doesn't need to have things specially coded/compiled for it to work (of course, Beowulf will likely get better performance, IF you take the time to tailor your app to it, and if your app was "embarrassingly parallel" to begin with).

    This is more of a failover technology, i.e. it's not really a "cluster" in the sense you're thinking. It's more than 1 machine, yes, but they're there to provide high availability. Basically, if one machine goes down, another will take over for it.
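
    As a toy illustration of what "distribute processes between nodes" can mean, here is a Python sketch that greedily hands each new process to the currently least-loaded node. This is an assumed, simplified policy and not Mosix's actual algorithm (which, as far as I know, also migrates processes that are already running); the function name and the cost numbers are made up.

```python
from typing import Dict, List


def place_processes(costs: List[float], nodes: List[str]) -> Dict[str, List[int]]:
    """Assign each process (by index) to the node with the least load so far.

    `costs[i]` is an assumed CPU cost for process i. Ties go to the node
    listed first.
    """
    if not nodes:
        raise ValueError("need at least one node")
    loads = {node: 0.0 for node in nodes}
    assignment: Dict[str, List[int]] = {node: [] for node in nodes}
    for index, cost in enumerate(costs):
        target = min(nodes, key=lambda node: loads[node])
        assignment[target].append(index)
        loads[target] += cost
    return assignment
```

    The caller just submits work and never names a node, which is roughly the "fork and forget" experience described above.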

    You can get something similar by going here:

    http://linuxvirtualserver.org

    They have patches and instructions for setting up a nifty webserver HA cluster, which makes use of apps like mon, heartbeat, and fake (at least 2 of which are Debian packages, which makes my life easier :)

    I'm now building a cluster out of low-end machines, and I'm going to try to run both Mosix AND VirtualServer :) Maybe I'll try this SGI thing when it comes out, too; can't look bad on a resume...

    --
    WMBC freeform/independent online radio.
  20. Re:Is it wise to have multiple HA/FT for Linux? by alanr · · Score: 2

    There are several ways to do this:

    Hire those who are responsible for other alternatives (that's what SuSE did with me and a few others) :-)

    Produce a superior product sooner, and put it out under the right license terms. Work on including the important industry players. We're working on this strategy now...

    Now, this is not to say that alternatives are bad, because the Next Great Breakthrough couldn't happen without alternatives.

  21. Re:Not necessarily by AugstWest · · Score: 2

    Linux should acquire HA/FT, no doubt, but Linux should have ONE VERY ROBUST HA/FT and not three or four or five not-very-much-useful HA/FT.

    Kinda silly suggestion, really. No offense intended, but this SGI solution is not a "not-very-much-useful" solution, it's a tried and proven solution.

    There are many routes to take to a HA system, and merging them all into one is going to a) stifle individual development (since a lot of open-source projects are for the developers to develop as well as the code), b) limit our choices and c) I don't really have time to come up with a "c)", but just an "A)" and a "B)" would look silly.

    "THREE" different implementations is a) not an outrageous number, b) not even beginning to reveal the real number of options when you call into play the hardware and other software solutions for a HA system and c) I've got that "c)" problem again.

    Linux is no longer an infant, but it's still too early to start cutting off its options as it works its way into adolescence. Give it time to experiment. There's room for a lot of projects.