Slashdot Mirror


Sharing a SCSI Drive Between Two Boxes Using Linux?

yppasswd asks: "I'm looking for a (cheap) solution for filesystem sharing between two linux servers and, since the target is just redundancy, I've come to the following idea: two SCSI controllers, one per machine, with different IDs (say 7 and 6) sharing the same disk. Only one of them would mount the disk, the other is just ready in case of failure. I've googled this around, and I've found many different opinions (Yes, no, perhaps, don't do it or it'll explode,...) but nobody saying 'Ok, I've tried this and here is what happened...'. Suggestions are welcome, but keep in mind that many other solutions (Fiber Channel, SSA, NFS mounts, various network filesystems) were already rejected because they were either too expensive, unreliable or not supported under Linux."

112 comments

  1. Two boxes?? by eggstasy · · Score: 2

    Why two separate boxes? Why not just RAID them?
    Call me ignorant, but I don't understand. :(

    1. Re:Two boxes?? by Wiwi+Jumbo · · Score: 2

      I'm not positive, but I think the idea is to give two redundant servers one set of data. So if Server A goes *poof*, Server B can continue using the same data and keep going.

      I believe large systems separate the data onto a whole separate server and have the processing done on redundant machines.

      I.e., you connect to the website; depending on load you're passed to one of multiple webservers; the webservers connect to an internal fileserver for pages and then pass them on to you. So if one of the webservers goes down, the others can keep going without any loss of content.

      Though in large systems they probably also have redundant data servers...

      I'm guessing he's looking for a way to do this on the small scale.

      But really, what the hell do I know?

      --
      Wiwi
      "I trust in my abilities,
      but I want more then they offer"
    2. Re:Two boxes?? by Wolfrider · · Score: 1

      Sounds weird to me. The disk itself is more likely to fail than the SCSI controller or the entire server.

      I don't see why you couldn't just dup the hardware, disk and all, and just have a hot-spare server ready to go if you need it.
      .

      --
      .
      == WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??
    3. Re:Two boxes?? by extropalopakettle · · Score: 1

      If server A dies, server B needs to be able to pick up right where A left off. The shared drive allows shared state.

  2. You need a pigtail by bihoy · · Score: 2, Informative

    Basically it's a cable that has two cables spliced together and three connectors. They are generally used to share SCSI devices between machines in a cluster.

    1. Re:You need a pigtail by bihoy · · Score: 4, Informative

      Actually on IBM's site it's called a Y cable. They run about $250.

      External SCSI Cables for the RS/6000

  3. IEEE 1394a by Hungus · · Score: 2, Interesting

    Firewire is both inexpensive and reliable.

    --
    Bad Panda! No Bamboo for you! In matters of importance ACs will not be responded to. Want to say something critical,OK
    1. Re:IEEE 1394a by groovemaneuver · · Score: 1

      I could be way off here, but the last time I looked into this, every bit of info I could find indicated that the IEEE1394 spec didn't have a provision for multiple systems being connected to the same storage components (i.e. hard drives, CDRW, etc).

      If I'm wrong, please correct me, as I'd really like to get something like this going. Maybe what's needed is some sort of Firewire SAN switch. Hmmm....

    2. Re:IEEE 1394a by Hungus · · Score: 1

      Every fire wire device can act as its own host, so it is entirely dependent on the drivers of the computers you are connecting. According to the spec you can do this just like you can network two computers together with firewire.

      --
      Bad Panda! No Bamboo for you! In matters of importance ACs will not be responded to. Want to say something critical,OK
    3. Re:IEEE 1394a by groovemaneuver · · Score: 1

      Is there currently support for this sort of setup in Linux/BSD?

      Personally, I have not seen a 1394 enclosure with more than one port, with the exception of the SANCube. But this seems to be a Mac-centric device.

      I suppose I should stop posting and just do some research...

    4. Re:IEEE 1394a by drinkypoo · · Score: 2
      I don't know if it's done now or not. I don't think I've ever heard of anyone doing networking through firewire without additional hardware. 1394b is supposed to include a peer to peer specification, this might not work properly until then.

      The connection isn't the problem in SCSIland anyway, the problem is actually software-related. Dual-attach is easy, you just connect to both ends and use different SCSI IDs on each controller. If you want to access narrow SCSI devices, make sure all HAs use an ID of 7 or less.
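
      A minimal sketch of that ID rule, in Python. The function name and the 0-15 range assumed for a wide-only bus are additions for illustration; the post itself only gives the "7 or less" constraint when narrow devices are on the chain. It checks host adapter IDs only, not collisions with the disks' own IDs.

```python
def check_initiator_ids(host_ids, narrow_devices_present):
    """Return True if the host adapter IDs are usable on a shared bus.

    host_ids: SCSI IDs assigned to the host adapters (e.g. [7, 6]).
    narrow_devices_present: True if any narrow device sits on the chain,
    in which case every adapter must use an ID of 7 or less.
    """
    ids = list(host_ids)
    # Two adapters on the same bus must never share an ID.
    if len(set(ids)) != len(ids):
        return False
    # Assumption: a wide-only bus allows IDs 0-15; narrow limits them to 0-7.
    limit = 7 if narrow_devices_present else 15
    return all(0 <= i <= limit for i in ids)
```

      With the IDs from the original question, check_initiator_ids([7, 6], True) passes, while giving both adapters ID 7 does not.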

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    5. Re:IEEE 1394a by Omega996 · · Score: 1

      ieee1394 devices are peer to peer; the problem isn't with the storage devices on firewire, it would be having an operating system that supports shared filesystems. I've been looking for something reasonably priced for a while; it's all pretty expensive...

    6. Re:IEEE 1394a by Omega996 · · Score: 1

      There are a lot of Firewire enclosures that have two ports - just have a look around. Even the cheapie units typically have two ports. Want to know what makes the SANCube special? The software it comes with to allow computers to share the open volume (the AccelWare bit). This seems to be most of the cost of the SANcube; the software alone is $895US per user... here's a link to the developer:
      http://www.attotech.com/fcaccelware.html

    7. Re:IEEE 1394a by groovemaneuver · · Score: 1

      OK, so as soon as I finished posting, I did a search for firewire enclosures. Wouldn't it figure? Every single freaking box had two ports. That'll teach me to post without researching...

      So if it's mostly a software issue, would something like a firewire RAID box and GFS/OpenGFS work?

  4. my 2 cents by m0rph3us0 · · Score: 2, Interesting

    In theory this *SHOULD* work; where I see it failing is at switchover. Personally, I would look at hardware that is less likely to fail, i.e. a Sun, IBM, etc. solution (yes, it's expensive). Otherwise, I would just keep a spare computer on hand: when things die on box A, remove the external SCSI connector from box A, plug it into box B, and mount the drive. Or possibly rig up an electronic A/B switch that trips on a signal from machine B, so that when machine B can no longer contact machine A, it flips the switch and mounts the drive. Keeping them both connected to the chain at once will probably lead to serious problems if machine A comes back, or if some other weird thing happens, as it is bound to. I would definitely recommend a switching solution over having them both connected to the drive chain.

    1. Re:my 2 cents by slobberjaws · · Score: 1

      ooook mr. goldberg.....

      i personally think the guy with the pigtail idea is on the right track....

  5. Check the Adaptec line by Hyped01 · · Score: 3, Informative
    Many of the Adaptec cards support that. One identical card in each machine, and instead of terminating the end of the SCSI run, you run it to the second machine, configure the card and drivers, and viola! You are done!

    The way the manual reads, it seems it should work in all supported OS's, but I cannot confirm that.

    Rob

    --

    WebMaster:
    BinFeeds
    XXX Thumbnailed Image Newsgroups but

    1. Re:Check the Adaptec line by Anonymous Coward · · Score: 1, Funny

      viola!

      tuba!

    2. Re:Check the Adaptec line by sporty · · Score: 2

      if you could throw a tuba, i'd be impressed. guy was probably dodging one.

      -s

      --

      -
      ping -f 255.255.255.255 # if only

    3. Re:Check the Adaptec line by Anonymous Coward · · Score: 0

      Tubas rule. Esp. the four-valve rotary type!

    4. Re:Check the Adaptec line by photon317 · · Score: 2


      A lot of SCSI cards support this, and it does work with almost any OS. The problem is that "works" means that both hosts can see and access the block device. This doesn't in any way provide any of the synchronization necessary to support a shared filesystem. For that, you need software like Global FileSystem. Apparently GFS was a GPL project, then it went commercial with Sistina, but there's now an OpenGFS Project that's picking up from the last GPL release and trying to make it work well.

      In any case, the hardware is easy; lots of hardware supports multiple hosts hitting a block device. The hard part is some sort of shared filesystem and/or block-level locking stuff.

      --
      11*43+456^2
    5. Re:Check the Adaptec line by Anonymous Coward · · Score: 0

      But the original poster doesn't need them both mounted concurrently, so this is the perfect solution for him.

  6. Cable length. by Trusty+Penfold · · Score: 0, Interesting


    You would need to make sure that the 2 cables are exactly the same length. If they aren't then you'll run into two problems.

    1) The obvious one; the signals will not arrive at the host computers or the disk at the same time. When the signal is going from the disk to the PC, this may not be a problem. When the signal is going from the PCs to the disk it is. If the difference introduces a delay of more than the time it takes to write 1 byte, then that information will be smeared across 2 bytes on the disk.

    If you're saving pictures this will result in a blurred image (This is Joke! It will actually corrupt all files)

    2) Less obvious, if the cables are different lengths then the signals may interfere when they meet at the controller. If a peak in the signal interferes with a trough in the other, then this will also result in incorrect data being written to the disk.

    1. Re:Cable length. by Anonymous Coward · · Score: 0

      Are you drunk?

    2. Re:Cable length. by Anonymous Coward · · Score: 0

      Bzzt. Sorry please try again. You clearly have no idea what the question is. And even if you did, your answer is way off. In other words, you're a fucking moron. Learn to read.

    3. Re:Cable length. by n9hmg · · Score: 1

      I hope you're just kidding. What you do is connect both systems to the disk array, using the card in each system as the terminator for the other. The standby peer LEAVES THE DRIVES ALONE until the active peer becomes unreachable, then mounts the drives. In higher-end systems (maybe even lower... I just don't know how it works there), the card standing by actually talks to the active one, and down there on that level, they can ensure that only one adapter is using the disks. If the active adapter answers a poll, the standby refuses to become active.
      Some drives, at least IBM SSA-attached ones, have 4 connections to each drive, bus A and B, port 1 and 2, I think was how they were named. These were usually attached multipath, so there was an SSA adapter on each end of the bus, in the same system. The second bus was either to a second pair of adapters in the same system, or in another system (or were unused). Serious redundancy, and with 4 paths to each physical disk, bus waits really weren't significant on reasonable-sized arrays. Anyway, as I said, you can do just the next level down with commodity SCSI hardware.
      There are ways to have independent systems share physical disk concurrently, but last time I worked on one (mid-y2k), it was pretty kludgy and unreliable.

  7. Bad Idea... by clifyt · · Score: 3, Informative

    As a musician, this is a common practice in my field -- one drive for sounds that the sampler can access as well as the computer in the mix.

    Under ideal circumstances, SCSI can deal with multiple masters. I believe the SCSI standard allows for a lock on the drive until one is finished and then it releases the lock so that the next machine can access it. However, in practice most drives don't deal well with this. I've seen good SCSI drives killed because of conflicting signals...all because the musician got impatient waiting for his computer to write a sound file and let his sampler pull these sounds.

    Again, these are much more plug and play than a Unix box will be, but the idea is still the same. If the SCSI driver on even one box ignores the lock the other master has, you've killed the drive.

    Hope that was some help...I shouldn't even talk about drives as I just killed the one for my site. From about 20k of hits a to 0k...all because I screwed up my own backup. Ok, back to trying to recover my Ext2FS partition...

    clif

  8. Reliability of the disk by 0x0d0a · · Score: 2

    I kind of wonder whether the server or the hard drive is more likely to fail, though.

    The way I see it, the only thing this avoids is kernel failure. If the server fails, you're better off having something to restart it and a single box. If the *disk* fails (IMHO, by far the most likely, unless you're running a pretty flaky bit of server software), you're out of luck either way.

    It seems like it might be a better idea to get two drives and one server (or two servers with two drives).

    Good to see a good "Ask Slashdot", too. :-)

    1. Re:Reliability of the disk by polymath69 · · Score: 2
      The way I see it, the only thing this avoids is kernel failure.

      I don't think this proposal avoids even that. If Server 1 and Server 2 are connected to Disk 1, and Server 1 goes belly-up, there is bound to be information in RAM cache that Server 1 didn't get to write back to Disk 1 before it went down, even if it syncs every millisecond, which would be horrible performance-wise.

      So when S2 detects the crash of S1, D1 is unclean and an FSCK is required before D1 can be cleanly remounted. That's going to take a while.

      So, the common case is software crashes, and the uncommon case is disk failure. This solution doesn't seem to save you much, if anything, in the common case, and saves you nothing at all in the catastrophic case. I think you'd be better off with one server which can be quickly rebooted, easily debugged.

      --
      I don't want to rule the world... I just want to be in charge of mayonnaise.
    2. Re:Reliability of the disk by SuiteSisterMary · · Score: 2
      So, the common case is software crashes, and the uncommon case is disk failure. This solution doesn't seem to save you much, if anything, in the common case, and saves you nothing at all in the catastrophic case. I think you'd be better off with one server which can be quickly rebooted, easily debugged.

      Journaled filesystem, or, better yet, a filesystem that doesn't report a 'successful write' until the bits are on the hard disk.

      Similar to an ACID database; transaction logs vs data files.

      --
      Vintage computer games and RPG books available. Email me if you're interested.
  9. Try it, the only way. by bluGill · · Score: 3, Informative

    You have to try it yourself. SCSI supports it, and technically nothing can be SCSI compliant if it won't work this way, but in practice... That is something else. I won't be at all surprised if one device fails to work that way, but a different one from the same manufacturer does. So test your setup before you go to production.

    I've met people who claim to have done this, and even gone so far as half the disk used by one computer, half by the other (separate partitions), but those start to get into friend of a friend, so I wouldn't put much faith in my claim that it has been done.

    Scsi cabling is still some of a black magic, but use good cables, no pig tails, good termination, and you should be fine. There should be no need to watch for same-length cables, just get the termination right, and follow the rules. Note that I said should; SCSI cables are still mystical enough that I wouldn't call you a fool for following rules that appear technically bogus.

    1. Re:Try it, the only way. by GigsVT · · Score: 2, Insightful

      Scsi cabling is still some of a black magic,

      No, it isn't.

      It's all normal signal theory.

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
    2. Re:Try it, the only way. by bluGill · · Score: 2

      In theory there is no difference between theory and reality. In reality there is.

      SCSI cabling is much better than it used to be, and it should obey all the laws of physics (unfortunately we do not know all the laws of physics, though we should know enough to solve this).

  10. Hook it up. Should work. by QuietRiot · · Score: 2

    Hook it up. Should work. You could ask Ancot Corporation about this... They sent me a free booklet a while ago. "The basics of SCSI" You may still be able to get one on their website www.ancot.com.

    Not all controllers or drives may be very excited about this setup, but I believe the standard says it should work. I know I've read about people doing it before (not sure about OS or hardware tho). Plug and chug. You should be able to find some combination that works, and since you aren't trying to mount at the same time from 2 machines - no problem.

    You may even be able to mount different disks to different machines on the same chain - share a scanner, tape drive, cdrom, or Zip drive even. Just give it a shot man....

  11. We're just not quite there yet by crstophr · · Score: 2, Interesting

    Solutions for sharing a disk amongst servers usually entail a SAN or fiber connection to the disks, and some really expensive software (read veritas volume manager and veritas cluster FS) to handle it all.

    In the linux world take a look at GFS.

    http://www.sistina.com/products_gfs.htm

    The hardware they use to make it work will probably support what you're trying to do. Your typical off the shelf (At Frys) SCSI controller won't do the trick.

    For what you're trying to do I highly recommend you work out some kind of sync between two networked machines with separate storage. If you're running a database it gets really fun. HINT for MySQL, script the replay of the SQL "update" log on the hot standby machine.

    Good luck. My company just spent 150k+ on a sun/veritas solution to do exactly this. Our storage is all SAN.

    --Chris

    1. Re:We're just not quite there yet by ivan256 · · Score: 2

      Good luck. My company just spent 150k+ on a sun/veritas solution to do exactly this. Our storage is all SAN.

      No offence, but your company spent too much. Your typical off the shelf PCI scsi adapter, in fact ANY scsi adapter that can set its own ID works in a multi-host setup. If somebody from Sun or Veritas told you otherwise, they were lying. There are multiple companies (including the one I work for) that make software to manage the setup under linux. In fact our software is included in Debian 3.0 and RedHat advanced server, so you don't even need to spend any money on software unless you want bells and whistles (Graphical setup, support, NFS lock maintenance across failover...). It even comes with scripts to interoperate with your favorite database server.

    2. Re:We're just not quite there yet by ZenJabba1 · · Score: 1

      Does the word "DOH!" mean anything to you? This is available in free linux implementations with restricted subsets, and for a reasonable price for full implementations...

      Doh Doh Doh doh Doh Doh doh!

      --
      `find / -name "*your_base*" -exec chown us:us {} \;`
    3. Re:We're just not quite there yet by Tet · · Score: 2
      Your typical off the shelf PCI scsi adapter, in fact ANY scsi adapter that can set its own ID works in a multi-host setup.

      While I'm sure you already know this, there's a big difference between having a "working" multi-initiator setup, and having one that doesn't corrupt your data. It's fairly easy to get it working in an active/passive setup. But to get both nodes actively accessing the same SCSI devices requires a little more care. DG (now EMC) CLARiiONs were great at this, even if they were somewhat pricey...

      --
      "The invisible and the non-existent look very much alike." -- Delos B. McKown
    4. Re:We're just not quite there yet by ivan256 · · Score: 2

      While I'm sure you already know this, there's a big difference between having a "working" multi-initiator setup, and having one that doesn't corrupt your data. It's fairly easy to get it working in an active/passive setup. But to get both nodes actively accessing the same SCSI devices requires a little more care.

      Agreed that once you want active/active you need to be a little more careful, but the cheaper SCSI adapters are actually more likely to work in these cases, because they have no cache on them. Once you're using RAID boxes or host based RAID, multi-initiator SCSI becomes a much harder problem. Almost every vendor that has redundant RAID controllers in their storage box does it correctly (Clariion included), but you really need to be careful with PCI RAID adapters. Most of them won't work as expected no matter what the configuration.

  12. I've done it although with different systems by FattMattP · · Score: 5, Informative
    I've done this in the past (early 90s) although with different systems. I play keyboards and used to own a Kurzweil K2000 which has a 25 pin SCSI port on the back. I had an external case that contained a 44MB Syquest drive and a 120MB (or something equally small) SCSI HD drive which was connected to my K2000. Rather than putting a terminator on the end of the connection, I hooked it to the SCSI port on my Amiga 3000. Since the K2000 used MSDOS format and the HD was formatted as such, I used the CrossDOS program to read and write to the drives from the Amiga. Both the K2000 and the Amiga could access the drive at the same time. I ran the setup like this for my music for over a year with no problems.

    I guess I'm saying that I don't see why it wouldn't work on today's GNU/Linux systems.

    --
    Prevent email address forgery. Publish SPF records for y
  13. network drive? by fist_187 · · Score: 2

    you've given very little background on your setup. where most people would try to spread one computer's data over several drives, you are trying to spread one drive over multiple computers. i have no idea why you would want to do that, but this is what i can offer:

    why don't you just find an extra computer and make an NFS server? the reason that you are not finding much information on sharing a SCSI drive is that there are a lot of better ways to do it. what sort of speed are you looking for? a 100Mbps network can deliver data comparable to having the drive attached locally, and you won't need an incredibly fast computer to serve it.

    --
    Somewhere on this page I have hidden my signature.
    1. Re:network drive? by toast0 · · Score: 2

      The reason he wants to share the drive over multiple computers is for redundancy... if the master computer locks up for some reason, the slave can become the new master and have the same dataset. Using an NFS server won't solve the problem.

      Also 100 megabit/second ethernet does not give speed comparable to having the drive attached locally... unless the drive only sends data at 10 megabytes/second (which is kinda slow these days)
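
      The unit conversion behind that comparison, as a quick Python sketch. It is the raw line rate only; protocol overhead (not modeled here) lowers the usable figure further, which fits the 10 megabytes/second ballpark above. The function name is made up for illustration.

```python
def mbit_to_mbyte_per_s(mbit_per_s):
    """Convert a line rate in megabits/second to megabytes/second."""
    return mbit_per_s / 8  # 8 bits per byte


# 100 megabit Ethernet, before any protocol overhead:
print(mbit_to_mbyte_per_s(100))  # 12.5
```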

  14. It *will* work, if... by shoppa · · Score: 2
    SCSI dual-porting absolutely works. It's been around since day one (well, since SCSI-1 in the mid-80's) and hardware-wise it's all nice and dandy.

    The difficulty you will have will be the software. You sound like you're not planning to have the same drive mounted on both systems at the same time, and that's good, and since you're using a Unix it sounds relatively simple to make sure that a drive is fully dismounted from one box before you mount it on the other. But very very bad things happen if, by some chance, both boxes do decide to mount a filesystem at the same time. If you have any sort of automatic failover between systems you have to be really really certain that the other box won't spring back to life and start writing to the filesystem while the other guy has it mounted. Supposedly reliable "failover" systems have this happen all the time if not designed correctly - remember, 99% of your failures will be software failures, not hardware failures, so if you design a hardware failover system without taking into account the flaky custom-written software you're making a mistake.

  15. Why you'd do this by Aniquel · · Score: 5, Insightful

    Yes, it's rare - but very valuable. Where I used to work we had about 4TB of hard disk space. Every disk (there were many - all SCSI, around 10GB each) was double-tailed. This allowed each disk to be connected to two controllers, and then each controller was connected to two mainframes. It's a redundancy thing - you're protecting against disk failure, host failure, and controller failure. For all those screaming NFS - all that does is move the problem. What happens when the hard disk controller in the NFS server dies? This way (ideally) if say somebody spills coffee on a hard disk controller (talk about a PITA), the disks are automatically switched over to the other controller. No down time.

    1. Re:Why you'd do this by shird · · Score: 2

      What happens when someone spills coffee on the HD? Wouldn't it be better to use something like NFS to mirror a system? This way you not only have two controllers, but also two HDs. Wouldn't that make more sense?

      --
      I.O.U One Sig.
    2. Re:Why you'd do this by Aniquel · · Score: 2

      Sorry, apparently I left out a detail. All disks were part of a RAID 5 set (many different RAID sets). So we're protected against coffee pourance on hard disks, hard disk controllers, and (oh dear god please don't let this happen) on the mainframes.

  16. Multi initiated SCSI Array? by jsimon12 · · Score: 2

    Sounds like you just want to multi-init, I know the Adaptec stuff will let you do this (2940 and above?). Just look on google for multi initiated scsi or this

  17. What is the point? by maunleon · · Score: 1

    I mean honestly, what is more likely to die, a PC or a hard drive? I don't think this has been thought through all the way. It would at least have to be a shared raid array, not just a single shared drive. Preferably hot-swap.

    A hell of a lot of good your two redundant servers will do you if your hard drive decides to take the day off.

    Depending on the data shared, it may be safer to replicate and set up some sort of load balancer.

    1. Re:What is the point? by ivan256 · · Score: 2

      I mean honestly, what is more likely to die, a PC or a hard drive? I don't think this has been thought through all the way. It would at least have to be a shared raid array, not just a single shared drive. Preferably hot-swap.

      Actually, what's most likely to fail is the software, but NIC failures, accidental cable pulls, and other hardware failures do happen. The trick is setting it up so there is no single point of failure. There are lots of papers available on the web that describe how to do this, and many of them talk about how to do it cheaply. You can set up two systems with no single point of failure (Redundant shared SCSI driver, host based RAID, dual NICs in each system, remote power control) and automatic failover for under $2500.

  18. Been there, done that. by lotussuper7 · · Score: 1

    VAX/VMS (Ugh, a system I really hate, even more than Micro$oft) supported exactly the configuration you are talking about.

    1 SCSI bus, and 5 devices shared between the two systems. (Tapes, disks, CDs, etc. with each system using a different SCSI ID.)

    Of course, the systems had distributed locking (also done over the SCSI bus) allowing full access to all the devices at the (nearly) same time.

    In terms of hardware, the only things you need to watch for is going over the allowed bus length and no extra terminators along the bus.

    But, all of this is moot, as disk prices have fallen so fast and so far, that it doesn't make much sense to worry about all the operational problems you will have to solve. This was a reasonable solution when a 1 gig drive was a few thousand dollars, but today a multi-gig disk can be had for pocket change.

    You probably would be better off using some form of shadowing disk software between the two systems. Backup, operational simplicity and support are a lot more important today than the cost of just one extra drive.

    --
    ----- Lotus Super 7 - A real car. :-}
    1. Re:Been there, done that. by chunkwhite86 · · Score: 1

      How can you hate VMS? Talk about a powerful and reliable system - VMS had it all! And it ran on the best hardware at the time - DEC Alpha. I like linux as much as the next /.'er, but Linux has nothing on VMS when it comes to reliability and ease of use.

      --
      I'd rather be a conservative nutjob than a liberal with no nuts and no job.
  19. Possible, Easy, Reliable, and -FREE-! by ivan256 · · Score: 5, Informative

    You're talking about making a shared storage HA-cluster. The company I work for makes software to do exactly that, and for typical applications, you can get it for *FREE*. Go to oss.missioncriticallinux.com and look at kimberlite. It makes sure that only one system is using the disk at a time, and automatically switches to the other machine when one breaks. It's also well documented, and the engineers that work on it are good about responding to questions over e-mail.

    If you use debian, installation is as easy as apt-get install kimberlite. If you want to use it as an NFS server, you'll need to buy the commercial version for full support, but it's not very expensive.

    Ignore the people in this thread who are talking out of their asses and saying multi-host scsi doesn't work well. They just didn't know how to set it up right or have never actually tried it. It's very common, and people have been using it for decades.

    1. Re:Possible, Easy, Reliable, and -FREE-! by ader · · Score: 1

      Ignore the people in this thread who are talking out of their asses and saying multi-host scsi doesn't work well. They just didn't know how to set it up right or have never actually tried it.

      As someone who once worked for an also-ran in the Linux HA field, I can back this up. My impression is that the MCLinux people know what they're doing (although I haven't tried the product). I believe they recommend using remotely controllable power switches so that one server can kill the power to the other in the event of failover, ensuring that a dual-mounted filesystem cannot occur; this is a sign that they've thought about the overall solution.

      If you're going to implement a manual solution (and you're careful) then you don't need to worry about this (but HA vendor manuals are still useful for the hardware setup details). If you decide to employ failover software, get a good book on HA (Marcus/Stern is recommended) because there certainly used to be a lot of FUD amongst vendors.

      Yes, you want a journalling filesystem, otherwise recovery times could be horrendous.

      Ade_
      /

      --
      Big Bubbles (no troubles) - what sucks, who sucks and you suck
    2. Re:Possible, Easy, Reliable, and -FREE-! by ivan256 · · Score: 2

      I believe they recommend using remotely controllable power switches so that one server can kill the power to the other in the event of failover, ensuring that a dual-mounted filesystem cannot occur; ... If you're going to implement a manual solution (and you're careful) then you don't need to worry about this (but HA vendor manuals are still useful for the hardware setup details).

      Because we're talking about failure situations, there is no guarantee that a failed node will be well behaved. There absolutely MUST be some sort of I/O barrier preventing the failed node from corrupting your data. Other vendors use SCSI reservations, which can be just as effective, but are more difficult to work with. Bottom line: use either the power switch option, or the reservation option, but do something or you'll be sorry.

    3. Re:Possible, Easy, Reliable, and -FREE-! by Anonymous Coward · · Score: 0

      Presumably the "manual solution" involves shutting off the failed host, or better yet yanking its hot-swap connection to the bus.

  20. How to do the failover... by Ayanami+Rei · · Score: 3, Interesting

    Note: I have never tried this before. Try it on a non-production machine first!!! you have been warned...

    On the backup machine, write a script that repeatedly does the following actions:

    1) mounts filesystem on shared disk read-only
    2) if the mount fails because of an inconsistency, skip to 9
    3) checks the mtime of a file called /.watchdog
    4) determines if "too long" a period has gone by since that
    time... if not, go to 8
    5) remounts the filesystem read-write
    6) creates a file called "/.failover"
    7) starts the application assuming the other computer has died, stops this script loop
    8) umounts the filesystem
    9) sleep for a short period of time
    10) go back to 1

    The main machine does the following things in a loop:
    1) Update the date of /.watchdog
    2) sleep for a short time (shorter than the one in the above loop)
    3) Check for the existence of /.failover. If it exists, panic! This means the other machine decided to take over. Ideally you umount everything EXCEPT that disk and halt.
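
    A minimal sketch of one pass of the backup machine's loop, assuming the actual mounting, unmounting and application start are supplied as callables. The names backup_step, mount_ro, read_watchdog_mtime, umount and take_over are made up for illustration; only the order of the decisions follows the numbered steps above.

```python
from typing import Callable


def backup_step(
    mount_ro: Callable[[], bool],
    read_watchdog_mtime: Callable[[], float],
    umount: Callable[[], None],
    take_over: Callable[[], None],
    now: float,
    max_age: float,
) -> str:
    """Run steps 1-8 of the backup loop once and report what happened."""
    if not mount_ro():
        # Steps 1-2: the read-only mount failed, so sleep and retry later.
        return "retry"
    age = now - read_watchdog_mtime()  # step 3
    if age <= max_age:
        # Step 4: the primary touched /.watchdog recently enough.
        umount()  # step 8
        return "primary-alive"
    # Steps 5-7: remount read-write, create /.failover, start the application.
    take_over()
    return "took-over"
```

    The caller would sleep between passes (step 9) and stop looping once it gets "took-over".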

    Now, a better idea might be something like this:
    Create a small partition on the disk (1 cylinder) in addition to the shared partition.
    Have the main machine write timestamps directly into the partition (date +%s > /dev/hdz3 or something). The backup
    machine would read that directly rather than trying to
    synchronize on a file (whose mtime will only be updated when
    the main machine's buffer cache is flushed to disk).
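
    A rough sketch of the raw-partition timestamp idea, assuming a fixed-size record at offset 0 of the small partition. RECORD_SIZE and the function names are assumptions for illustration, and the path would be whatever device node the partition really has (/dev/hdz3 above is only a placeholder). The writer opens with O_SYNC so the write is pushed to the device. The reader here is a plain read, so on a real shared disk it would still have to get past its own cache somehow (O_DIRECT, for example), which this sketch does not attempt.

```python
import os

RECORD_SIZE = 512  # assumed: one sector holds the whole heartbeat record


def write_heartbeat(path: str, now: int) -> None:
    """Write the timestamp as a zero-padded record at offset 0, synchronously."""
    record = str(now).encode("ascii").ljust(RECORD_SIZE, b"\0")
    fd = os.open(path, os.O_WRONLY | os.O_SYNC)
    try:
        os.pwrite(fd, record, 0)
    finally:
        os.close(fd)


def read_heartbeat(path: str) -> int:
    """Read the record back and return the timestamp stored in it."""
    fd = os.open(path, os.O_RDONLY)
    try:
        record = os.pread(fd, RECORD_SIZE, 0)
    finally:
        os.close(fd)
    return int(record.rstrip(b"\0").decode("ascii"))


def primary_is_stale(path: str, now: int, max_age: int) -> bool:
    """True when the last heartbeat is older than max_age seconds."""
    return now - read_heartbeat(path) > max_age
```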

    Also, you may want to consider some way to avoid needing a script loop on the host machine; a custom device driver that fits into Linux's watchdog timer framework is probably better.

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
    1. Re:How to do the failover... by iamcadaver · · Score: 1
      Ok, I'm in the "I've tried this" category.

      I'm still trying to get it to work. The idea is to have two mail/dns/dhcp servers, one a failover for the other. The problem is indeed in the details. The filesystem cache gets in the way every time. You can't check the mtime of, say, /data/.watchdog. I've tried mount flags, I've tried tune2fs tricks, nothing seems to work.

      I still hack on the setup from time to time. Maybe I should _NOT_ use EXT3, but then journaling the metadata is EXACTLY what I want, so that in a catastrophe all I need to do is clear out the journal log before remounting rw. Maybe I should just adopt the tried and true serial line heartbeat monitors.... Even with that, the fun starts when the other machine decides to come back online, and steals both the drive and the IP back.

      Slashdot is spooky sometimes, I was thinking of working on the problem today, and then saw this thread. b)

      --
      Before I part with'em: two pennies weigh ~4.996+/-0.014g, have a zinc core, and the face of Lincoln. You can keep 'em.
    2. Re:How to do the failover... by SuiteSisterMary · · Score: 2
      5) remounts the filesystem read-write

      Actually:
      5a) kills power to the primary machine using a serial/networked power bar to avoid any possibility of the other computer doing something like trying to mount the FS
      5b) remounts the filesystem read-write

      Slightly cleaner.
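
      The ordering matters more than the details, so here is a small sketch of it. The two callables are hypothetical stand-ins for whatever drives the power bar and does the remount; the only point is that the remount never runs unless the power-off is confirmed.

```python
from typing import Callable


def fenced_takeover(
    power_off_primary: Callable[[], bool],
    remount_rw: Callable[[], None],
) -> bool:
    """Step 5a then 5b: only remount read-write after the primary is off."""
    if not power_off_primary():
        # Could not confirm the primary is dead, so leave the disk alone.
        return False
    remount_rw()
    return True
```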

      --
      Vintage computer games and RPG books available. Email me if you're interested.
    3. Re:How to do the failover... by ocelotbob · · Score: 1

      I'm not a guru in HA, but what about setting up a script in your inittab to check to see if the box is getting a heartbeat signal from the other system? If it is getting a heartbeat, then it'll know that it's the failover system, and to wait around until the other system flatlines.

      --

      Marxism is the opiate of dumbasses

    4. Re:How to do the failover... by iamcadaver · · Score: 1

      ..and when the power goes out, and both machines come up at the same time?

      There are so many details, too many devils. If you make one a preferred master, then have it STOMITH ( shoot the other machine in the head ) then it at least stops the race. STOMITH is usually a hardware device, killing the net connection or the power.

      --
      Before I part with'em: two pennies weigh ~4.996+/-0.014g, have a zinc core, and the face of Lincoln. You can keep 'em.
  21. consistency checks will take too long by Anonymous Coward · · Score: 1, Interesting

    This is probably not a good idea. Sure SCSI supports this in theory, however in the real world, where drives fail and servers crash, this will prove to be impractical.

    If availability is your goal, this is not the way to get it. Get a good journalling file system, hardware RAID and then just replace a drive if it fails. You will find that bringing your one server back online will be faster than managing the switchover when a primary server fails. If you're running a database, the issue will be even more pronounced as any switchover will require that the server perform a consistency check on the database as well.

  22. Make sure to use a STONITH device by Anonymous Coward · · Score: 0


    It's doable (look at the ha-linux project for others doing this) but make sure that you can enforce only one server using it at a time.

    Probably the most reliable way to do this is to use a STONITH device (short for Shoot The Other Node In The Head) which is used to cut power to the other node before mounting the disk yourself.

  23. The HD is the least reliable part - vermillion by Anonymous Coward · · Score: 0

    You want to share the least reliable part of a computer between two computers?

    Sigh.

  24. High-Availability File Server with heartbeat by Anonymous Coward · · Score: 0
  25. Does anyone else remember... by 3waygeek · · Score: 2

    Dan Lancaster's (of TTL Cookbook fame) column in Computer Shopper? He wrote quite a bit about on-demand publishing back in the 80s & early 90s, and talked quite a bit about "shared SCSI comm" -- basically a SCSI drive connected to both a computer & a printer.

    1. Re:Does anyone else remember... by RobKow · · Score: 1

      Don Lancaster, I do remember. And I remember him talking a lot about how his setup was an Apple II and a Laserwriter ;)

      You can still find him at http://www.tinaja.com/, wacky as ever.

  26. Be careful who you listen to by Gerry+Gleason · · Score: 5, Informative
    There are a lot of very wrong answers in the comments here, and some good ones. I haven't messed with this in detail since the SCSI-I days, but the spec was designed to support this from the very beginning. On the other hand, this configuration is pretty rare, so not all drives and host adapters are going to handle it properly (test the devices you want to use).

    As someone else said, you want to look at "multi-initiator" support. Since there's not much point to using SCSI if you can't interleave requests, you're going to be talking about "split transactions" where the initiator arbitrates for the bus, selects a target and sends a command and possibly data (write case) over the bus and then disconnects. Later, the target arbitrates for the bus, selects the initiator (hopefully the same one that sent the request), and sends data (read case) and status back. IIRC, SCSI-I didn't support tagged queuing and out of order returns, but later versions do. This has got to be negotiated just like synchronous transfer rate. I can think of lots of ways that this could be screwed up (typically in firmware) and never affect the single initiator case, so as I said, you have to test.

    If the drive fully and correctly supports the spec, it should respond correctly to requests from any initiator and keep everything straight when it agrees to handle tagged queuing. That means you should be able to use different parts of the disk for a filesystem for each host, as long as you keep everything straight. You can even have one host write and another read, or use some blocks on the disk to coordinate dynamic sharing, but all of that gets complicated quickly, so unless this is what you really want, it won't be worth it.

    A couple of comments implied that some music systems do this sort of thing, maybe between the sound recording system and a computer mixer/processor system. Doing this can't break the drive, but it certainly could hose up the format enough to make it unusable without a reformat (if you break the usage rules, that is).

    As to cables and such, SCSI is a bus, although you are allowed short taps from the bus to the drives/controllers (maximum is in the spec). If you have some sort of 'Y' cable that connects a host in two directions, you can't have more than one device inside the host (i.e. no drives inside the case connected to the internal port of the controller), and the internal cable has to be short enough (and of course no termination inside either). External drives and multi-drive modules will almost always have two connections for both ends of the bus, so just chain all the drives together and put the hosts on each end. Now you just have to be sure the total cable length is within spec (6 meters, I think).

    The final topic is why do it in the first place. Keep in mind that drives and power supplies are your most likely failure points in any case, so you want to mirror, or RAID. Mirroring with one drive in each box (or many pairs split between the boxes) would reduce the single points of failure pretty well. You could even have both boxes active and mirroring to different pairs sharing the load until there is a failure, then switch over. Manual switch over is probably safest and cheapest, just shut down the broken system (if not already hard crashed), and mount the other filesystems on the still working box. If you have confidence in your monitoring system, you could script this on certain events.

    It looked like some comments had good links to some multi-initiator stuff, or just google that as suggested (it helps when you know what to ask for), YMMV. Oh, one more thing to worry about: terminator power. Usually the controller supplies it to the bus, but it is very bad for more than one device or initiator to supply it. Of course, you also have to worry about still having it at both ends even if one of the machines is off or dead.

  27. no by wotevah · · Score: 2, Interesting

    I seriously doubt this. I never heard SCSI was sensitive to cable lengths (within spec of course). The data goes in a buffer anyway, it's not like it's written to the media on the fly.

    1. Re:no by King+of+the+World · · Score: 1

      No, like Dude! If you don't get enough data down the cable it will just not use half the disk. I just have my data on "rotation" so such a scenario will never happen!

  28. I wouldn't recommend it. by TheSHAD0W · · Score: 2

    First off, the hard drive is likely to be the weakest link in your setup; making two separate processors depend on the same drive won't give you a lot of redundancy for your money.

    Secondly, when the primary machine goes down, it may take cached disk information with it, so your secondary system will need to perform a fsck before mounting the drive, and the lag time probably won't help your situation.

    What I would recommend is two separate systems, each with its own IDE (or SCSI) drive, and a gigabit network adapter in each machine. (I'd recommend using this as a secondary to your uplink ports, to make security easier and keep the bandwidth open.) Have the primary machine mount the secondary's drive over the network and mirror everything as it's written.

  29. Set the ID of the SCSI cards to be different by iankerickson · · Score: 2

    This is an old idea. "Poor man's clustering" is what they call it.

    The essential trick that you may not think of yourself is to set the SCSI ID of the 2 SCSI host adapters to _different_ SCSI IDs. Most people forget this. Remember, the PCI SCSI you use takes 1 SCSI ID in the chain, even if it's on the motherboard. So if you connect 2 PC to the same SCSI chain, the ID of each PC's SCSI adapter needs to be different, otherwise it's no different than having two hard disks both set to ID 3.

    2nd, make sure you terminate both ends and put both PCs inside the termination.

    So your chain should look like this:

    T-P7-6-5-4-3-2-1-P0-T

    Where T is a Terminator, a number is a SCSI ID, and a P designates the SCSI adapter in a PC.
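
    For what it's worth, the rules above are easy to check mechanically. This is just a sketch that parses the same T/P/number notation as the diagram; check_chain and the width parameter (8 IDs on a narrow bus) are names I made up, and a malformed entry will simply raise ValueError.

```python
from typing import List


def check_chain(chain: str, width: int = 8) -> List[str]:
    """Return a list of problems found in a chain like 'T-P7-3-P0-T'."""
    parts = chain.split("-")
    problems: List[str] = []
    # Both ends of the bus must be terminators.
    if parts and parts[0] == "T":
        parts = parts[1:]
    else:
        problems.append("no terminator at the start of the chain")
    if parts and parts[-1] == "T":
        parts = parts[:-1]
    else:
        problems.append("no terminator at the end of the chain")
    seen = set()
    for part in parts:
        if part == "T":
            problems.append("terminator in the middle of the chain")
            continue
        scsi_id = int(part.lstrip("P"))  # 'P7' and '7' both occupy ID 7
        if not 0 <= scsi_id < width:
            problems.append(f"ID {scsi_id} out of range")
        if scsi_id in seen:
            problems.append(f"duplicate ID {scsi_id}")
        seen.add(scsi_id)
    return problems
```

    Two adapters both left at the same ID show up as a duplicate, same as two disks on ID 3 would.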

    Good luck, and make sure you have enough goats! ;-)

    --
    Democracy. Whiskey. Sexy. Pick any two.
    1. Re:Set the ID of the SCSI cards to be different by Wolfrider · · Score: 1

      --Now THAT comment deserves a Mod Up!! I even understood his termination example on first glance.
      .

      --
      .
      == WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??
  30. Horse pucky by itwerx · · Score: 2, Informative

    This is a troll!!!
    (Or the gentleman is painfully ignorant).
    Having done this myself in the real world I can say with complete authority that one should definitely use cards which support this configuration (e.g. Adaptec's). The reason being that these cards will actively negotiate which one has access to a given device at any particular time.
    If you don't have cards that support this (which I didn't, so I found out the hard way) the SCSI devices will get confused and hang if they're accessed by both cards at the same time. Interestingly enough it did work, I just had to be careful what I did on the two machines.
    (Better just to get the right cards and not have to worry about it constantly).

  31. removable disk? by Toraz+Chryx · · Score: 1

    Personally I'd be inclined to have a drive in a removable (hotswappable, of course) caddy and manually move it between machines if the box fell over..

    That or a full blown SAN. but I'm all-or-nothing like that :)

  32. An easy part and a hard part by dublin · · Score: 2

    Hooking up SCSI devices twin-tailed to a pair of servers is not exactly rocket science, it's done every day. But if you just do that, all it's good for is backup servers connected to the same disk.

    Keep in mind that although the electrical connections are OK (so long as only one thing is talking at a time on the SCSI bus), the filesystem is a different matter entirely: Without some sort of distributed lock manager, your data WILL get horked. Generally DLMs are part of larger packages like GFS, AFS/DFS, Coda, or Veritas ClusterFS. Tivoli's SANergy is probably the closest thing to a standalone product to do this, although there are others - I haven't looked at the market in nearly a year.

    Filesystem consistency may be a serious enough problem to keep this approach from even being valuable for backup servers: If one server goes down unexpectedly, it leaves the disk in a corrupted state, which must first be fixed with fsck or the like. If you have to wait for that anyway, then there's not a whole lot of advantage to all that extra cabling and the weirdness that accompanies SCSI length.

    Generally, the three best solutions today for this sort of thing are 1) cheap and easy: use external RAID boxes and just switch them over physically to a backup server, if required, 2) use iSCSI or other Storage over IP (SoIP) (or NAS, if you don't need performance) to allow disks to be easily reconnected, or 3) buy a fully virtualized SAN-type solution (which may be SCSI, Fibre Channel, or SoIP) that will allow you to re-connect everything in software - some of these can work with distributed lock managers.

    If you really want to do this sort of thing, do it right: check out FalconStor or DataCore, or HPAQ's VaporStor, I mean, VersaStor... :-)

    --
    "The future's good and the present is nothing to sneeze at." - Roblimo's last ./ post
    1. Re:An easy part and a hard part by Monkelectric · · Score: 3, Funny
      your data WILL get horked

      Great word man, I'm gonna add that to my lexicon :D

      --

      Religion is a gateway psychosis. -- Dave Foley

    2. Re:An easy part and a hard part by dublin · · Score: 2

      I wish I could claim credit for originating it. Although I heard it before, I think it's been enshrined as Mozilla Bug # 127856: "Huge bookmark file horks my profile - uses all system resources"

      A truly nasty bug, and one that continues to bite those that bounce back and forth between various Mozilla/Netscape derivatives foolishly thinking they can use the same profiles. Sounds reasonable enough, but it can't be done reliably today...

      --
      "The future's good and the present is nothing to sneeze at." - Roblimo's last ./ post
  33. Re:Hook it up. Should work. by 0x0d0a · · Score: 2

    Wow, the book is up to the fifth edition -- I have the second (IIRC) edition around somewhere. Plain blue cover...

  34. Been there, done that, hated it. by Black+Copter+Control · · Score: 2
    My experience was doing this with Solaris boxes... I'll say up front: The setup caused more failures than it prevented. Tread carefully.

    In our case, we used Fibre Channel, but SCSI doesn't see anything interesting about controller vs device, so you should be able to have multiple machines connected to one SCSI chain. Machines at the end of a chain should be properly terminated.

    We also used 'canned' failover software. It basically had a dedicated channel between the two boxes where they talked to each other and figured out who was up and who was 'active' (kinda like the protocol used by timed, the BSD protocol before ntpd). If the 'active' box died, then the backup box would take over as the server -- this included stealing the MAC and IP addresses and the disks.

    Obviously, if the backup machine thought that the primary was dead when it wasn't, then all hell would break loose (yes, I had it happen to me).

    Should you accept this mission, a journaling FS is obviously the better idea (faster FSCK before restarting the disks). -- and you REALLY want to make sure that the other machine is really down before the backup system grabs hold of the disks. IMHO, you're better off to err on the side of caution... Far easier to recover from the backup machine backing off from failover than trying to figure out what got destroyed by both machines writing to the same disks.

    My best suggestion is to find some hardware hack to allow the two machines to pull each other's reset lines low. That way you can avoid the pathological case where the primary machine stalls long enough for the secondary to think it's dead, then comes to life thinking that it's still primary (zombie servers -- appropriate for halloween night, don't you think?)...... Instant toasted disks.

    Beyond making sure you don't end up with zombie servers, there shouldn't be anything special for Linux to do... Just FSCK the disks and mount them.

    --
    OS Software is like love: The best way to make it grow is to give it away.
    1. Re:Been there, done that, hated it. by RobKow · · Score: 1

      That's because you were doing it without a STOMITH (shoot the other machine in the head) device. One of the simplest and most reliable is a computer-controlled power-switch. When one computer detects the failure of the other, it cuts the power before it takes over its duties. Voila, no more conflicts.

  35. Avoiding Single Point of Failure by korpiq · · Score: 2
    NFS mirrors are not real-time (as are RAIDs (1,4,5 at least)). You are not talking about the original problem, which implied targetting High Availability.

    Consider a situation where you have (a crude ASCII graph slashdot's lameness filter does not let me pass thru, depicting ~)

    • app servers A1..An, polling each other and sharing tasks
    • local networks N1..Nm, each connecting separate NICs from each server
    • file servers F1..Fo, one active, others ready to take over
    • SCSI wires S1..Sp
    • RAID1'd disks D1..Dq, all being mounted by any one file server at a time


    where n,m,o,p,q are integers bigger than one.
    Each of the above is independently connected to each device in the next group.

    Now take away all but one machine from each group (stupid luser access, sudden administrator movement, coffee pourance, spontaneous smoke escapitation event, divine intervention, anything you can come up with as long as it is considered a Fatal Failure on behalf of the concerned device). Does the system fail?
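
    To put the question in checkable form, here is a toy model, assuming (as stated above) that every device is independently connected to every device in the next group, so the system is up whenever each group still has at least one survivor. The function names and the group labels in the usage are mine, not part of any product.

```python
from typing import Dict


def system_up(survivors: Dict[str, int]) -> bool:
    """Up as long as every group still has at least one working member."""
    return all(count >= 1 for count in survivors.values())


def max_failures_survivable(sizes: Dict[str, int]) -> int:
    """Best case: failures land so that one member of each group remains."""
    return sum(n - 1 for n in sizes.values())


def guaranteed_failures_survivable(sizes: Dict[str, int]) -> int:
    """Worst case: all failures hit the smallest group."""
    return min(sizes.values()) - 1
```

    With three app servers and two of everything else, that gives up to 6 survivable failures if they are spread out, but only 1 guaranteed, since two failures in the same pair take the system down.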

    (Examining other setups of similar reliability is left as an exercise for the reader, except for that one who's already fed up with my style of writing.)

    This is what the original question is about. I find it quite interesting that such a setup apparently could be achieved with commodity hardware and Free software.

    Of Course you need off-this-machinery-and-rather-off-the-continent backup. It without the former, however, does not HA make.
    --

    I think, therefore thoughts exist. Ego is just an impression.
  36. Shared SCSI bus pitfalls by Rogan · · Score: 1

    Sharing discs on a SCSI bus between machines is quite possible - I
    have done it on both Sun/sparc & Linux/x86 machines. There are a
    number of things to watch out for when trying to do this...

    A SCSI bus is just that - a bus, which needs to be terminated at both
    ends. Each device on the bus must have a separate address. This
    includes the controller board - sometimes called the SCSI initiator.
    As supplied by the manufacturer a controller will normally be set to
    the highest numbered address on the bus - 7 for a narrow (8 bit) bus,
    15 for a wide (16 bit) bus. When connecting two controllers to one
    bus, you must change the address of one of the controllers.

    Things to check include:

    Can the initiator ID be changed on the controllers you are using (it
    can on the Adaptec 2940, I don't know about other boards).

    Can the controller & device driver cope with unexpected events on the
    bus ? eg. if one machine does a bus reset (perhaps during a reboot),
    does the other machine carry on ?

    Are both ends of the bus properly terminated ? If one machine is
    powered off, will it fail to correctly terminate its end of the bus ?

    It is possible for both machines to access the disc, and indeed having
    different partitions mounted on different machines will work, though
    throughput may be poor (think of what happens to the seek scheduling
    algorithms when another machine is also accessing the disc). I am not
    aware of any filesystem which will cope with two machines accessing it
    at the same time. Trying to do this is a great way to get a corrupt
    filesystem.

    It is possible to unmount a filesystem from one machine, & then mount
    it on the other. When doing this be very careful that the disc &
    filesystem caching doesn't mess things up. It's not just a matter of
    flushing the write cache on unmount - a read cache which persists
    through unmount then mount will also cause problems. If this cached
    data is wrong because another machine has changed what is really on
    the disc, filesystem corruption can result - I have seen this happen.

    Good luck !

  37. Paradox by sql*kitten · · Score: 2

    I'm looking for a (cheap) solution for filesystem sharing between two linux servers and, since the target is just redundancy, I've come to the following idea

    Before you spend a single dollar, ask yourself: if your system is important enough to require fault tolerance, why can't you spend money to get a professional solution? If your system isn't important enough to spend money on, then ordinary bidirectional file replication should be good enough for you. You could do it with rsync and ntpd in a few minutes, for free.

  38. DRBD network raid system anyone? by synq · · Score: 2, Interesting

    I'm building a heartbeat cluster to serve WebGUI pages and files via samba.

    This going to be presented at a congress for the Netherlands Network User Group November 13th (a mostly Novell and Microsoft NT association).

    I have been looking for a solution to mirror files between the two cluster nodes. SCSI is just too expensive for this, since low cost is one of the requirements. I've been trying to compile DRBD on my gentoo 1.3 systems but the 2.4 kernel isn't supported by the default DRBD distribution yet.

    Does anyone know about any other projects like these that actually work?

    --
    sig not found
  39. You will need GFS by j-turkey · · Score: 3, Informative

    If you're going to share a disk/fs between multiple machines, you will need a filesystem capable of performing proper file locking in order to avoid data corruption and race conditions.

    Global File System (aka GFS) can do this. I believe that it was originally developed under an OSS license, but eventually went commercial. There are rumors of a GNU/GPL GFS (called OpenGFS) but I don't have many details as to the maturity of the project, or any experience with it at all.

    I found GFS's learning curve to be pretty steep, but if I was able to set it up, I'm sure that you can work through it.

    Lastly, I have only used GFS with a SAN cluster, connecting multiple machines via fabric fibre channel (you might want to look into using a third box as a RAID host). I know that you are using a very different solution than I did, on a different budget -- so YMMV.

    I hope that this is helpful to you.

    --Turkey
    --

    -Turkey

  40. I've done it lots: It works by MrRobahtsu · · Score: 1

    I have a test system that's a homebrew with an ancient DEC SCSI box shared between two 1U VA Linux boxes. It was running with OpenGFS for a while so both boxes simultaneously mounted the partitions. We used nice ($199 ~1 year ago) sym53c8xx SCSI cards. They even have settings in the card BIOS to change the host ID and minimize bus resets for clustering. Nice.

    Now we only mount one at a time using FailSafe to detect failure and handle fail-over.

    If you really want reliability, though, you have to put the external storage behind an external (redundant, of course) RAID controller(s). Or just buy a Compaq cl380. They run Linux just great and everything is all set up.

    For testing the software, etc., use ieee1394 because it is MUCH less expensive than SCSI.

  41. My use for this... by Muad'Dave · · Score: 2
    would be having one giant honking disk shared between several machines, each using a different partition. That should work, right? With a 120GB disk, I could have 6 machines with their own 20GB partition. For compute-bound clusters, the I/O throughput would probably be ok.
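
    The arithmetic, as a quick sketch: equal_partitions is a made-up helper that splits the disk evenly and ignores partition-table overhead and the decimal-vs-binary GB question.

```python
from typing import List, Tuple


def equal_partitions(disk_gb: int, machines: int) -> List[Tuple[int, int, int]]:
    """Return (partition number, start GB, end GB) for an even split."""
    size = disk_gb // machines
    return [(i + 1, i * size, (i + 1) * size) for i in range(machines)]


layout = equal_partitions(120, 6)
for number, start, end in layout:
    print(f"machine {number}: {start}-{end} GB")
```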

    --
    Tiller's Rule: Never use a word in written form that you've only heard and never read. You will end up looking foolish.
  42. The real question is what you're trying to do... by Carpathius · · Score: 1

    I can't get from your question exactly what you're trying to accomplish. It *sounds* like what you want to accomplish is redundancy in the case of a server problem not related to the disk.

    The problem I have with this scenario is that server hardware problems are much less likely to occur than a disk problem, and I think you'd be *much* better served by using RAID and mirroring your disks than worrying about a non-disk problem in your server. Using RAID you come close to solving the problem of the disk being a single point of failure, and given that disk problems are more prevalent, it's generally a better choice than redundant machines.

    That's a smaller cost than buying a second server, and I suspect it'll give you better results, if for no other reason than what you're suggesting is a fairly non-standard configuration. (Regardless of whether or not it is supported.)

    That doesn't solve the possible problems, but it at least places the most likely single point of failure in the right place. To really solve the problem is to make totally redundant systems, but that gets much more expensive.

    Sean.

  43. ICP Vortex by Captoo · · Score: 1

    Check out the ICP Vortex cards. They are generally well supported by Linux. The part number will be in the format GDTwxyzRx. Any card with 6 in the x position (e.g. GDT8623RZ) supports clustering like you described. Only one server will see the drives at any given time. If the primary server fails, the secondary one will automatically take ownership of the drives.

  44. Man the level of bad advice by 1101z · · Score: 1

    The answer is yes, this is supported, even with a software RAID array of disks. This is what all the Linux High Availability stuff is about, meaning you could have a mirrored array of disks in case a disk fails and a second server in case a server fails. The only single point of failure is then the SCSI bus termination, but auto self-termination might fix that. As an AC already stated, there was an article at sysadmin journal [samag.com] about doing it with a single disk and using heartbeat. So all you really need is a disk, cables, and a serial crossover cable between the two systems. Make sure to use a journaled file system to save on fscks.

    --
    One day people will learn the folly of Winbloze, Linux Rules!
  45. High-Availability File Server with heartbeat by euph436 · · Score: 1
  46. NFS by j_kenpo · · Score: 2

    I'm curious as to why NFS was rejected, as it is supported in Linux, doesn't cost anything since it comes with Linux, and doesn't seem to have any issues other than security with RPC. We used to have a system with an NFS drive that about 4 people would play music off of, and we didn't have any performance issues....

    1. Re:NFS by chunkwhite86 · · Score: 1

      Perhaps you didn't have any performance issues, but keep in mind that on 100Mbit and especially on Gigabit Ethernet, TCP/IP has a LOT of CPU overhead. Add to that the packet overhead incurred by NFS (which is also a LOT) and pretty quick your performance is nowhere NEAR what it would be when using local storage. Particularly if you're using older machines, e.g. even a P3-733 is quickly up to 100% cpu utilization (that's TCP/IP's fault) when you have a Gig-E card running at full tilt... Once we see the TCP/IP stack implemented in hardware, only then will Ethernet be a contender in high performance network storage.

      Although you'll only likely notice these things in disk intensive and/or HPTC applications, it's always a Good Idea (tm) when choosing connectivity options to keep any protocol overhead to a minimum.

      --
      I'd rather be a conservative nutjob than a liberal with no nuts and no job.
  47. Linux-HA by pheared · · Score: 1

    www.linux-ha.org

    Lots of information on using shared storage with a bias toward setting up highly available clusters.

  48. Re:IEEE 1394a (Multiple-host File Systems) by MarcQuadra · · Score: 1

    But filesystems are designed for single-host access. You need to have an advanced database-like FS to connect multiple hosts, I currently don't know of any consumer-available file systems that can do this. Perhaps it could be hacked into linux with a 'token' for writing, bigger buffers, message passing between hosts (locking and cache concurrency), and a bevy of other things to make it work. A better solution IMO is to build a databaseFS layer or module and rework an advanced filesystem (reiserFS?) for multiple access. You'd still need a FS daemon running on each host to broker transactions without hosing the disk. I am not a developer, but this seems like a reasonable way to handle it.

    --
    "Sometimes, I think Trent just needs a cup of hot chocolate and a blankie." -Tori Amos on Nine Inch Nails
  49. Oracle Real Application Clusters by mrkrittman · · Score: 1
    This is what you need to work with Oracle 9i's Real Application Clusters.

    This is a technology that allows you to set up several commodity intel boxes (or solaris, or whatever) as a cluster, with a shared storage device to hold the data files. The clever bit is that it appears to all intents and purposes to be a single instance of the database, meaning apps don't have to be rewritten to take advantage of clustering.

    The kicker though is trying to source a shared storage unit for less than £50k. All quotes from Dell (our supplier) are for fibre-channel devices that cost a fortune, but I know deep down that we can accomplish this with a SCSI unit with simultaneous connections to each server. The Oracle RAC software takes care of the synchronisation between writes to the disks, so things shouldn't get out of sync.

    I'd be interested to hear if anybody has been able to source a shared storage SCSI unit, and in particular which brand etc. I'm trying to set up a low cost RAC cluster using Dell PCs, SuSE SLES-7 and the Oracle software, and I need the storage solution to be cheap as well.

    1. Re:Oracle Real Application Clusters by 1101z · · Score: 1

      Well, Oracle released some patches for Linux so that you could run a RAC cluster using IEEE 1394 (FireWire); they say it is just for low-cost testing. Of course FireWire limits your disk throughput to 50 megabytes per second, but if you are looking for ultra cheap, that is the way to go.

      --
      One day people will learn the folly of Winbloze, Linux Rules!
  50. Use STONITH by supton · · Score: 1

    Stands for "Shoot The Other Node in the Head." Means: a device that kills power on the peer node as a means of data fencing (this is your guarantee in a Linux-HA cluster).

    A WTI RPS-10M is an ideal unit for this: it is a power switch controllable with a serial port. You can even chain up to 10 of these together with phone lines and control them with one serial port.

    There is lots of info on STONITH on the Linux-HA site.
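
    The ordering that makes fencing a guarantee can be sketched like this: the survivor mounts the shared disk only after the power switch confirms the peer is off, and refuses to take over if the switch does not confirm. Everything here is hypothetical illustration: `PowerSwitch`, `take_over`, `FakeSwitch` and the outlet numbering are made-up names, not the RPS-10M's protocol or the Linux-HA API.

```python
from typing import Callable, List, Protocol


class PowerSwitch(Protocol):
    """Hypothetical interface; a real driver would talk to the switch over serial."""

    def power_off(self, outlet: int) -> bool:
        """Return True only if the outlet is confirmed off."""
        ...


def take_over(switch: PowerSwitch, peer_outlet: int,
              mount_shared_disk: Callable[[], None]) -> bool:
    """Fence the peer first; touch the shared disk only if fencing succeeded."""
    if not switch.power_off(peer_outlet):
        # Peer state unknown: mounting now could mean two writers on one disk.
        return False
    mount_shared_disk()
    return True


class FakeSwitch:
    """Stand-in switch for trying out the logic without hardware."""

    def __init__(self, works: bool) -> None:
        self.works = works
        self.calls: List[int] = []

    def power_off(self, outlet: int) -> bool:
        self.calls.append(outlet)
        return self.works


if __name__ == "__main__":
    mounted: List[str] = []
    print(take_over(FakeSwitch(True), 1, lambda: mounted.append("disk")), mounted)
    print(take_over(FakeSwitch(False), 1, lambda: mounted.append("disk")), mounted)
```

    The design choice worth noting is the failure path: when the fence cannot be confirmed, the sketch does nothing rather than risk a second mount.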

  51. IEEE1394/FireWire by SiMac · · Score: 1

    IIRC, FireWire should be able to do this. At one MacWorld, I seem to recall Steve Jobs plugging a camcorder into two machines and simultaneously downloading video to both...Then again, it's possible that this sort of thing wouldn't work with hard disks.

  52. Done it, but wouldn't recommend it by MeerCat · · Score: 2

    We did this with what was then a huge SCSI disk storage (1.2 Gb shared between PCs in 1987 - just before the first 386 PC's came out) and SCSI supported it then. But SCSI won't protect you from conflicting update problems, so unless your OS disk sub-system understands what's going on, you'll have to use some discipline to make sure only one host is writing to the disk at once. You say the other machine is just for failover, so I'd suggest you tell the "failover" machine to mount the drive read-only, and then unmount and re-mount it RW only when you're actually failing-over.

    Or, if you're after a cheap solution for failover (and it sounds like you'll be doing a manual failover) I'd just use external devices plugged into a SCSI card, and if you need to failover, manually unplug the disk from one machine and attach it to the other and boot it up. Not quite "hot standby", but quite warm...
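
    The read-only standby idea from the first paragraph could be sketched as the commands below. This only builds and prints the command lines, it does not run them. The device path and mount point are placeholders, and the sketch assumes the filesystem tolerates a second read-only mount at all: a journaling filesystem may try to replay its journal even when mounted read-only, and the standby can see stale cached data.

```python
import shlex
from typing import List

# Placeholder names for illustration; substitute the real device and mount point.
DEVICE = "/dev/sdb1"
MOUNTPOINT = "/mnt/shared"


def standby_mount_cmd(device: str, mountpoint: str) -> List[str]:
    """Command the failover box runs at boot: mount the shared disk read-only."""
    return ["mount", "-o", "ro", device, mountpoint]


def failover_cmds(device: str, mountpoint: str) -> List[List[str]]:
    """Commands run only when actually failing over: unmount, then mount read-write."""
    return [
        ["umount", mountpoint],
        ["mount", "-o", "rw", device, mountpoint],
    ]


if __name__ == "__main__":
    print(shlex.join(standby_mount_cmd(DEVICE, MOUNTPOINT)))
    for cmd in failover_cmds(DEVICE, MOUNTPOINT):
        print(shlex.join(cmd))
```

    Unmounting and mounting again, rather than remounting in place, matches the sequence described above and also discards whatever the standby had cached while the other host was writing.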

    --
    I spent a lot of money on booze, birds and fast cars. The rest I just squandered. - George Best
  53. Why not RAID? by Anonymous Coward · · Score: 0

    If you're only looking to have a mirror, why not just use RAID? RAID in the 2.4 kernels is pretty good, I have used it pretty extensively. Plus, you can have extra spare-disks in case. If you're worried about the box, though, you've got bigger problems...

  54. Best. Acronym. Ever. by Anonymous Coward · · Score: 0

    Thank you for that, sir. You've made my evening.

  55. Re:You need a pigtail - or two SCSI cables by extropalopakettle · · Score: 1

    Haven't done it under Linux, but NT, using two self-terminating SCSI cables.

  56. Re:Bad Idea... Need good software. by Havokmon · · Score: 2
    Actually, Netware 6 provides this out-of-the-box.

    Basically the servers monitor each other, and if the server that has mounted the drive goes down, the 2nd server picks it back up. (Oh, you only have to buy one server. Mirroring licenses are built into the product)

    We streamed a video off the disk, then downed the server, and after a couple seconds the video picked right up where it had paused... Very cool.

    Of course, that's actually while working on a third workstation....

    I know this isn't helpful to the topic (Linux solution needed), but many people don't know it's possible.

    --
    "I can't give you a brain, so I'll give you a diploma" - The Great Oz (blatently stolen sig)
  57. Sure it will by Havokmon · · Score: 2
    The hardware they use to make it work will probably support what you're trying to do. Your typical off the shelf (At Frys) SCSI controller won't do the trick.

    Dammit! People need to stop ignoring Novell.

    Building a Poor Man's SCSI-Based Cluster Hardware System

    There's much more information buried on their site. Of course it applies to NetWare, but just because you don't have a Linux answer doesn't mean one doesn't exist at all.

    --
    "I can't give you a brain, so I'll give you a diploma" - The Great Oz (blatently stolen sig)
  58. highly dangerous without clustering software by JohnZed · · Score: 2

    Get yourself a copy of Red Hat Advanced Server 2.1 or Kimberlite from Mission Critical Linux. At least read the clustering whitepapers on these two sites!

    Imagine the following scenario--
    - the node "owning" the disk hangs
    - the backup node takes over the connection and starts working

  59. Damn! didn't mean to hit submit by JohnZed · · Score: 2

    To finish the comment, imagine the following scenario:
    - node A, which owns the drive, hangs
    - node B takes over and starts writing
    - node A recovers from its hang, thinks it still owns the disk, and goes back to writing happily
    - you experience massive data corruption and get fired
    - the economy sucks so you can't find a new job with "corrupted critical files because of my cheap-ass attempt to share a SCSI drive" on your resume
    - your significant other leaves you, because he/she doesn't want to date an unemployed loser
    - you spend the rest of your life alone and friendless, wishing you had heeded my sage advice

    Clustering packages typically include features to STONITH (Shoot The Other Node In The Head) to prevent problems like this one. Red Hat Advanced Server and Kimberlite (on which RHAS clustering is based) include a number of other nice manageability features as well.
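
    The hang-and-recover scenario above can be played out in a toy simulation. This is a sketch under simple assumptions, not how any clustering package is implemented: `Node`, `writers` and `run_scenario` are made-up names, and "fencing" is modelled as nothing more than cutting node A's power before node B takes over.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    powered: bool = True
    hung: bool = False
    believes_owner: bool = False

    def can_write(self) -> bool:
        # A node writes only if it is running and thinks the disk is its own.
        return self.powered and not self.hung and self.believes_owner


def writers(nodes: List[Node]) -> List[str]:
    """Names of all nodes currently able to write to the shared disk."""
    return [n.name for n in nodes if n.can_write()]


def run_scenario(fence: bool) -> List[List[str]]:
    """Return the set of writers after takeover and after A's recovery."""
    a = Node("A", believes_owner=True)
    b = Node("B")
    timeline = []
    a.hung = True                      # node A, which owns the drive, hangs
    if fence:
        a.powered = False              # STONITH: cut A's power before takeover
        a.believes_owner = False       # a powered-off node loses its mount state
    b.believes_owner = True            # node B takes over and starts writing
    timeline.append(writers([a, b]))
    a.hung = False                     # A's hang clears (no effect if powered off)
    timeline.append(writers([a, b]))
    return timeline


if __name__ == "__main__":
    print("no fencing:", run_scenario(fence=False))
    print("fencing:   ", run_scenario(fence=True))
```

    Without fencing the final step has two writers on one disk, which is the corruption case; with fencing there is only ever one.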

    Good luck!
    --JRZ

  60. Re:IEEE 1394a (Multiple-host File Systems) by Hungus · · Score: 1

    And according to the original question the second machine is there for redundancy, so it should never use the drive if the first is functional.

    --
    Bad Panda! No Bamboo for you! In matters of importance ACs will not be responded to. Want to say something critical,OK
  61. We did this by garyebickford · · Score: 1

    Very long ago (1985), my team at Audre, Inc. (now Extr@ct) built a SCSI-based system with the capabilities you desire. At the time, SCSI was new, and XENIX, a Unix-like system, was newish.

    We had 4 Intel 286 multibus boxes, each containing 6 graphics display controllers, connected to two disk servers, each of which had (IIRC) four SCSI drives. The data on both disk systems was identical. When all went well, any of the graphics systems requested data and either disk system would respond, providing faster response. If either one failed, the other one did all the work.

    We wrote our own SCSI drivers, and had the data (mapping vector and image data) striped physically on the disks to optimize sequential fetching when a user 'panned' across the map space.

    Unfortunately I can't provide much technical detail, in part because it's been a long time. It's doubtful if it would be useful anyway as SCSI has grown and changed a bit since then! However I believe it now allows multiple masters on the bus, which is the key to doing this stuff. I think we faked it somehow, I don't recall how. A big question is whether Linux drivers have this kind of capability.

    The primitiveness of Xenix provided an interesting advantage - we had a Unix development and operating environment, but had a very primitive timesharing method. This allowed us to 'take control' of the machine for significant amounts of time giving us a kind of pseudo-realtime capability for talking to the disks. This is similar in concept to some of the present-day Linux Realtime distributions(?)

    For those who are curious, this was all done for a 911 Emergency Response mapping system at Fairfax County, Virginia, delivered in 1985 or 1986. Given a street address, we could figure out the location and present a 1Kx1K 8-bit color display of the incident area, based on maps our system had scanned and vectorized, with several additional layers (the 8 bits were used as bit planes) for additional data such as police locations. All of this was guaranteed within 7 seconds with all 24 terminals in use.

    In practice response was typically 1.5 seconds. Not bad for a bunch of 286's with 2MB RAM - and umpty $K worth of display hardware - I think the boards were about $4000 each.

    Striping the data, which was organized in 256 or 512 pixel (I forget which) square patches, along with some fancy paging of these tiles into the 2Kx2K frame buffers, allowed the user to 'pan' vertically and horizontally across the entire 1400 sq. miles of the county, seamlessly.

    Of course, once we had built and delivered the system, I was unable to convince the chairman of the company to attempt to sell this to anyone else and $700,000 worth of development time was essentially tossed and an entire market ignored. I left the company a few months later.

    --
    It's easier to be a result of the past, but more fun to be a cause of the future! http://www.spacefinancegroup.com/
  62. Re:IEEE 1394a (Multiple-host File Systems) by groovemaneuver · · Score: 1

    From what I've understood from the other posts on this topic, the biggest problem with the scenario you mention seems to be when a primary machine hangs but mysteriously recovers.

    When the primary hangs, the secondary takes over (figuring that the primary is down) and mounts the drive. But when the primary recovers from the hang, it still has the drive mounted, so both systems have the drive mounted.

    I believe they call that a "Bad Thing" (or massive FS corruption -- your choice).

    There seem to be a bunch of ways around this scenario -- GFS/openGFS, STONITH switches, etc. -- but this is why it's not such a good idea to just let the secondary take over without being totally sure that the primary is DEAD.

  63. Note to cliff: by tomhudson · · Score: 2
    Why are many of the recent "Ask Slashdot" posts not just ignorant, but downright stupid?

    This one is an extreme case in point.

    This is NOT off-topic, nor is it flame bait. Too many "ask slashdot" topics are themselves redundant.

  64. Re:Ahork by tomhudson · · Score: 2
    "hork": slang - to upchuck, throw up, barf, retch, puke.

    Also - horking ugly - ugly enough to make you want to puke.

    Been used in Canada for over 40 years.