Server Redundancy for a Small Business?
SadPenguin asks: "I am currently working for a small company of about 15 people, each with one to two workstation/laptop machines apiece. We are looking for a new server solution, as our last one crashed, and lacking any server redundancy, we nearly lost all of our data since our last backup (it was only a few days, but an important few). What kind of server (and redundancy) solution would be appropriate for a company of my size? Most advertisements are for large-scale enterprise serving solutions, but these are costly and excessive for my situation. I'm sure that there is a simple redundant server technology out there that is a bit less costly, but won't result in any downtime in the event of a motherboard component failing (like we faced this time when our mysterious surface-soldered VRM failed). So what do you use? What should I use?"
I actually run a computer consulting firm specializing in small businesses. I'll outline some of the more common recommendations - with what I think is the most important first.
From my experience, the best approach is to layer your defenses. I'd REALLY recommend a UPS at the very least (I generally assume this is purchased with a server, but it isn't always). Your local power company is only required to provide you with something CLOSE to 120 V. They generally can't keep it consistent enough for power supplies (and electronic componentry in general). Protect your investment; UPSes are generally relatively cheap.
The fact that you've got a backup solution is good, but (as you've seen) not enough. Evaluate it, and see if it's consistent with best practices - i.e., is it a tape (or optical) backup system that is done in rotation and taken offsite by somebody in the company? If not, set that in motion first.
Next, some sort of drive redundancy is in order. At the very least, mirror your drives. I generally recommend RAID5 (or one of its variants), but in very small companies RAID5 is often not required, not affordable, or both. IMO, the jury's still out on the long-term viability of IDE RAID, but I think it looks promising.
Finally, redundant power supplies and NICs (for those of us that are REALLY paranoid ;) ). I've had a couple of servers' power supplies die on me, but the server kept right on ticking thanks to a redundant unit.
If your company can afford it, consider hot-swappable server components as well. This significantly reduces downtime for your coworkers... and expense for your company.
Hope this helps. Good luck!
>> we nearly lost all of our data since our last backup (it was only a few days, but an important few)
Daily backups!
General recommendations:
quality server (Dell/HP/etc)
NO IDE drives!
SCSI in software RAID5
minimum software install (e.g. no compilers)
get a second 'devel' server to test/compile software before using it on the production server
If it is not broken, don't fix it. As in: screw with the devel server, not the production one.
Christopher McCrory "The guy that keeps the servers running" chrismcc@gmail.com http://www.pricegrabber.com
I work for a small company that only has three full-time employees (including me). I use two Debian boxes (cheap-o machines that are just retired desktops with some big cheap IDE hard drives in them) running Samba. I use the rsync mirroring technique I found here.
One box is the "live" server and the other mirrors the live server every night. If the main server dies (which happened once - power supply failure), I can "promote" the backup server by changing one line in its Samba configuration. As a bonus, the backup server keeps "snapshots" back a week or two.
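The link to the exact rsync technique did not survive, so the following is only a guess at the general shape: a minimal sketch of a nightly mirror that keeps dated snapshots, assuming rsync's `--link-dest` option is used so that unchanged files are hard-linked to the previous night's copy. The host name, paths, and the function name `build_rsync_command` are hypothetical.

```python
from datetime import date
from typing import List, Optional


def build_rsync_command(source: str, snapshot_root: str, today: date,
                        previous: Optional[date] = None) -> List[str]:
    """Return the rsync argument list for one nightly snapshot.

    source        -- what to mirror, e.g. "live:/srv/share/" (hypothetical)
    snapshot_root -- directory on the backup box holding dated snapshots
    previous      -- date of the last snapshot, if there is one
    """
    root = snapshot_root.rstrip("/")
    cmd = ["rsync", "-a", "--delete"]
    if previous is not None:
        # Files unchanged since the previous snapshot are hard-linked
        # rather than copied, so each night only costs the changed data.
        cmd.append(f"--link-dest={root}/{previous.isoformat()}")
    cmd += [source, f"{root}/{today.isoformat()}/"]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it; a cron job would pass the
    # list to subprocess.run(cmd, check=True) on the backup server.
    print(" ".join(build_rsync_command(
        "live:/srv/share/", "/backup/snapshots",
        date(2004, 7, 2), date(2004, 7, 1))))
```

Keeping "a week or two" of history then just means deleting snapshot directories older than that; the hard links keep the remaining snapshots intact.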
I do three types of redundancy/backup at my sites:
* Mirrored RAID in all servers
* A regular workstation with a good, large hard drive that copies the server data to itself nightly
* A DVD-RW backup made nightly on yet another workstation, with at least one off site - 5 discs, one each weeknight, replaced a few times a year.
In most cases the server RAID (cheap ATA Promise controllers) takes care of 90% of the problems: only one HD goes bad at a time, lightning strikes rarely take out the hard drives at all, never mind both hard drives, etc. Even if the server dies, it's unlikely that the problem affected the HD backup on the other workstation, and it definitely didn't affect the DVD-RW.
However, whenever you get a catastrophic failure in any component in the server, replace the entire thing. If the MB or power supply fails, copy the data to new hard drives, and use the old ones in less critical applications, etc.
This is much cheaper than an 'enterprise' solution, and it should be, because your application doesn't require such a solution. Use large tape drives in place of the DVD-RW if you must back up a huge amount of data on a nightly basis.
This sort of solution is very tolerant of cheap hardware, so replacing the server later may not be such a major cost.
-Adam
This is a hard problem (NP-hard perhaps, I'm not sure), and you need to have:
A list of the applications you want to protect
A budgeted amount
A list of the threats you are trying to protect against
An idea of what kind of failures you will tolerate (do you need 99.9% uptime? Better? Worse?)
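To make the uptime question concrete, here is a rough sketch that turns an uptime percentage into a yearly downtime budget. It assumes a 365-day year and that the server is expected to be up around the clock; the function name is made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(uptime_percent: float) -> float:
    """Hours per year a service may be down and still meet the target."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(target)
        print(f"{target}% uptime allows about {hours:.2f} hours down per year")
```

At 99.9% that is a bit under nine hours a year, so a single dead motherboard with no spare on hand can blow the whole budget.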
You could, for simple applications like web service, bring up a pair of Linux machines, gimmick some replication between the two, and hope nothing goes wrong. That works if you have a very low budget, but you'd probably spend a fair amount of work later on debugging "synchronisation problems". Redundant storage is harder. The OpenSSI project is working on highly available single-system-image clusters for Linux, in an open source model; they might be your first place to look. It's not, however, something for the unprepared to do, nor is it something that I'd recommend if you do other tasks for this company. Maintaining such a beast will require a significant implementation investment. The good news is that once everything works to your satisfaction, you can probably take a 4 week vacation somewhere with golden beaches and much sun, and let it take care of itself. I can't stress this enough: this is a hard problem. If you really want to do this right, you'll want to surround yourself with qualified people with experience in this field. It's non-trivial, and mistakes can lead to severe data loss.
At my place of work (18 people) I have set up a spare low-end machine (P233) with an 80 GB drive as a backup file server. During the day, every 15 minutes, everything that has changed is copied to the backup server. The backup file server is configured as read-only so a user cannot accidentally change anything.
If the main file server goes down, I simply change the configuration to read/write and change the file mapping on the users' machines, and they continue to work. The whole process takes about 10 minutes to reconfigure the server and a couple of minutes per user machine.
As a bonus, I don't delete the intermediate versions of changed files as I update the server. Instead I compress them under unique filenames, so I can recover a fairly complete history of any given file. I have yet to fill up the 80 GB drive, so I haven't needed to delete any backups. When the backup drive is full I will start deleting some of the older versions; I should have room for about 6 to 9 months of backups at 15-minute intervals.
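The post doesn't say what tool does the copying, so this is just one way the idea could be sketched: copy a file to the mirror only if it changed, and gzip the outgoing version under a timestamped name first. The directory layout and the name `backup_file` are assumptions for illustration.

```python
import gzip
import shutil
import time
from pathlib import Path


def backup_file(src: Path, live_root: Path, mirror_root: Path,
                history_root: Path) -> bool:
    """Mirror one file, archiving the version it replaces.

    Returns True if a copy was made, False if the mirror was up to date.
    """
    rel = src.relative_to(live_root)
    dest = mirror_root / rel
    if dest.exists():
        s, d = src.stat(), dest.stat()
        # Cheap change check: same size and not newer means unchanged.
        if s.st_size == d.st_size and int(s.st_mtime) <= int(d.st_mtime):
            return False
        # Name the old version after its own modification time, which is
        # unique as long as a file is not archived twice in one second.
        stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(d.st_mtime))
        archived = history_root / rel.parent / f"{rel.name}.{stamp}.gz"
        archived.parent.mkdir(parents=True, exist_ok=True)
        with open(dest, "rb") as old, gzip.open(archived, "wb") as out:
            shutil.copyfileobj(old, out)
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dest)  # copy2 keeps the source's modification time
    return True
```

A job run every 15 minutes would walk the live share (for example with `live_root.rglob("*")`) and call this for each regular file. Pruning later is a matter of deleting the oldest `.gz` files from the history tree.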
I've been a system admin for a production webserver for a few years now, and I can tell you this.
99.9% of the time when I've had to retrieve data from backup, it was because of human error, i.e. someone deleted something they shouldn't have, or they moved the wrong directory to the wrong place, or an error was made during a software upgrade, etc.
The rest is due to random hardware failure, which would be a reason for using RAID. But pouring thousands into redundant servers and disks is overkill for a biz your size.
If someone accidentally wipes out a folder or data, your RAID disks won't be any help.
Love,
Zaq
We just finished building a 2.5 TB (terabyte) server for less than $5000. You could probably spend even less than that, since we spent about $1000 on two fiber-optic cards. We have two 6-channel 3ware RAID cards and twelve 250 GB ATA/133 Maxtor drives hooked up to a 520 watt power supply, plus another 520 watt power supply acting as redundant power (we did that mod in-house). 2.5 TB is probably more than you guys will need unless you are doing some advertising or something like that, so you could probably go for 1 TB, which will cut your costs down even more. All in all you could probably get it done for about $3000, not too shabby for 16 people. Our server backs up my whole college.
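The post doesn't name the RAID level, but the numbers are consistent with one RAID 5 array per 6-channel card, assuming the twelve drives are 250 GB each: every array gives up one drive's worth of space to parity. A small sketch of that arithmetic, with hypothetical function names:

```python
def raid5_usable_gb(drives: int, drive_gb: int) -> int:
    """Usable capacity of one RAID 5 array: one drive's worth is parity."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drives - 1) * drive_gb


def raid1_usable_gb(drive_gb: int) -> int:
    """Usable capacity of a two-drive mirror: one drive's worth."""
    return drive_gb


if __name__ == "__main__":
    # Assumed layout: two arrays of six 250 GB drives, one per card.
    total = 2 * raid5_usable_gb(6, 250)
    print(f"Two 6-drive RAID 5 arrays: {total} GB usable of {12 * 250} GB raw")
```

Under that assumption, 3000 GB of raw disk yields 2500 GB usable, which matches the 2.5 TB quoted. A single 12-drive array would give 2750 GB, but a 6-channel card can only see six of the drives.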
I too have long experience doing small business consulting and in some other areas. One thing you could do is use RAID-1 with a spare drive. That way if you lose one, you aren't screwed. You could also have a couple of spare drives in hot-swap carriers. Pull a drive every night and you have a duplicate of your server. If your server fails, fire up the duplicate server, pop in your known good pull, and boot.
Depending on the OS, you don't even have to have exactly the same hardware if you use a more generic kernel build, and you can list a different NIC for the spare server in the conf file for modules, assuming you aren't compiling them into the kernel.
Continue with good backups made to another machine, to tape/CD/hard drive, or off-site. This way, even if your good pulled drive is a little out of date, you can bring the data current in short order.
You don't mention the OS of the server or the budget, and the OS affects cost. Still, I'll assume that since you've got 2 machines per desk times 15, you can afford a spare server. If you are doubling up on hardware on desktops, you can afford to do this or most any of the other solutions offered.
Of course, you get what you pay for and if the experience is lacking in house, hire a knowledgeable consultant or company you trust to do it for you.
a relatively cheap setup for data/service redundancy for a small business.
* two identical servers, running linux (of course).
* heartbeat
* drbd
* two UPS
Notes, Ins, Outs and What Have You's
service redundancy
heartbeat is used to make 2 servers look as if they were one. if one of the servers dies, heartbeat makes sure the other assumes the ip address and has all the relevant services started.
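The failover decision itself is simple to picture. What follows is a toy model only, not the heartbeat package's code or configuration: the standby node records each heartbeat from its peer and promotes itself once the peer has been silent longer than a timeout. The class and parameter names are invented for illustration.

```python
from typing import Optional


class StandbyNode:
    """Toy model of the standby side of a two-node failover pair."""

    def __init__(self, dead_after_seconds: float) -> None:
        self.dead_after_seconds = dead_after_seconds
        self.last_heartbeat: Optional[float] = None
        self.is_primary = False

    def record_heartbeat(self, now: float) -> None:
        """Call whenever a heartbeat arrives from the peer."""
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        """Promote this node if the peer has been silent for too long.

        Returns True only on the call that performs the promotion.
        """
        if self.is_primary or self.last_heartbeat is None:
            # Already promoted, or the peer was never seen: do nothing
            # rather than grab the address on a cold start.
            return False
        if now - self.last_heartbeat > self.dead_after_seconds:
            self.is_primary = True
            return True
        return False


if __name__ == "__main__":
    node = StandbyNode(dead_after_seconds=10)
    node.record_heartbeat(now=0)
    print(node.check(now=5))    # peer heard recently
    print(node.check(now=11))   # peer silent past the timeout
```

In a real pair, the promotion step is where the standby would claim the shared IP address and start the services. A silent peer is not always a dead peer, which is why a dedicated link between the boxes matters.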
data redundancy
drbd is a network block device. again, it looks like one device, but when data is written to it, it's actually being written to 2 separate locations. if one box goes down, heartbeat makes sure drbd makes the other box primary.
hardware
these two call for a dedicated network and serial connection. so 2 nics and a serial port per box.
definitely raid array of some sort.
see drbd.org for more details.
this is not a 100% foolproof setup, but it's cheap and covers most of the bases.
of course, it requires a linux dude to get it all to work.
You may benefit from a combination of heartbeat and DRBD, which respectively provide IP address/service failover and a network-based (no special hardware required) data replication solution.
If you have appropriate hardware you might also appreciate STONITH, which provides forced shutdown of a failed node (in case the failed node won't release the IP address, which would otherwise cause problems switching service).
If you're in the UK then give me a shout and I'll set it up for you (for a reasonable fee)! My contact details are available on my web site.
Yours Sincerely, Michael.
Get people into the habit of running a CVS or Subversion client on "their documents" folders. Tortoise integrates right into Windows Explorer. Advantages: file versioning, ability to work offline and still sync with the server later, etc.
If people actually work with plain text docs, they will love how CVS etc. will merge multiple users' changes.
Of course you would back up your CVS server but in case of a crash, chances are that very important file can be found on the desktop of the user who edited it the last time. Much better than relying on a network drive and then it is just not there.
I need to point out that your selection criteria should include multiple FireWire ports, and FireWire controllers at both the drive end and the server end. This should add only marginal cost to your setup.
I have a Maxtor FW single-HDD backup solution, but I definitely would not recommend that particular one for an always-on situation (for lack of ventilation). It seems that when the drive does its temperature calibration, the FW interface hiccups and the ongoing transfer gets interrupted. All is well after disconnection and reconnection. I only notice it when I am running unison on very large directories (30+ GB), but if you were to serve files off of it, you might get into trouble as well.
The other issue with FW is bandwidth. I am getting about 40-50 MBps, which is enough for sustained transfers, but the drive would be capable of 100 MBps in bursts (short files cached on the drive). This might be detrimental to file server performance.
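For a sense of scale, here is a back-of-the-envelope sketch of how long a large sync takes at those rates. It assumes "MBps" means megabytes per second, decimal units (1 GB = 1000 MB), and a transfer that sustains the quoted rate throughout; the function name is made up.

```python
def transfer_minutes(size_gb: float, rate_mb_per_s: float) -> float:
    """Minutes to move size_gb at a sustained rate, in decimal units."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_gb * 1000 / rate_mb_per_s / 60


if __name__ == "__main__":
    for rate in (40, 50, 100):
        minutes = transfer_minutes(30, rate)
        print(f"30 GB at {rate} MB/s: about {minutes:.1f} minutes")
```

So a full pass over a 30 GB directory is a ten-minute-plus job at the sustained rate. That is fine for a nightly backup, but it leaves a long window for a hiccup to interrupt the transfer.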
I do have to say that I quite like the idea of having 4-6 external FW (or better, FW800) disks hooked up and running a virtual RAID5. This way a failed disk could be hot-swapped very easily, and it might even be much cheaper than buying a backplane/chassis/server with hot-swap support.
It's already been mentioned a little, but a second server kept up to date with rsync may be a cheap way to go depending on how big your server is. While I don't know how much data you are talking about, I would expect rsync could sync a few times a day easily via a cron job.
I would suggest springing for an extra $90 to get two extra gigabit Ethernet cards and a crossover cable, giving rsync a dedicated connection that doesn't compete with office traffic.
Using rsync as a basis, the solution could be made as low-tech and simple, or as automated and complex, as you feel is needed.
-Pete
If you are using Windows 2000/2003, an easy redundant file serving solution is to set up DFS (Distributed File System). Just a tip: don't set up a domain-wide share for a file server that gets a lot of updates. Using DFS like that can create an administrative nightmare (a last-writer-wins situation), and you would be restoring files from tape a lot. You would want to use a domain-wide share if you have a lot of read-only files (like installation files, PDF image archives, etc.) and you need a high-availability solution. Anyhoo, if your first server crashes, temporarily redirect your users to the second server either via DNS or by just renaming the servers. DFS doesn't replicate printers, so you would have to install a new printer two times, once on the first server and a second time on your second server. That shouldn't be too much of a problem if you only have 15 users.
If you are using Linux/UNIX/*BSD, you could use rsync. There was a great article explaining rsync usage in the June '04 print edition of SysAdmin.
While this _should_ be a great business opportunity, I think you'd find that small businesses pose some interesting challenges:
* Small business owners are CHEAP. They don't want to spend a nickel on something that isn't an immediate problem.
* They don't see the value in disaster recovery until they experience the disaster.
* They are hard to sell and market to.
* They often use horrible niche-market server based solutions that are Windows only.
I spent a few weeks talking to various business owners about a solution that would offer the following:
* Redundancy, in many of the same ways discussed here,
* Security: firewall, antivirus, antispam
* Offsite backup and admin
* Four hour replacement
* Other stuff, potentially, like ad blocking, web whitelists/blacklists, fax server, email server, etc.
The price to do this for a small business would have to be at least $250/month. They won't spend it on something that they see as intangible. This is the reaction, even considering that at least $200 a month is spent by them in man-hours to have someone, often the owner, wrestling with the cheapo Windows server that they're using. Keep in mind that the $250 would include DSL connectivity AND the hardware for the box.
Jonathan