Ask Slashdot: Low Cost Way To Maximize SQL Server Uptime?
jdray writes "My wife and I own a mid-sized restaurant with a couple of Point of Sale (POS) terminals. The software, which runs on Windows and .NET, uses SQL Server on the back end. With an upgrade to the next major release of the software imminent, I'm considering upgrading the infrastructure it runs on to better ensure uptime (we're open seven days a week). We can't afford several thousand dollars' worth of server infrastructure (two cluster nodes and some shared storage, or some such), so I thought I'd ask Slashdot for some suggestions on enabling maximum uptime. I considered a single server node running VMWare with a limp-mode failover to a VMWare instance on a desktop, but I'm not sure how to set up a monitoring infrastructure to automate that, and manual failover isn't much of an option with non-tech staff. What suggestions do you have?"
...don't upgrade.
Why don't you have good uptime to begin with? I've SQL Server 2005 on a single unimpressive physical server with months of uptime. Is your restaurant open 24 hours? Is your current server flaking out? Concerns about uptime itself might be misplaced. What isn't made clear in the OP is why you think you need better uptime.
Seriously, put the SQL server up in a cloud service and let them worry about it. If it's a Microsoft SQL server, then Azure is the place to be. Hell, put a full instance of your service up in the cloud.
You may want to look at what they are doing with availability groups. You can avoid the shared storage with availability groups and could cut your hardware costs a bit, assuming your software supports SQL 2012. Link http://msdn.microsoft.com/en-us/library/ff877884.aspx
You are probably looking at failover clustering with Node and File share majority here (http://technet.microsoft.com/en-us/library/bb676490(EXCHG.80).aspx).
If your SMB file share is not reliable, you should be using transaction log shipping; it's the cheap man's failover (http://msdn.microsoft.com/en-us/library/ms191233.aspx).
Well I suggest immediately switching to Linux and MySQL - duh!
You are missing some critical details about this. How many transactions with this database are going per minute/hour/day? If this is a fairly basic SQL instance, I don't see the point of your failover scenario. Simply create some jobs to run backups every few hours or every half-hour (storing transaction logs and such) and roll that over to the "desktop" on a share or something. Obviously money is the issue for you, so don't make it so complicated you can't afford someone else to come in and fix things if you get stuck during production.
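For what it's worth, here is a rough Python sketch of the kind of job described above. It only builds the T-SQL text and copies a finished file to a share; actually running the statement (via sqlcmd, an Agent job, whatever you have) is left out. The database name RestaurantPOS, the D:\Backups folder and both function names are made up for illustration, and a log backup only works if the database uses the full recovery model.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_statement(database: str, backup_dir: str, when: datetime,
                     log_only: bool = False) -> tuple[str, str]:
    """Return (T-SQL statement, backup file path) for a full or log backup."""
    kind = "LOG" if log_only else "DATABASE"
    suffix = "trn" if log_only else "bak"
    # Windows-style path, because the SQL Server machine writes the file.
    path = f"{backup_dir}\\{database}_{when:%Y%m%d_%H%M%S}.{suffix}"
    # CHECKSUM asks SQL Server to verify pages while writing the backup.
    sql = f"BACKUP {kind} [{database}] TO DISK = N'{path}' WITH INIT, CHECKSUM;"
    return sql, path


def copy_to_standby(backup_file: Path, standby_dir: Path) -> Path:
    """Copy a finished backup file to the standby machine's share."""
    standby_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(backup_file, standby_dir / backup_file.name))


# Hypothetical database name and folder, for illustration only.
sql, path = backup_statement("RestaurantPOS", r"D:\Backups",
                             datetime(2012, 6, 1, 14, 30))
print(sql)
```

Schedule something like this with Task Scheduler or a SQL Agent job, and restore one of the copies on the desktop now and then to prove the files are good.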
have you considered alternatives like:
- Amazon S3 (virtual)
- Azure Cloud (virtual or sql-node up to 50 GB)
- Google App..something
They all claim to have 99.99% uptime or more.
If you wish to minimise infra cost + maintenance then EC2 + RDS might be a good option for you. If you have an existing MSFT solution then migration to the Azure platform is another option. I would prefer to go with EC2 but this is personal preference.
"I thought I'd ask Slashdot for some suggestions on enabling maximum uptime" - the answer: YOU don't enable maximum uptime, you get someone to do do it for you.
It seems to me like you are ill-equipped to handle server hosting. If you're asking these kinds of questions it may honestly be wiser to just let a third party handle this task for you, so you can handle your main job: serving customers. How about you host your SQL database at one of the many hosting providers willing to tackle this for you? I bet a simple Google query would result in thousands of hits. But as seems to be the trend with Ask Slashdot lately: you probably didn't want to hear this answer.
http://bit.ly/LaHOeu (I know how Slashdot frowns upon URL shorteners, but I assure you, the person asking the question definitely needs to learn that Slashdot isn't a damn search engine).
"a mid-sized restaurant ... We can't afford several thousand dollars ... enabling maximum uptime ... with non-tech staff."
You don't need HA. You need a pencil and paper.
Just checking as the offline modes of some allow for continuity of business if the server goes offline making it not as big an issue requiring failover, clustering, etc. Micros, Aloha, Focus?
How much money do you lose if and when it does go down? How many times does it take for $5-10k (or whatever for a couple servers and storage, power backup, etc) to pay for itself? How much do you pay the guy who keeps it up? Or do you run back and fix it when it goes down?
If it takes years to pay itself off, then it doesn't really matter ... your servers don't go down enough, your time isn't worth enough, or the technology isn't benefitting you enough in the first place. If it only takes a few times, why hedge? And if you're working on razor-thin margins, maybe you should consider why you're spending even as much time and money on this solution in the first place... especially if it's costing you money when it goes down (presumably frequently, or why bother asking).
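To put numbers on the payback question, a back-of-the-envelope sketch in Python. Only the $5-10k range comes from the post above; the $7,500 price, the $1,500 lost per outage and the outage rate are invented placeholders you should replace with your own figures.

```python
import math


def outages_to_break_even(upgrade_cost: float, loss_per_outage: float) -> int:
    """Avoided outages needed before the upgrade has paid for itself."""
    return math.ceil(upgrade_cost / loss_per_outage)


def years_to_break_even(upgrade_cost: float, loss_per_outage: float,
                        outages_per_year: float) -> float:
    """Years until avoided losses equal the upgrade cost."""
    if outages_per_year <= 0:
        return math.inf  # never down means the upgrade never pays off
    return upgrade_cost / (loss_per_outage * outages_per_year)


# Hypothetical inputs: replace with your own numbers.
upgrade_cost = 7_500.0      # somewhere in the $5-10k range
loss_per_outage = 1_500.0   # lost sales plus cleanup per outage
outages_per_year = 0.5      # one outage every two years

print(outages_to_break_even(upgrade_cost, loss_per_outage))
print(years_to_break_even(upgrade_cost, loss_per_outage, outages_per_year))
```

This ignores ongoing costs (licenses, power, whoever maintains it), so the real payback time is longer than what it prints.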
When people talk about redundancy they usually aim for the highest uptime they can think of without considering if they really need that.
If you're a mid-sized restaurant and can't afford the big boys' solution to redundancy, just make sure you have disk mirroring and backups ready. Keep a spare machine installed with the same version of Windows, .NET and SQL Server and document what needs to be done to move data from the dead machine's disk onto the new one. Is it that much data? Do you need it all to keep the restaurant running? Can you bootstrap with enough data only and leave the sync'ing for when you're closed?
It's business requirements first, technology second.
Could we please stop ceding generic terms like "SQL Server" to Microsoft? Oracle produces an SQL server, as does IBM (DB2), as do several other companies and open source organizations. Why does "Microsoft SQL Server" get to be "SQL Server"? Isn't it bad enough that we've already given Microsoft the "Windows" name (how old is X Windows?) and "PC" has morphed from meaning "personal computer" to something that runs Microsoft Windows?
Get a decent server, maybe an HP. Dual CPUs, Dual HDDs, Dual Power supplies. Get a UPS.
Install Windows, SQL, and UPS controlling software. Install AV, but be certain to exclude AV scanning the SQL directory and SQL DBs and logs. You don't want AV killing your SQL server by accident. You might want to consider putting a firewall on the box and blocking all non-SQL traffic.
Patch as needed.
Install nothing else. No mine-sweeper, no restaurant food ordering software, no adobe. Nothing will kill a server faster than turning it into a desktop. Don't try to do anything on it. Just let it be a server running SQL and you'll be fine. Don't plug USB drives into it.
You should be able to back up the SQL db every so often by stopping SQL and then starting it. Try to do this around the monthly patch cycle. Don't patch immediately when a patch becomes available, but rather wait a week. This will give Microsoft time to correct any patch issues they have. You'll be much more vulnerable to patch issues than to viruses if you follow the "don't turn it into a desktop" suggestion.
Hoist Number One and Number Six.
If you can't afford the infrastructure, rent a cloud server and charge it as recurring business expense.
Bow before me, for I am root.
Seriously, number one issue is quality hardware. If your power supply isn't regular, that's an issue. If you don't have proper ventilation and cooling? That's an issue.
Not enough information, you'll have something like failure possibilities in: the physical server, the VMs, the SQL Server instance, the Hard Disks, the hypervisor, the POS application, the queries, the backup process, the restore process, etc. All of them have tried and trusted solutions, but you need to establish what you're tackling first. If the answer is 'all of them' and you don't want to break it down and think about each item then you're better off pushing the problem out to an expert to manage it for you, you can take a range of hosted solutions or get someone to remote or local manage your infrastructure for you.
Need automatic failover? Use your two hosts to build a 2-node cluster and buy a simple NAS for iSCSI shared storage between them. They are usually cheap.
You could also skip the failover worries by setting up an active-active SQL cluster using hyper-v.
http://blogs.technet.com/b/meamcs/archive/2012/04/10/creating-an-active-active-sql-cluster-using-hyper-v-part1-virtualized-storage.aspx
If you build the servers yourself, you can keep the costs under $1k. (xeon 1230, asus P8B-M, iscsi capable nas)
MySQL is not web scale. He should use MongoDB. That is web scale.
Restaurant food ordering software or even time clock software may need a backend server for their own databases, or some stuff like the food system may tie into the main database.
While being a fan of pg myself I have read that some SQL Server admins have grown very fond of new PostgreSQL features. Master-slave replication might be what you need, though I haven't ever had a problem with pg not staying up (in contrast to my own applications).
Here's a nice intro/demo/presentation by Rob Conery: Five Things You Didn’t Know About PostgresSQL.
No idea how you could convert your installation to work with pg (all I know about SQL Server syntax is that it's very ugly compared to what I consider standard SQL; microsofty language extensions etc), but you could find some better POS software which would support pg, hopefully making you some savings in the process with lower licensing fees.
You can run MySQL Cluster on two machines. It's somewhat complex to set up. And your POS terminals have to be able to connect to either server. But it's available.
If you're getting more than one crash a year, you have hardware problems. Commodity hardware may be unsuitable for a restaurant environment. You may need an industrial-grade PC, with a broad operating temperature range and resistance to dirt, dust, grease, and water. There are PCs and enclosures for restaurants, and the fast-food industry uses them extensively. Every McDonalds, Burger King, and KFC outlet uses industrial-quality POS systems.
You wouldn't use a home-quality stove or a home-quality coffee maker in a restaurant. It wouldn't hold up. The same goes for a computer.
http://www.whentomanage.com/
Stop. Whoever makes the POS software is the expert that you want to consult. Call up their support and ask them what they recommend.
I would not immediately assume that moving it somewhere else will increase uptime; it puts uptime requirements on the Internet link(s) instead of on the server or software setup. Unless the present setup is quite unreliable or he has a surprisingly good link, I think that would likely be a worse problem.
Now, the idea that you can't afford multiple server nodes: Servers can be very, very cheap. For my home server I use an Acer Revo 3600 I paid 200 euro for; the closest available today seems to be http://www.amazon.com/Acer-VN281-2G-320-Linex/dp/B005WUXW1C (at about $220 including shipping.) Assuming you don't have a license cost problem, this allows you to create a cluster for a very low cost.
Apart from that, I'd analyse what your costs are for a failure, and what the odds of a failure are, and whether your tinkering increases or decreases the odds. I'd assume the odds were fairly small to start with; in that case, it may not make any sense to tinker with the setup to create something that is supposed to be more available. I've easily had several years of uptime on single systems; introducing complexity makes that harder, and if you lack the experience with how to deal with these systems, that's likely to increase the risk. (What happens if somebody starts your failover by mistake? What happens if both instances are running? etc)
For your particular use case, it sounds like I'd rather have a good alternative system for handling it if your system fails (pen and paper sounds good), and try to beef up the single machine - place it somewhere it won't have dust, vibration and heat problems, use multiple network cards to avoid risk of cable failure, use reliable disks & RAID, have a good UPS with monitoring, etc.
Doubting the existence of evolution is like doubting the existence of China: It just shows that you're uninformed.
If you can't measure it, you can't manage it. You haven't taken the first and most essential step in analyzing your problem: measuring it. Is your problem caused by network failure? By power? By software failure? Hardware? If hardware, by server hardware, disks, or something else?
If software, by OS, database, or application software? All of these have different solutions. Going to the cloud won't solve a network failure; it will make things worse. Going to the cloud may help with persistent hardware failures, but the MTBF of most decent hardware is pretty good, so are you sure you have clean power and a good (cool, clean) environment?
If your software or system is crashing, then that's its own problem.
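Measuring doesn't need anything fancy. A minimal sketch in Python, assuming you jot down each incident with a date, a cause and how long you were down; the four entries below are invented sample data, not anything the OP reported.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Outage:
    date: str
    cause: str          # e.g. "power", "disk", "network", "software"
    minutes_down: int


# Invented sample log: replace with your own incident notes.
log = [
    Outage("2012-03-02", "power", 25),
    Outage("2012-04-17", "disk", 180),
    Outage("2012-05-09", "power", 40),
    Outage("2012-05-30", "network", 15),
]


def minutes_by_cause(outages: list[Outage]) -> Counter:
    """Total minutes of downtime per cause."""
    totals = Counter()
    for outage in outages:
        totals[outage.cause] += outage.minutes_down
    return totals


totals = minutes_by_cause(log)
for cause, minutes in totals.most_common():
    print(f"{cause}: {minutes} min")
```

Whatever lands at the top of that list is the thing to spend money on first.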
First, getting everything into a VM environment is a huge win as far as manageability as it buys you hardware agnosticism.
The product description for VMware HA is exactly what you are talking about, but you do need a license to use it. I think you can get a barebones license for about $2k, which is expensive for a small shop, but it does buy you automated failover.
There are plenty of desktop class machines out there that are compatible with VMware, so assuming your hardware requirements aren't too high you could buy two of them and a license and some shared storage and still be under $5k, which IMO is very reasonable for a small business wanting failover. Just keep in mind that your shared storage then becomes a SPOF so you don't want to go with openfiler running on an old PC with a single SATA drive.
Another option could potentially be putting the server in the cloud, assuming you are not bandwidth constrained and your Internet connection is stable. In the cloud you just pay for the server and let them worry about uptime. On the back end, they have massive redundancy and your VM would recover from a failure inside 10-15 minutes.
This is a shameless plug for my company's software...
http://www.storagecraft.com
This will keep the entire machine (desktop or server) protected with great backup, with a downtime of a few minutes. In the worst case, if you have to go buy a new machine to run it on, it will restore to dissimilar hardware. Or even to a VM. And there's no need to take SQL down to back up. It takes care of all of that.
It will keep the machine safe and ready with a downtime of minutes, not hours or days.
Yes, what you want might cost a few thousand, but you can easily lease that instead of an outright purchase, and spread it over 2-3 years with no problem. Do it right, you will be happier.
Have you really ever had problems with SQL Server crashing before? What version? What kinds of workloads? Did you tell Microsoft?
Don't bother with clustering, use Mirroring or AlwaysOn Groups instead.
Backups with a PowerShell restore script can be pretty easy to use once you get them set up. As always, if you go the "backup" method, make sure you test your backups sometimes and your full restore procedure to make sure it works. It's not fun to find a problem with them when it really matters.
Replication is a pain to deal with. You could just use a failover cluster.
I don't have time to make a sig
LOL
MySQL is shit compared to SQL Server
... instead of trying to do it on the cheap through slashdot?
If you are not willing to put the money into the infrastructure, you are not going to get the infrastructure that you would have if you had put the money into it. There is no magic secret sauce that IT people have that turns low-budget implementations into ones that operate the same as well thought out, planned, paid for and implemented infrastructure. In other words, barring any greatness or incompetence of IT skills, you get what you pay for.
Plus, when you look at these infrastructure problems, don't look at it as how much money are you willing to spend, but how much you are not going to lose if your infrastructure is down. Make sure to be honest with how much it costs you if you are down and how often is it down? Through all of this, unless you are using low quality hardware, I am willing to bet that any downtime is caused by lack of power or the software failing.
Linux O Muerte!
I have never seen SQL fail all by itself. In my experience, by far the most likely point of failure is the hard drives.
I've seen pretty nifty Windows-based POS systems that had the SQL server start on each POS, look for a master and, if there was one already, try to be the slave. If that failed, it would just fail and try to become slave every few minutes. Apart from that, the POS software would just connect to the master-of-that-moment. Once the master went down, the slave would promote itself and the first POS that tried then became the slave. All POS terminals that were up would constantly replicate database files/dumps/backups, whatever, so they would never be too far behind the master. I don't know how this mechanism worked exactly, but it was pretty resilient against little restaurant accidents and power glitches.
I wouldn't want to directly advise to go cloud. If your uplink dies, so does your POS system. You could put backups in the cloud to prevent theft or arson destroying your accounting and books, but I'd not trust a single-uplink, high-latency service with your primary business myself. Even if you get business quality lines with a proper SLA, you can still be down for half a day easily and pay hundreds of dollars per month for just the single uplink. Getting two independent uplinks with this kind of SLA will be so prohibitively expensive that you could easily afford to do your own cluster for that kind of money.
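As a guess at the role logic (not the actual product's code, which I never saw), here is a toy Python model: each terminal takes the first free role when it starts or retries, and the slave promotes itself when the master dies. The Cluster class and terminal names are invented, and it leaves out replication and the hard part, which is terminals agreeing on who is master over an unreliable network.

```python
class Cluster:
    """Toy model of one master, one slave, everyone else a plain client."""

    def __init__(self) -> None:
        self.master: str | None = None
        self.slave: str | None = None

    def role_of(self, terminal: str) -> str:
        if terminal == self.master:
            return "master"
        if terminal == self.slave:
            return "slave"
        return "client"

    def join(self, terminal: str) -> str:
        """A terminal starts up (or retries) and takes the first free role."""
        if self.role_of(terminal) == "client":
            if self.master is None:
                self.master = terminal
            elif self.slave is None:
                self.slave = terminal
        return self.role_of(terminal)

    def fail(self, terminal: str) -> None:
        """A terminal dies; the slave promotes itself if the master is gone."""
        if terminal == self.master:
            self.master, self.slave = self.slave, None
        elif terminal == self.slave:
            self.slave = None


cluster = Cluster()
print(cluster.join("pos1"))   # first one up becomes master
print(cluster.join("pos2"))   # second becomes slave
print(cluster.join("pos3"))   # no role free, stays a client and retries later
cluster.fail("pos1")          # master dies, pos2 promotes itself
print(cluster.join("pos3"))   # pos3's next retry grabs the slave role
```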
I was promised a flying car. Where is my flying car?
Database Mirroring does not require shared storage and is fairly simple to implement. You just need a second instance of SQL Server installed somewhere.
My other signature is a car
Firstly, you need to make sure there is a paper process which people can run by if the kit fails. Business continuity doesn't always require a massive DR strategy especially in your market area. If the kit does go pop, your staff need to be able to work instantly - paper is the best and did for years before computers came along.
However from a technical POV, speaking from 15 years experience running SQL Server instances, there's no cheap solution that works reliably. Hot standby (HA mirror) is the best approach for your scenario. It's probably a good idea to host it in another DC to isolate larger failures.
Info here: http://msdn.microsoft.com/en-us/library/bb934127(SQL.105).aspx
If I were writing a POS system for anybody smaller than Hardees/Carl's Jr, I'd be using SQL Express. There aren't a lot of sweet options for HA there. At the end of the day, what does your vendor suggest/support? I'd go with that.
Postgre.
A couple cheap desktops ($500 each), raid 1 nas storage device (~$500), single drive backup for the NAS (~$200), and 2 gbps switches (~$30 each) you could have the bones for a low budget ha environment.
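Adding up the rough prices above, as a quick Python tally. The prices are the ballpark figures from this comment; SQL Server and Windows licensing are not included.

```python
# (quantity, approximate unit price in USD) taken from the list above
parts = {
    "desktop": (2, 500),
    "RAID 1 NAS": (1, 500),
    "single-drive backup for the NAS": (1, 200),
    "gigabit switch": (2, 30),
}

total = sum(qty * price for qty, price in parts.values())
print(f"hardware total: ${total}")
```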
Yes, get some decent hardware that won't give you too much trouble -- but equally important: Set up procedures to ensure that when your database is down, you can still get work done. Test those procedures periodically; make sure your staff can run the restaurant when the system is down.
Hardware fails. Software fails. Unless you're willing to spend lots (and you've said you aren't), you're not going to build and test something ultra-reliable. You don't want your entire business on hold (with a restaurant full of customers) because some part of your POS has decided to crap itself.
I was the Program Manager in the SQL Server team owning all of our availability products in Redmond (created the AlwaysOn program). My recommendation is to 1) keep it simple and 2) implement a layered approach. But first I have a basic question - are you trying to protect SQL Server only or do you also need to protect that application and hardware? Because before we dive into the details, it might make sense to take an entirely different approach than the lower level availability technologies. And you also want to consider what it is you are protecting against. If you want to protect against hackers as well as power outages, disk failures, etc (and I suppose you probably will want to do that!), then I recommend that the first thing you do is perform regular backups to a cloud provider. That gives you the ability to restore to a point in time prior to a malicious attack. And it gives you defense in depth. Then, if you want to protect the app overall, maybe you should consider making it something that can also be hosted in the cloud for the next layer of redundancy. That way if you completely lose the site you can direct people to the cloud enabled app. But this also retains the ability to run it locally as the POS solution. Next, I'd consider a way to keep the data synchronized between the POS local installation and the cloud solution running in the VM. The cheapest solution is to use log shipping which performs backup and restores into the secondary (here in the cloud). This is also nice since you need the backups anyway for the first reason stated and this automates it. You should consider using database mirroring (now called AlwaysOn in the latest incarnation in SQL Server 2012) for the data synchronization. It's integrated into the SQL Server engine and provides better performance and the ability to configure it for no data loss and auto failover using the synchronous option. It comes in Standard Edition (sync only) and Enterprise Edition (async and sync). 
Also cover yourself for the common failures locally. Use a battery backed UPS and consider RAID for your disks on the computer. RAID 5 is probably fine for POS. If you have any other questions, feel free to email me johnmatthewhollingsworth@gmail.com. Best. Matt
I recently inherited an application built on .NET, we're a Linux organisation. The devs had typically built it on SQL Server Express with not a care in the world, but it was a core business app.
We bought a single license of SQL Server standard, and put it in Master Slave replication mode. Not having touched Windows for years (as a server) it was a bit of faff to get an Active Directory setup going, but it actually works okay. You don't need to license the failover server for SQL Server.
If there's a failure, it's about 10 minutes on notification to flip the servers over and a bit of manual intervention. You can cut this down by buying a third box to use as an observer, but that seems to be another SPOF.
I've worked with large (3000+ servers) IT environments, and small (50 servers) environments. I'm used to a typical server from Dell or HP, plus VMware licensing and MS licensing, SAN storage, offsite tape backup, remote mirroring, large battery UPS, diesel generator, Liebert CRAC, command-center monitoring, etc. to end up costing many thousands of dollars. For a large corporation, it's worth it. But for a really small shop (eg, like this one, needing only a single server supporting a single db, maybe some email and web hosting...) I have to wonder if cloud services wouldn't be the way to go. Certainly, I know that clouds have gone down (Apple, Amazon and Microsoft at various times) but surely the uptime is greater than a server in the back room of a restaurant? I just checked MS's website and see that a 2 GB Azure database is only $13.99/mo, and 52 GB of data transfer is $6.24/mo. That's pretty cheap for redundancy, backups, load balancing, etc. (I have no idea how big, and how much bandwidth a POS db requires, sorry.)
First of all whatever you do don't listen to those saying "to the cloud". When your ISP/Internet is not working it means your POS is not working which means your business is fucked -- at the total mercy of a working Internet connection. Most POS software is not optimized to minimize round trips either so expect pushing database to "the cloud" to be much slower.
The lowest-cost, easy-to-configure-and-manage approach is to set up a periodic backup task to back up the entire database every few minutes and copy it to the desktop machine, a USB stick, etc. With this you can quickly restore the backup file to SQL Server on the desktop if needed.
There are cooler and more automated approaches but they normally require another server or two.
With this approach you tolerate xx minutes average of lost data on failure and must take manual action to recover, but copying the backup files is easy and cheap. Alternately, you can do differential backups of the transaction log if your database is big enough to where it would matter, and tolerate many fewer minutes of data loss.
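To fill in the "xx minutes": a small Python sketch, assuming the failure is equally likely at any moment between two backups and that the backup and copy take no time (both simplifications). The 10-minute interval and 30 orders per hour are invented example values.

```python
def data_loss_window(backup_interval_min: float) -> tuple[float, float]:
    """Return (average, worst-case) minutes of lost data for a backup interval."""
    # Uniform failure time: on average you are halfway to the next backup.
    return backup_interval_min / 2, backup_interval_min


def orders_at_risk(minutes_lost: float, orders_per_hour: float) -> float:
    """Orders that would have to be re-entered from paper tickets."""
    return orders_per_hour * minutes_lost / 60


# Invented example values.
average, worst = data_loss_window(10)
print(average, worst)
print(orders_at_risk(worst, orders_per_hour=30))
```

If the worst case is a handful of tickets you can re-key by hand, plain periodic backups are probably good enough.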
Other approaches set up log shipping to a read only and promote the read only manually on failure. It requires two servers.
Database mirroring between database servers is another solution and it is awesome, but it requires three servers. You don't lose any transactions and it is 100% automatic. If the POS can reconnect to the database when the connection drops, it will be able to fail over.
Obviously having redundancy at the system level (mirrored disks) is important too. Having a transaction log and not just the "simple" recovery model is just as or even more important, as it allows restore to a point in time if your POS goes haywire, is sabotaged by a disgruntled employee, or the vendor botches an upgrade. Not all failures are system/physical; recognize this before you learn the hard way.
Without thousands of dollars spent on fancy shared storage infrastructure, virtual machines do nothing to help you increase database uptime. They increase your resource requirements and restore time.
You can migrate VMs to your heart's content, but the data contained within is all that really matters here. Migrating a transaction log or backup is easier, faster and less resource-consuming than migrating an entire virtual machine containing the same, or any differential scheme at the VM level.
In summary, my advice: if you don't want to spend money on more servers, forget about VMware. Install SQL Server on the other (desktop) system and set up a script to periodically copy backups to the workstation so you will have them if you ever need them, but see the above comments.
Frankly, if you don't value your uptime worth x why bother? There is no magical pill and if your current infrastructure fails to handle existing load there's nothing you can prudently do without shelling out additional money (or time - which IS money).
If it all seems to work all right, just make sure you back-up and monitor your existing hardware for possible failures.
Nitrosphere has some tools that are designed to help with SQL Server including down time.
http://www.nitrosphere.net/
From the OP's post:
- mid-sized restaurant with 2 POS terminals.
- Windows-based POS software which uses .NET and MS SQL Server on the backend.
- Upgrade coming up, which is prompting the question of how to make it more fault tolerant.
So, some un-answered questions/information needed:
- What is your current setup? How much hardware do you have right now?
- What failures have you seen? What have you done to address that right now?
A common mistake when people use VMware, or virtualization of any kind, to mitigate failure is to put it all on one box.
Assuming that you have the following:
- 2 x POS terminals running the POS client software connecting to the LAN
- 1 x MS SQL server
- 1 x POS server software (potentially installed and running on the same server as the MS SQL server)
So, depending on what you want to mitigate, you can take several approaches.
If the MS SQL server software/application is what has been crashing, you need to determine why it is crashing before trying to architect around stability factors. Ie, is it crashing because of the database server running out of memory? out of disk space? over heating? just segfaulting? Is the box fully patched up? Could it be infected? Could the hardware be faulty?
If the physical box has been crashing, you need to find out why it is crashing. Unreliable hardware? Overheating? Power issues?
Basically, knowing where the instability originates from should be your first concern. Mitigation originates from there.
So, assuming the hardware is fine, the software is running in a stable fashion,the POS software works fine, etc. And there aren't actually issues, but you want to head off problems. In which case:
Get a new box which will comfortably run both the database and the POS server software. Considering this is a restaurant with only 2 POS terminals, this should NOT be a problem, but then again, MS always requires more resources.
Suggested hardware:
2 servers:
- 4 cores, 2 GHz+
- 16GB memory (presumes 6GB of memory per virtual machine, get 32GB if you want to go more. memory is cheap).
- mirrored raid (software or hardware) of at least two drives large enough to support the operating systems, software, and database growth.
You can either install ESXi on both servers and then build VM(s) atop of them. ESXi will let you have a rock solid platform to run the VM(s) on. Though you will need to have additional drives, one set for the ESXi OS and one set for the datastore. The other option is to run Windows as the base OS and then install VMware Server on top. This is less efficient, but might be easier for you to manage.
Server #1:
- POS server software VM (4GB) (active)
- MS SQL Server Database VM (8GB) (primary / master )
Server #2:
- POS server software VM (4GB) (inactive, a ready-to-run snapshot copy of the active copy, made periodically)
- MS SQL Server Database VM (8GB) (secondary / slave / replicated from master )
In this manner, if the POS server dies, you can stop that VM and start up the other one. Or, you can just bounce the failed POS VM as the reboots can be fast, compared to bare metal reboots.
If the POS server software can handle primary/secondary database connections, then you can have the two MS SQL Server database VM(s) setup in master/slave mode.
Periodic backups between the two physical servers would be ideal. You can have a backup VM on each server and have the databases backup to the other server.
Note, proper setup of this kind of environment would require someone who understands MS SQL Server, your POS server software, and virtualization. You could even set up a Linux VM on each server running Keepalived to share a virtual IP, along with HAProxy, so that automatic failover can be done transparently.
Ultimately, it all depends on what fault issues you are trying to avoid. To make the most of your hardware, you should consider the following:
- trim down your Windows OS installation to reduce services and memory footprint.
- tune your MS SQL Server to use less memory and periodically tune/cleanup
It all comes down to what resources(people, hardware, software, etc) you have access to and are willing to commit.
I've also worked for numerous large restaurant chains in a consulting capacity and understand the unique needs here.
My suggestion within your parameters is to have two SQL instances going, and this can be on commodity hardware like desktop, and use transaction log shipping and DNS-based failover. You will have to script a utility to handle the database failover from DNS, however; but this is the poor man's solution.
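A rough idea of what that failover utility could look like, sketched in Python. It probes the primary's SQL port (1433 is the SQL Server default) and calls a failover hook after several consecutive misses. The FailoverWatchdog class and the hook are invented names; the hook is where you would bring the log-shipped copy online (RESTORE ... WITH RECOVERY) and repoint the DNS record, neither of which is shown here.

```python
import socket
from typing import Callable


def tcp_probe(host: str, port: int = 1433, timeout: float = 2.0) -> bool:
    """True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class FailoverWatchdog:
    """Calls on_failover once, after `threshold` consecutive failed probes."""

    def __init__(self, probe: Callable[[], bool],
                 on_failover: Callable[[], None], threshold: int = 3) -> None:
        self.probe = probe
        self.on_failover = on_failover
        self.threshold = threshold
        self.failures = 0
        self.failed_over = False

    def tick(self) -> None:
        """Run one check; call this every N seconds from a loop or scheduler."""
        if self.failed_over:
            return
        if self.probe():
            self.failures = 0          # one good answer resets the count
            return
        self.failures += 1
        if self.failures >= self.threshold:
            self.failed_over = True
            self.on_failover()


# Demo with canned probe results instead of a real server.
answers = iter([True, False, False, False])
events: list[str] = []
dog = FailoverWatchdog(probe=lambda: next(answers),
                       on_failover=lambda: events.append("repoint DNS to standby"))
for _ in range(4):
    dog.tick()
print(events)
```

In real use the probe would be something like lambda: tcp_probe("pos-db-primary") with your own host name. Requiring several misses in a row keeps one dropped packet from triggering a failover, and failing back is deliberately manual.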
However, the best option is to purchase shared storage (cheap: a NAS appliance with an iSCSI target; QNAP, Synology, etc. support this in their $150 bare enclosures) and use synchronous database mirroring in high safety mode. However, you will need to configure an Active Directory domain to get failover to be automated, and then place the systems on it. Once that's done, you have an elegant failover solution that is fully automated. And cheap.
Note that Standard MS-SQL includes synchronous database mirroring, but Enterprise includes asynchronous -- which is overkill for your needs in my opinion.
You could install the free Hyper-V server on two desktops with multiple NICs and get a live migration scenario set up. That would cost you a desktop (you will want the desktops to be very similar), two NICs, and a small NAS with iSCSI support (e.g. Netgear ReadyNAS or similar). I don't *think* such a setup would cost you any more in licensing since you already own them, but you should check. If you wait for Hyper-V 2012 you won't even need shared storage -- a network share will do.
The FIRST thing I would do would be to find some POS software where the terminals switch to standalone mode WHEN the server/network goes down (not IF). Then I would work on getting the server and network to a point where that is necessary less often.
We should start referring to processes which run in the background by their correct technical name... paenguins.
Restaurant POS systems aren't exactly cutting-edge technology. There's gotta be a kazillion commercial systems out there and lots of pros to install and manage them who have set up and maintained hundreds of these systems. There's probably even some sort of restaurant owner's organization that can recommend systems and consultants. Why are you spending time on a tech site wondering about how to roll your own POS infrastructure when it provides no competitive advantage to your business and any screwups could cost big money? Spend your time worrying and working on stuff that will actually help your business compete with all of the other restaurants out there. Not that I know much about the restaurant biz, but I'm going to guess that getting and keeping good cooking staff, waitstaff, etc, getting quality supplies at a good price, and marketing the place and other restaurant-y things belong much higher on the owner's worry list than what hardware and software the POS systems are using.
I don't reply to ACs
By this time you have realized that 98% of Slashdot posters don't know a damn thing about the issue that you need to resolve (... the cloud? .... Really ??? ) and are just flinging buzzwords (monkeys... poo ... ) or asking questions that you won't know the answer to in hopes that this will get them off the hook in actually answering your question.
Short and sweet - you want database mirroring with automatic failover. You can set up a second SQL server on a separate machine (cost less than $500 for the machine) to be the mirror and if your primary machine fails then you are still golden. Here is an article that gives you an idea as to how to do this in MS SQL '08 :
http://www.databasejournal.com/features/mssql/article.php/3828341/Database-Mirroring-in-SQL-Server-2008.htm
Yes, you should hire a **competent** DB consultant to do this for you. Yes it will cost you another $800 - $1000 to have a **competent** consultant do this for you (figure 8- 12 hours work at 80 bucks an hour) - you will lose far more than that the very first time your database fails and/or you attempt to do it yourself and blow away your database because you made a mistake (you do have backups , of course.... right ??? ).
You can try to do it yourself but I do not recommend it as it's risky.
I've been doing DB work for 25 years - feel free to send me a Slashdot message should you desire to use my services.
----- In Your Cubicle No One Can Hear You Scream...
You're overthinking the issue. Just build yourself a high-quality desktop with onboard SATA RAID capability, mirror or stripe a redundant array, then stick that setup on a good-quality UPS rated to handle your rig for at least half an hour. You'll have five nines, barring theft, vandalism, or acts of deities, and you can do it all for under $1k.
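To put "five nines" in perspective, here is the plain arithmetic as a small Python sketch. Nothing in it comes from the poster's setup; it just converts an availability percentage into allowed downtime per year, assuming a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted at a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(pct)
        print(f"{pct}% availability allows {minutes:.1f} minutes/year")
```

Five nines works out to a little over five minutes of downtime a year, so a single reboot for patching can use up most of that budget. Whether a restaurant needs that target at all is a separate question.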
Let's just say you're thinking of moving to Windows Server 2012 and running the non-Enterprise version of SQL 2008 or newer.
Your license allows you to do an active/passive cluster as long as the 2nd node is for failover only. You need some kind of shared storage, but that can be iSCSI. If you move to SQL 2012, it can be a file server (but they intend that your file server is highly available).
Hyper-V replication is an option with Server 2012. (Intended for offsite disaster recovery)
Simplest option
2 servers running Hyper-V. Run SQL in a guest VM that replicates from primary server to secondary server. Fail over is manual, but recovery time is minimal. The SQL guest is still just one OS, so patching and reboots need to be scheduled.
Simple Cluster using a VM as shared storage:
1 server running Hyper-V. Set up one VM as an iSCSI target that hosts the data files. Make it Core if possible. Then cluster SQL on 2 more VMs and use that iSCSI target for shared storage. You can now fail over between the SQL guests automatically and use cluster-aware updating. Don't forget you can add a 2nd server and replicate all 3 VMs for additional protection.
Storage spaces for shared storage:
The new storage spaces feature allows you to get much cheaper disk arrays. You can even just load up a server with a bunch of disks and use storage spaces to serve it up over SMB or iSCSI.
So Server 2012 opens a few options for you. Openfiler also adds that iSCSI option and it's open source.
Get decent hardware (someone else has mentioned HP already, but if you prefer Dell I wouldn't blame you too much). RAID yes, dual PSU maybe if it doesn't add hugely to the cost. Apart from a DOA, I've only known one server-class PSU die and that was in a box over 6 years old so I don't tend to bother now.
Have someone who knows what they're doing set it up, with all the little tweaks that an experienced person will do (or avoid).
Those two by themselves will get you almost all the way to maximum uptime.
Add a copy of StorageCraft ShadowProtect or similar imaging software (if any such exist), a NAS box for it to save images to and have a decent workstation (plenty of RAM in particular) that you can install VirtualBox on. That way if the server dies you can mount a very recent image of the server in a VM as a temporary thing until you can fix the main box.
Not free but not as spendy as a multiple-server failover system and you can use the workstation for other things the rest of the time.
MySQL is not web scale. He should use MongoDB. That is web scale.
2011 called and wants its advice back!!
Get with the times, he should use MemSQL
You are over thinking this. You really don't need VMWare in this case. Just use merge replication from your server to the desktop. If/when your server crashes, you will have an up to date copy of your database and can easily go on from there. The process is rather simple and you don't need a full copy of SQL on your desktop - the free SQL express version will do just fine. This is how I keep an up-to-date copy of all of my remote servers [115+] and they all replicate to the corporate servers that are kept on VMWare boxes [again, something you don't really need to do]. A free solution using the tools you already have!
I administrate POS systems, the systems I built have served 2 million+ people so far this year. In general as some other people have said, I don't think you are asking the right question. It would help to know which POS software you're using.
If your SQL server is currently having downtime, WHY is it having downtime?
Is it hardware? Buy a newer system, nothing very fancy is needed, preferably a dual-raid system, for OS and data, but one raid to rule them all can work too.
Is it software? This is what seems likely to me. A newer system may not help. Restaurant POS systems aren't the most reliable in my experience. If the database is crashing or slow, it might have too much data in it for the POS software to cope with. Archiving the old data, or getting a fresh start & rebuilding all of the items might give you a much smoother system. (And yes, its a painful thought to break with sales history)
Failover is a good thing, but you need to know why the first system fails before you can be sure a second system will help. Vendors like to blame other systems and say things like "the database crashed", rather than their own product. Yet it was their product that fed bad data into the database, and then pulled bad data back out and crashed as a result.
Do you have a pen & paper system so that your staff know how to write a ticket, and a way to take manual imprints of credit cards? A high tech solution isn't always the most cost effective or even necessary depending on how many outages you have had and what their impact has been.
I would not push your main server to the cloud. Just make sure it is in a safe cold place, that stays inaccessible to most people and is nowhere near the kitchen. Overall I think you'll get more reliability from 1 good server than trying to make 2 mediocre servers failover smoothly.
Use a "mirror" computer: a duplicate of the server that runs in parallel with the principal, so that in practice you have two computers operating as one. If one fails, you have the other one operating normally, giving you time to fix what failed.
Religion: The greatest weapon of mass destruction of all time
There is no cheap magic bullet; if there was, everyone would be doing it. You will either pay for licensing, pay for hardware, or both. Clustering is usually a nonstarter due to the expense of a SAN; if you get a cheap SAN then you still end up with a lousy single point of failure. SQL replication may work but the POS software may or may not work under that configuration and the fail-over may or may not be automatic so it's a real crap shoot. Your best bet is a single quality server, minimize the crap you install on it, preferably just SQL, get a solid properly rated UPS, and make sure it is all set up properly. You will get great uptime. A mismanaged cluster is much more liable to cause downtime than a properly cared for single instance server.
Nuclear war would really set back cable. - Ted Turner
It's extremely unlikely that any common point of sale system is going to be written to use MySQL. Tools to write to SQL Server or SQL Express databases are easy, common and built into all the later versions of VB/DotNet, and are commonly taught in basic programming classes. Micros and Aloha probably have programming staff able to write to any major DB, but that's why their terminals cost $6,000 and more each. A budget solution generally employs budget programmers.
"Think about how stupid the average person is. Now, realise that half of them are dumber than that." - George Carlin
Replication is really easy to deal with - I 'get' to work with this every day with hundreds of replicated SQL servers and it is bullet proof. I can set up a new server in minutes and not have to worry about it until something bad happens with the server. If the subscriber/publisher loses power, with merge replication it will pick up where it left off when power is restored.
2 servers running at the same time. Daily backups. If one fails, change the connection at the app level (ie: server 1 to server 2). Any POS app can switch the server it's pointing at in no more than a few steps.
I don't respond to AC's.
Actually, the way most vendor apps are written, the logic is all in the front end / web code and the database is just a big storage space. Switching from SQL Server to MySQL is just a connection string update in your config file. (www.connectionstrings.com)
I don't really agree with that approach, but I see the benefits of it from the standpoint of selling to companies that use different database platforms.
Says the man that has never had to administer a stable full of MSSQL servers.. I still have nightmares....
Do not look at laser with remaining good eye.
If the server is going to be running inside a virtual box anyway, why not consider Linux for the host OS?
In particular, it has a nice solution called DRBD (Distributed Replicated Block Device), which can replicate block devices (like partitions) over the network and keep them in sync (think of it as a sort of RAID-1 over the network, which also nicely handles all the dirty behind-the-scenes work of tracking changes and keeping the block devices in sync). Thus at any point in time, the main server and the mirror server contain exactly the same data. It's integrated into the mainline kernel and has wide "in the wild" adoption (it's not a small project used by two labs in one university). It is thus also nicely integrated into lots of management software (for example: pacemaker/heartbeat).
Most virtualisation layers (VMWare, Xen, etc.) can use block devices directly as a store for their images. (Traditionally this is a block device provided by a SAN fabric, but a block device replicated to the other server over DRBD works exactly the same way.) You can thus have the host software (VMWare server, Xen hypervisor, etc.) running on both physical servers, with the main server running a VMware image with the Microsoft SQL service inside, and the other just sitting idle, with its virtual host on stand-by.
- If hardware maintenance is necessary (say a SMART monitoring tool signals that one HDD is about to fail and needs to be replaced soon), most virtual hosts (VMWare server & Xen hypervisor) support "live migration": the virtual guest jumps from one machine to the other with almost no interruption, as long as the block device is available on both machines (traditionally done by having an expensive SAN fabric to which both servers are connected, but DRBD in dual-primary mode works nicely too). Users see no interruption of service, and the now-idle machine can be taken down for the necessary maintenance.
- If the current server crashes due to some big hardware failure (say the HDD dies without any forewarning), the mirror server contains the latest up-to-date copy of the same data, thanks to DRBD. The VMware image can be started from there. It will look exactly as if the Windows virtual machine simply had its power interrupted and was rebooted from a NOT-cleanly-shutdown state. Service resumes rather fast, without the emergency maintenance having to be done on the downed server first.
DRBD will handle all the necessary resyncing once both servers are back online.
Last but not least, both DRBD and the various popular virtualization solutions are nicely integrated with common administration tools (like pacemaker/heartbeat). It is thus possible to automate some scenarios and make the others rather trivial to carry out with minimally trained supervision.
All in all, it's possible to build a good redundant solution with off-the-shelf or cheap server parts, without shelling out multiple thousands of dollars for a more expensive solution. (In particular, DRBD lets you do away with the expensive high-availability SAN fabric: you only need the "two cluster nodes" (which can be either beige boxes, or the cheapest server from your favourite brand), without the "some shared storage, or some such".)
Also, interesting part: There's only 1 instance of the service running simultaneously (the virtual guest), so you don't need 2x the licenses for every piece of software.
Don't forget to do snapshotting inside the virtual server (so you can roll back to the latest working snapshot in case the virtual windows system gets corrupted).
Now throw in some software RAID-1 or RAID-5 on at least one of the nodes to better survive a HDD crash and the system starts to be rather robust.
Think about back-up, namely:
- history. If any mistake happens, how can you go back to a few days ago? (For example does the windows software feature a way to do snapshots? or will you simply use the snapshot feature of the virtual host ?)
- remoting. If the restaurant burns down, how can you get your data (spe
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
You say you don't have the money for shared storage and HA, but I have to assume you mean doing that at the Windows level.
You might think of implementing all that on Linux & Xen and then making those now-redundant resources available to the Windows VM. You can get a few older servers for a song off some local IT shop that's upgrading - just make sure the CPU has hardware virtualization and probably none of the other specs will matter.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
Microsoft SQL server is a fine product but like Oracle gets real expensive real fast...
OpenBravo POS and LemonPOS are both great open source POS solutions that have commercial support available. Also, Xymon can be used to monitor windows and/or linux service or executables, notify on downtime and restart or perform other scripted operations.
http://www.openbravo.com/product/pos/
http://sourceforge.net/projects/lemonpos/
http://xymon.sourceforge.net/
[RIAA] says its concern is artists. That's true, in just the sense that a cattle rancher is concerned about its cattle.
Both are incredibly irritating and near impossible to talk with online.
Both deny well established scientific facts and evidence in order to fit their worldview.
Both try to make their existing worldview fit the facts, rather than admit that their interpretation was wrong and adapt their worldview accordingly.
Moderate Christians like myself, who see no conflict between Christianity and science, and moderate atheists I know, who have nothing against religion and are simply unconvinced by the evidence for theism, get lumped in with the fundamentalists and their arguments and questions summarily dismissed by both sides.
I blame the Greater Internet Fuckwad Theory. Not every Atheist is Joseph Stalin and not every Christian is Eric Rudolph. Stop lumping everyone into those two camps.
Christian fundamentalists, stop going full retard and cherry picking what science you like. Scripture deals with the things we could not figure out by ourselves, like the Trinity. Science rests its presupositions on Christian philosophy, that the universe is orderly, understandable, and can be understood mathematically. Remember the words of Robert Boyle, “From a knowledge of God's work we shall know Him.”
Atheists fundamentalists, deal with the fact that the last 50 years of Biology and Paleontology has raised legitimate objections to Darwinian theory that need to be dealt with. The "Monkeys typing Shakespeare Theorem" doesn't cut it and everyone knows it.
Hand-waving and just-so stories don't convince either side and, if either side was so sure about their position, would not use them. My time is better spent discussing these things in person where both sides are far more sensible and civil.
Good night.
You have some advice on mirroring/RAID. Be sure to address your points of failure/threat:
- HD failure: use SSD(s) rather than a hard disk, mirrored if possible.
- Power glitch: get a UPS or run on a locked-up laptop.
- Virus: no internet connection (do not share LAN with any internet-connected PC); remove USB connections and disable them in the OS.
- Equipment theft: take daily backups home or to a safety deposit box. Keep 1 for every day of the week + 1 for every week + 1 for every month.
- Lightning or motor surge: good surge suppressor(s).
- Equipment failure: have duplicates available, immediate for low-cost, at shop for hi-cost; replace failing/old stuff.
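The "1 for every day of week + 1 for every week + 1 for every month" rotation can be sketched in a few lines of Python. This is only one reading of that rule, with assumptions: "every day of week" is taken as the last 7 days, weeks are ISO calendar weeks, and the newest backup in each week and month is the one kept. The function name is made up for illustration.

```python
from datetime import date, timedelta


def backups_to_keep(dates, today):
    """Return the sorted backup dates to keep under a daily/weekly/monthly rotation."""
    keep = set()
    newest_in_week = {}
    newest_in_month = {}
    for d in sorted(dates):
        # Dailies: anything taken within the last 7 days.
        if 0 <= (today - d).days < 7:
            keep.add(d)
        # Dates are visited oldest first, so later dates overwrite
        # earlier ones and each bucket ends up holding its newest backup.
        iso = d.isocalendar()
        newest_in_week[(iso[0], iso[1])] = d
        newest_in_month[(d.year, d.month)] = d
    keep.update(newest_in_week.values())
    keep.update(newest_in_month.values())
    return sorted(keep)


if __name__ == "__main__":
    january = [date(2024, 1, 1) + timedelta(days=i) for i in range(31)]
    for kept in backups_to_keep(january, date(2024, 1, 31)):
        print(kept.isoformat())
```

Anything not in the returned list is a candidate for overwriting, which keeps the number of drives or tapes you carry home bounded.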
Epitaph: At last! Root access!
"My wife and I own a mid-sized restaurant with a couple of Point of Sale (POS) terminals. The software, which runs on Windows and .NET, uses SQL Server on the back end"
Get rid of the backend and run the software directly on the POS terminals in a peering arrangement. That way each POS terminal provides backup for the others.
Uh....if you can't afford to buy decent hardware then why are you playing with MS SQL Server? Oh, that's right, you spent the money on licenses. Anyway, if you want fail-over you need to set up the server properly using the required hardware. You can't have something for nothing. Seriously, you used a fighter jet when all you needed was a Cessna. Your platform choice reeks of Microsoft hammer syndrome. Whoever developed your system knew Visual Studio .NET and saddled you with this costly burden.
If it were me I would use MySQL, automated backups, and an image or VM standing by with extra commodity hardware. Hell, just use this scenario with your current setup.
Another thing, I hope you're using quality hardware even if it isn't redundant. Or is this a Wal-Mart setup?
I object to power without constructive purpose. --Spock
You are just running one restaurant, you don't need an over-technical solution.
Let's get realistic here. The best fallback for a restaurant is to train your staff how to use a manual (pen & paper) system in case there is any technical fault. The skills to use a manual ordering system in a restaurant are very important. Have a calculator at each register and ensure that there is a way that the tills can be opened by the person in charge at the time without the application. Your security system to watch over the tills should be independent from the servers used for your register software so that nobody gets free rein over the tills if the server goes down.
If you really insist on redundancy from a technical level, use two physically separate servers. No dual PSUs, etc.; those won't protect you from all faults. Only the HDD should be in a RAID mirror (not just because they are unreliable, but because it is a lot of work to rebuild a server from scratch if a drive fails), and only two servers (geographically separate if possible) will truly protect you from whatever goes wrong, so instead of buying a really expensive $10,000 server that has all the redundancies built in, buy 2x $1,500 servers (total $3,000) that don't have this but are both powerful enough to do the job.
Then set up whatever failover you like, depending on your application requirements. Maybe the register application doesn't play nice with SQL clustering or having two instances installed on the same network - I don't know your application. Maybe you have a parallel install of everything on the second server, and the SQL database mirrors/replicates every half an hour. Maybe you use VMware failovers. Maybe you have a Cold Server (Microsoft don't charge license fees for a Cold Server so this will only cost you for hardware) and if the server goes down you have a set procedure for your staff: if they have a problem, they follow instructions to switch off the Main server and switch on the Cold server.
If you have the Main server constantly backing up the SQL Database to an external HDD, you could also include in your instructions to pull out the plug with the "X" sticker from the front of the case and plug it into the box in the other room where it has an "X" on the similar looking box (really dumb it down for them) and then press "Y" (the power button) and have a script set up that automatically restores the database from the external drive when the server loads. The scripting for this would be so basic it would take you 5 minutes of looking up how to make a job in task scheduler, how to make a script run at startup and how to use the SQLCMD command to backup/restore a database.
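As a rough idea of what that startup script could look like, here is a Python sketch that only builds the `sqlcmd` command line for a restore. The server name, database name and backup path in the example are hypothetical placeholders, not anything from the poster's setup. `-S`, `-E`, `-b` and `-Q` are standard `sqlcmd` switches (server, Windows authentication, stop with an error code on failure, run a query and exit), and `RESTORE DATABASE ... FROM DISK ... WITH REPLACE` overwrites whatever copy of the database is already on the cold server.

```python
from typing import List


def build_restore_command(server: str, database: str, backup_file: str) -> List[str]:
    """Build the sqlcmd argument list that restores `database` from `backup_file`."""
    safe_db = database.replace("]", "]]")      # escape for the [bracketed] identifier
    safe_path = backup_file.replace("'", "''")  # escape for the N'...' string literal
    query = (
        f"RESTORE DATABASE [{safe_db}] "
        f"FROM DISK = N'{safe_path}' WITH REPLACE"
    )
    return ["sqlcmd", "-S", server, "-E", "-b", "-Q", query]


if __name__ == "__main__":
    # Hypothetical names: adjust to the real instance, database and drive letter.
    cmd = build_restore_command("localhost", "POS", r"E:\backups\pos.bak")
    print(" ".join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` from a script registered to run at startup would do the actual restore; test it on the cold server before relying on it, since `WITH REPLACE` is destructive by design.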
Sure there would be manual steps and there is a small amount of downtime during those manual steps, but you will save thousands in not having to relicense Windows Server, SQL Server (if you are not using Express) and CALs.
relinquishing my mod points for this post.
(Background: I've dealt with the custom software business in a different industry; I have no restaurant experience outside of watching every single episode of stuff like Kitchen Nightmares--US and UK. My industry does operate in a manner similar to how restaurants would seem to function.)
The biggest mistake I see out of smaller businesses is that they believe they need massive amounts of redundancy. Will people die if your computers go down? Yes, spend the money. Is your business actually at risk of failing due to two days of your computers being down? Yes, the expense is justified.
So what is this software used for in a restaurant? Taking orders and distributing them to the appropriate kitchen stations. Maintaining inventory. Helping manage re-ordering. Maybe it extends to timekeeping for employees. Less likely to HR and accounting beyond that.
If you fall into this category, then you just go to paper if the computers fail. Keep a supply of paper ticket books around. Make sure you always keep a copy of daily reports of sales and inventory (either a hardcopy or a PDF or otherwise exported copy on disk on another system). If you have to go to paper then once you are back online you simply bring your inventories back up to date. If you have to place an order while your system is down, go off your last inventory report and your sales since then to do a replenishment order.
Not having technical staff on hand shouldn't matter. If it fails, they go manual then they call you or whoever their normal chain of support would be. You reconcile things as needed once the system is back online.
The world will not end if you have to rely on paper and calculators for a day or two.
Now if you really have the money to spend and really feel that your business is in serious danger by losing your computers for a day then go ahead with the expense.
Many posters above covered the basics at this point. The high end solution would be full redundancy of hardware and software. Dual servers with dual everything, live replication, etc. This is by far the most expensive route, but of course the most reliable and lowest down-time.
My choice?
We're talking about a restaurant. I can't possibly see this having a serious horsepower requirement. I would indeed have a basic server, but I would run the database (and server portion of your POS system, if it has one) inside a VM, probably a free or one of the less expensive versions of VMWare. I would regularly shut it down and copy the VM to either a second (inexpensive) server or, better yet, to a manager's desktop PC that has enough RAM to run the VM. Regular database dumps would be copied as well.
If it fails, you bring up the VM copy, load the last DB and you should be good to go.
Not having technical staff on hand shouldn't be an issue. You're a restaurant owner, and if the business is doing well at all you are probably working 80+ hour weeks and on call at a moment's notice any other time that the business is open. If you really can't, then get yourself a regular computer consultant who knows your recovery plans and is on-call as needed. It may cost you a couple hundred a month to have one basically on retainer, but that's vastly less expensive than hiring dedicated technical staff.
Don't forget things like firewalling your servers if at all possible, and keeping at least one spare of your client PCs ready to swap in (particularly your receipt and ticket printers!).
If your .Net application allows you to configure the database connection string to specify a Failover Partner, then you could set up SQL Server database mirroring, either in High Performance or High Availability modes.
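For reference, a mirrored connection string has roughly the shape built by the small Python sketch below. `Data Source`, `Failover Partner`, `Initial Catalog` and `Integrated Security` are real SqlClient connection string keywords; the server and database names are hypothetical placeholders, and whether your POS vendor exposes the connection string at all is something only they can tell you.

```python
def mirrored_connection_string(principal: str, mirror: str, database: str) -> str:
    """Build a SqlClient-style connection string that names a failover partner."""
    parts = {
        "Data Source": principal,      # the principal (normally active) server
        "Failover Partner": mirror,    # tried when the principal is unreachable
        "Initial Catalog": database,   # the mirrored database
        "Integrated Security": "True", # Windows authentication
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


if __name__ == "__main__":
    # Hypothetical machine and database names.
    print(mirrored_connection_string("POS-SQL1", "POS-SQL2", "POS"))
```

The client only uses the failover partner to find the database after a failover has happened; the failover itself is still done by SQL Server.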
Laissez lire, et laissez danser; ces deux amusements ne feront jamais de mal au monde. - Voltaire
MEMTEST the server for 24 hours. If you see any errors - it may be a hardware issue.
As a starting point, I would recommend virtualizing your servers (i.e. one or more physical Hyper-V servers running multiple virtual servers) in which each server only has a single role. So your domain controller is a different server from your SQL Server, and your app server is another server altogether. When any server has to be updated, the others stay up, as you have OS-level separation. The granularity also gives you flexibility in terms of how many physical servers you run and also allows for seamless upgrading (you can add a physical server and then just bring a virtual server down on one machine and back up on another... takes a few seconds of downtime).
Not if 'We can't afford several thousand dollars' is on your requirements list. The price difference makes up for a lot of things...
OR MORE... 99.999% uptime on Windows Server 2003 + SQLServer 2005.
* Hey, if it works in a high tpm environs "bulletproof & bugfree" with SOLID UPTIME? It'll work here...
APK
P.S.=> Of course, NASDAQ uses "failover clustering" (which is what I would have suggested - "DO WHAT NASDAQ DOES") too...
THAT might "violate" the article poster's co$t constraints, but if it works for NASDAQ in a HIGH TRANSACTION PER MINUTE (tpm) ENVIRONS? It will work for his restaurant... apk
If I were on a tight budget, here is what I would do:
Purchase two linksys Gig-E switches, and make sure that half of my POS terminals were running to one, and the other half to the other switch. Purchase a couple of desktop computers on ebay with enough RAM for your requirements, two hard drives each (so you can do hardware RAID) and a hardware RAID controller, and an additional NIC with dual ports. You'll want to stack the two switches, or at least trunk between them and turn on spanning tree.
On the dual port NICs on each server, run one Gig-E cable to one switch, and the other cable to the other. Create a crossover cable and run it directly between the two servers to the onboard NICs (this will be used for DRBD). Download CentOS. Set the dual port NICs in an Active/Passive Mode. Download DRBD: http://www.drbd.org/ And create a disk partition that is mirrored between the two nodes with GFS running on top of DRBD (you now have shared storage, functionally similar to a SAN). Download KVM, or some other virtualization product and run your SQL server/.Net application as a virtual machine. Given your environment, it doesn't sound like you'll need high IOPS, and so this setup should work fine. You could use the clustering software that comes with CentOS, but in the case of a physical server failure, it would probably be easier to simply spin up the instance of your virtual machine manually on the 2nd server with the KVM software.
Granted, you could make things work seamlessly, and I've done that before, but it takes a lot more work than I'm guessing you would wish to spend on this project. By approaching things the way I mentioned, you could easily have a small list of directions that your wait staff (or shift manager) could follow in the event of a physical failure. It would be easy for them, and you could ensure that your business operates with excellent continuity. Your total expenditure for everything I mentioned should come to about $1,200 - $1,500 for the entire environment. Finally, in the event that Murphy's Law visits your establishment, be sure to have a backup strategy.
Now, everything that I discussed isn't especially easy, but there are TONS of resources if you spend some time on Google. And, it is definitely all very doable. My background is in designing highly available infrastructures internationally with budgets ranging from mom & pop small town establishments to multimillion dollar infrastructures.
Good luck, feel free to contact me at: ATheoryOfTruth "at" Gmail . Com
Cheers,
Andrew
How about 2 inexpensive hardware platforms + Linux (your choice of flavor) + VMware Server (free) + some Perl to monitor the VM heartbeat from the VMware API and a bit of logic to start/pause VMs across the two boxes? You could also share a filesystem with GFS2 + DRBD. Linux HA is also something you might want to look at. I'd also look at OpenNas for shared storage if you want to go that way.
Worked for me; I have 2 domain controllers + SCOM running like that.
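The "bit of logic" part might look something like this, sketched in Python rather than Perl. It is only the decision state machine: it assumes something else (the VMware API poll, a ping, whatever) tells it whether each heartbeat check succeeded, and the class name, threshold and action strings are all made up for illustration. Requiring several missed beats in a row avoids starting the standby VM because of one dropped packet.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Track consecutive missed heartbeats and decide when to start the standby VM."""

    max_missed: int = 3        # consecutive misses tolerated before failing over
    missed: int = 0
    failed_over: bool = False

    def record(self, beat_ok: bool) -> str:
        """Feed in one heartbeat result and get back the action to take."""
        if self.failed_over:
            # Once failed over, stay there until a human resets things.
            return "standby-active"
        if beat_ok:
            self.missed = 0
            return "ok"
        self.missed += 1
        if self.missed >= self.max_missed:
            self.failed_over = True
            return "start-standby"
        return "waiting"


if __name__ == "__main__":
    monitor = HeartbeatMonitor(max_missed=3)
    for beat in (True, False, False, True, False, False, False):
        print(beat, monitor.record(beat))
```

The caller would pause or power off the primary VM (if it can still be reached) before acting on "start-standby", so that two copies of the same guest never run at once.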
reg.