Sending Excess Load To the Cloud?
TristanBrotherton writes "Cloud computing seems to be a good choice for startups like ours, looking to scale easily with users. (We're providing a series of Web services, assets, and Web applications to users of our mobile client.) There are the obvious choices of Google, Amazon, and smaller shops like EngineYard. The biggest issue we have in choosing cloud computing to run our applications is trust in their robustness. If the provider goes down, we suffer. In traditional hosting environments we mitigate this with multiple sites / vendors. It's not really feasible to host on multiple compute services, so I wondered if a better option might be to set up a small (perhaps two servers) origin infrastructure in a traditional manner at a datacenter, running our applications, but then send excess load, or in the event of our origin servers failing, all load, to compute services. This would give us the best of both worlds. Has anyone done this, or had experience in designing Web applications to scale seamlessly across both environments? Is there particular load-balancing hardware we can use to do this?"
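For what it's worth, the routing rule the submitter describes (origin first, cloud for excess load, cloud for everything if the origin fails) can be sketched in a few lines. This is only an illustration of the decision logic, not any particular load balancer's behavior. The `Pool` class, the `choose_pool` function, and the idea of measuring load as in-flight requests are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """Hypothetical view of one group of servers as a balancer might see it."""
    name: str
    healthy: bool   # result of the most recent health check
    active: int     # requests currently in flight
    capacity: int   # in-flight requests the pool can take before spilling over


def choose_pool(origin: Pool, cloud: Pool) -> str:
    """Pick where the next request goes: origin first, cloud as overflow/failover."""
    # Normal case: the origin is up and still has headroom.
    if origin.healthy and origin.active < origin.capacity:
        return origin.name
    # Origin is full or down: send the request to the cloud pool.
    if cloud.healthy:
        return cloud.name
    # Cloud is down too; an overloaded origin is still better than nothing.
    if origin.healthy:
        return origin.name
    raise RuntimeError("no healthy pool available")


if __name__ == "__main__":
    origin = Pool("origin", healthy=True, active=200, capacity=200)
    cloud = Pool("cloud", healthy=True, active=0, capacity=10_000)
    print(choose_pool(origin, cloud))
```

A real deployment would also need shared or replicated data behind both pools, which is the hard part several replies point out.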
Unless your "cloud" provider offers a service level guarantee with teeth, is contractually obligated to continue to provide the service for some period of time, and has sound financial fundamentals, this is risky.
I think we'll see a big shutdown of money-losing web services over the next year.
The answer is to not get tied into a single service provider. You need a cloud computing solution that is standards-based (formal or de facto) and that lots of providers are supporting. And you have to be prepared to migrate your stuff if/when the industry moves on to the next version of the standard ... or the next "big leap forward" after cloud computing.
And that is all hard to do.
If your sole consideration is application availability, then your idea might make sense. But since you said you don't trust the application hosting company's "robustness," do you still trust them to protect and secure your data adequately? In other words, if you don't trust your IT service supplier in one dimension, why would you still trust them in other service quality dimensions?
Have you thought about establishing a contract with a formal Service Level Agreement (SLA), including penalties and escrow (or equivalent) for non-performance? That would seem to be a much more straightforward and comprehensive way to establish "trust."
If you are running a web-based, hosted financial application, outsourcing "to the cloud" is a non-starter. If you are hosting pictures of kitty cats, the cloud can be an excellent resource.
Throw a server up, upload some files, start up a PG or MySQL database, and integrity is easy. But as soon as you introduce the 2nd system, integrity issues start jumping out of the woodwork. It gets worse with each additional node. Redundancy isn't just fancy-sounding, it's damned hard to do right, and as soon as you introduce it, you have to accept an elevated error rate because the number of things that can go wrong goes UP, even as the number of catastrophic system failures drops.
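A rough way to see why both trends hold at once is a toy probability model. It assumes every node fails independently with the same probability `p` over some period, which real systems do not guarantee, and it says nothing about the data-integrity problems themselves. The function names and the 1% figure are made up for illustration.

```python
def any_node_failing(p: float, n: int) -> float:
    """Chance that at least one of n nodes fails (something needs attention)."""
    return 1 - (1 - p) ** n


def all_nodes_failing(p: float, n: int) -> float:
    """Chance that every one of n nodes fails (a total outage)."""
    return p ** n


if __name__ == "__main__":
    p = 0.01  # hypothetical per-node failure probability for the period
    for n in (1, 2, 3, 5):
        print(n, any_node_failing(p, n), all_nodes_failing(p, n))
```

Under those assumptions, each added node raises the odds that something is broken at any given time while shrinking the odds that everything is.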
For a great example of redundancy in action, take a look in the mirror. You have individual cells dying by the millions every minute. Your memory is fuzzy at best, your pattern-recognition in your brain frequently sees things that aren't there, and you make stupid mistakes every single day. And that's fine, because the overall system is pretty damned redundant and resilient. A mash of protein goo and calcium deposits able to sustain one of the most complex information systems around, reliably, 24x7, for an average of 70 years or so apiece.
Good luck getting any kind of hosting platform to maintain that kind of uptime, no matter the expense! But in biology, minor errors are so commonplace that they are hard to catalogue, let alone count.
So pick your battle, and realize that high-performance, high-redundancy clustering is very, very difficult to do well.
In the meantime, spend money on good quality hardware, and use top-notch colo hosting. The cost of doing it right is actually significantly lower than doing it "on the cheap" so spend money where it counts (good quality infrastructure) and save where it matters. (EG: public opinion) It's almost odd - if you look for the very, very best colo, regardless of cost, you'll find that their monetary cost is probably one of the lower ones around. (head scratcher) I've found this to be rather consistent with several reviews under my belt.
Also, I find it best to use whitebox systems with midrange hardware. These are quality, high-performance hardware developed with everything but the name brand. In my case, I've standardized on 1U multicore x86-64 systems with hot-swap, high performance 15k SCSI drives put out by Tyan and SuperMicro. There are a large number of dealers of such systems; my current favorite is Aberdeen Inc. They can sell you an amazing amount of performance and reliability for around $2500.
This is the stuff that Sun will sell you for $8,000/pop. They will stand up to day-in, day-out heavy use for years, with hundreds or thousands of users every day, millions of website hits per day, etc. They are high performance. This is quality hardware. And with the money you save, you can have an immediate hot backup for less than the cost of the "premium support" of the big guys, and more redundancy in the meantime.
My $0.02. Since it's free advice, you're free to use it as you see fit!
Work out how to build the servers as cheaply as possible. If peak load starts to get troublesome, add some more servers.
Cloud computing is a buzzword. Big server farms may be dull, but it's a tried and tested technology that works. Ask Google.
Data availability - Data replication across multiple sites. Not new.
Data portability - How is this new? And please don't try to pretend that the use of IMAP by GMail is in any way innovative.
Resource expandability/shrinkability - This just isn't available outside of the mainframe world. Unless you've written software for some sort of funky cluster thing (unlikely) then the cloud is something like VMware ESX, and the maximum expansion you're ever going to get is to the size/capabilities of a single one of the racks. AFAIK it's not possible at this time to have a single OS image (of a standard OS you program normally for) across multiple x86 machines.
Have you tried managing racks worth of servers in many locations?
That said, most cloud services today ARE very expensive. EC2, for example, can be trivially beaten with managed hosting, and in some cases totally crushed by maintaining your own servers.
What cloud services give you (and you pay through the nose for) is the ability to scale quickly. Trouble is, most people never need to scale that quickly.
Using clouds for "overflow" from a cheaper base setup is not a new idea, and it's definitely a good one. Particularly since it allows you to cut it a lot closer with your base setup. Without overflow capacity elsewhere, you need enough extra capacity in your base setup to handle reasonable growth plus any spikes. With overflow capacity using a cloud service, you only need to handle enough of your daily traffic that whatever you end up using of the overflow capacity is cheaper than adding more servers to your base. As soon as it isn't, you add more servers.
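To make that rule concrete, here is a minimal sketch that prices a month of traffic for different base sizes and picks the cheapest. It assumes demand can be expressed in whole "server units" per hour and that cloud capacity is billed per server-hour. The function names and every number in the example are hypothetical.

```python
import math


def monthly_cost(hourly_load, base_servers, base_monthly, cloud_hourly):
    """Fixed cost of the base servers plus cloud server-hours for the overflow.

    hourly_load: demand for each hour of the month, in server units.
    """
    overflow_hours = sum(
        max(0, math.ceil(load) - base_servers) for load in hourly_load
    )
    return base_servers * base_monthly + overflow_hours * cloud_hourly


def cheapest_base_size(hourly_load, base_monthly, cloud_hourly):
    """Number of base servers that minimizes the total monthly cost."""
    peak = math.ceil(max(hourly_load))
    return min(
        range(peak + 1),
        key=lambda n: monthly_cost(hourly_load, n, base_monthly, cloud_hourly),
    )


if __name__ == "__main__":
    # Hypothetical month: 700 quiet hours needing 2 servers, 20 spike hours needing 6.
    load = [2] * 700 + [6] * 20
    n = cheapest_base_size(load, base_monthly=150, cloud_hourly=0.5)
    print(n, monthly_cost(load, n, base_monthly=150, cloud_hourly=0.5))
```

Re-running this as traffic grows is the "as soon as it isn't, you add more servers" step: when the cheapest base size goes up, it is time to buy.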
"Timesharing service" is available.
Why not go for dedicated servers for each app?
Does this help if the provider's servers are out of your control? As the submitter said, "If the provider goes down, we suffer."
Seems to me that if you can't afford someone else's servers to go down, you run your own or find someone you trust to do so on your behalf. Simple as that.
I call myth. Cloud providers benefit heavily from economies of scale, which is something that you as a little startup simply can't do. On Amazon you can run a "midrange" server (i.e. an EC2 large instance) with plenty of traffic and a few hundred gigs of persistent storage for roughly $350/month. That is pretty close to the amount that you'll pay elsewhere each month for some empty rackspace and a bit of traffic, without a single server in it.
As a rule of thumb you may assume circa $300/month for each additional server on amazon. Without upfront hardware-costs, without maintenance costs (as in fixing and replacing stuff if it's your own hardware), and with a provisioning-latency that is really hard to beat.
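One way to test the "trivially beaten" claim against these figures is a break-even calculation. The sketch below assumes a flat monthly rate on both sides and ignores maintenance and staff time. The $300/month cloud figure is from this comment and the $2500 hardware price is the whitebox figure quoted elsewhere in the thread; the per-server colo rates are pure assumptions.

```python
import math


def break_even_months(hardware_upfront, owned_monthly, cloud_monthly):
    """First whole month at which cumulative owned cost is no higher than
    cumulative cloud cost, or None if owning never catches up."""
    saving = cloud_monthly - owned_monthly
    if saving <= 0:
        return None  # owning costs at least as much every month, so no break-even
    return math.ceil(hardware_upfront / saving)


if __name__ == "__main__":
    # Hypothetical colo cost of $100/month per server.
    print(break_even_months(2500, owned_monthly=100, cloud_monthly=300))
    # Colo at the ~$350/month rackspace figure: owning never pays off.
    print(break_even_months(2500, owned_monthly=350, cloud_monthly=300))
```

The answer swings entirely on what the colo actually charges per server, which is probably why the two sides of this argument disagree.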
Yes, amazon is not "cheap" by any means. But it cannot be "trivially beaten" either.
Wow, he's not taking his meds today. Gmail is a trap? How? It offers email that you can access with IMAP. Keep a copy and you can just switch later down the road - even run IMAP locally.
I also don't understand his mantra of "keeping information in your own hands". I'd contend that some of these big outfits know more about security and reliability than it might be possible to afford as a little guy. So long as the solution is relatively standard and portable across providers, I don't see an issue.
"If you use a proprietary program or somebody else's web server, you're defenceless. You're putty in the hands of whoever developed that software."
See, that's only true if you can't get your data off in a standard format. There are certainly web apps where his warning applies, but this statement needs to be qualified or it is FUD. In particular, picking on Google apps is fairly counter-productive. Their "cloud" is pretty proprietary, though based on open standards so this is already changing... Gmail has IMAP, Google Apps all export to many open formats, Google Calendar exports to at least iCal format. What the devil is he getting at?