Sending Excess Load To the Cloud?
TristanBrotherton writes "Cloud computing seems to be a good choice for startups like ours, looking to scale easily with users. (We're providing a series of Web services, assets, and Web applications to users of our mobile client.) There are the obvious choices of Google, Amazon, and smaller shops like EngineYard. The biggest issue we have in choosing cloud computing to run our applications is trust in their robustness. If the provider goes down, we suffer. In traditional hosting environments we mitigate this with multiple sites / vendors. It's not really feasible to host on multiple compute services, so I wondered if a better option might be to set up a small (perhaps two servers) origin infrastructure in a traditional manner at a datacenter, running our applications, but then send excess load, or in the event of our origin servers failing, all load, to compute services. This would give us the best of both worlds. Has anyone done this, or had experience in designing Web applications to scale seamlessly across both environments? Is there particular load-balancing hardware we can use to do this?"
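To make the question concrete, here is a minimal sketch of the routing decision the submitter describes: serve from the origin servers while they are healthy, spill excess load to a compute service, and send all load there if the origin fails. The `Pool` type, the `split_load` function, and the capacity figures are hypothetical, made up for illustration; a real setup would drive this from load-balancer health checks rather than a function call.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    healthy: bool
    capacity: int  # requests/second this pool can absorb (assumed figure)


def split_load(demand: int, origin: Pool, cloud: Pool) -> dict[str, int]:
    """Return how many requests/second to route to each pool."""
    # The origin takes what it can while healthy, and nothing if it has failed.
    to_origin = min(demand, origin.capacity) if origin.healthy else 0
    remainder = demand - to_origin
    # Excess load (or all load, on origin failure) spills to the cloud pool.
    to_cloud = min(remainder, cloud.capacity) if cloud.healthy else 0
    return {
        origin.name: to_origin,
        cloud.name: to_cloud,
        "dropped": remainder - to_cloud,  # demand neither pool could take
    }
```

With a two-server origin rated at, say, 200 requests/second, a demand of 500 would put 200 on the origin and 300 on the cloud pool; with the origin marked unhealthy, all 500 would go to the cloud pool.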
Unless your "cloud" provider offers a service level guarantee with teeth, is contractually obligated to continue to provide the service for some period of time, and has sound financial fundamentals, this is risky.
I think we'll see a big shutdown of money-losing web services over the next year.
Please don't use it. Every time you use a buzz-phrase God kills a kitten.
http://www.zombieapocalypse.tv/
The answer is to not get tied to a single service provider. You need a cloud computing solution that is standards-based (formal or de facto) and that lots of providers support. And you have to be prepared to migrate your stuff if/when the industry moves on to the next version of the standard ... or the next "big leap forward" after cloud computing.
And that is all hard to do.
yup, im in slashdot alright...
If your sole consideration is application availability, then your idea might make sense. But since you said you don't trust the application hosting company's "robustness," do you still trust them to protect and secure your data adequately? In other words, if you don't trust your IT service supplier in one dimension, why would you still trust them in other service quality dimensions?
Have you thought about establishing a contract with a formal Service Level Agreement (SLA), including penalties and escrow (or equivalent) for non-performance? That would seem to be a much more straightforward and comprehensive way to establish "trust."
Break n' bake servers work out really well with Amazon's EC2. I've never had to use it for anything really critical but so long as you maintain a set of closely sync'd liveUSB and AMI images I don't see why you'd have a problem. Just make sure that your existing failover mechanisms automatically initiate the backup plan, notify you, and isolate your local system for forensics or repair, since a security breach that will take down your local system has a high likelihood of succeeding in the cloud.
It's very probable that none of these offerings will work well if your application integration is not aware of itself as a group of applications and services.
Think about it: Install Apache on one host. OK. Now, two hosts... Well, do you round robin DNS, or do you run a squid reverse proxy, do you buy something else...
Next, how are you going to monitor this monster, Nagios or OpenView... or something else.
How many people are responsible for this puppy?
Oh, yeah... And I'm just talking about static web hosting, you start having all kinds of fun when you want to track user sessions, etc.
My advice to you is to look into an Application Service Provider. Make them do all of the integration work.
If you can afford Internet connectivity to a pair of servers, you can probably afford an Application Service Provider.
If you are running a web-based, hosted financial application, outsourcing "to the cloud" is a non-starter. If you are hosting pictures of kitty cats, the cloud can be an excellent resource.
Throw a server up, upload some files, start up a PG or MySQL database, and integrity is easy. But as soon as you introduce the 2nd system, integrity issues start jumping out of the woodwork. It gets worse with each additional node. Redundancy isn't just fancy-sounding, it's damned hard to do right, and as soon as you introduce it, you have to accept an elevated error rate, because the number of things that can go wrong goes UP even as the number of catastrophic system failures drops.
For a great example of redundancy in action, take a look in the mirror. You have individual cells dying by the millions every minute. Your memory is fuzzy at best, your pattern-recognition in your brain frequently sees things that aren't there, and you make stupid mistakes every single day. And that's fine, because the overall system is pretty damned redundant and resilient. A mash of protein goo and calcium deposits able to sustain one of the most complex information systems around, reliably, 24x7, for an average of 70 years or so apiece.
Good luck getting any kind of hosting platform to maintain that kind of uptime, no matter the expense! But in biology, minor errors are so commonplace that they are hard to catalogue, let alone count.
So pick your battle, and realize that high-performance, high-redundancy clustering is very, very difficult to do well.
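The trade-off described above (more minor errors, fewer catastrophic failures) can be put in rough numbers. This is a minimal sketch that assumes every node fails independently with the same probability `p` in some time window; both the independence assumption and the 1% figure are mine, chosen for illustration, and real clusters usually have correlated failures.

```python
def any_node_fails(p: float, n: int) -> float:
    """Chance that at least one of n nodes fails (something needs attention)."""
    return 1 - (1 - p) ** n


def all_nodes_fail(p: float, n: int) -> float:
    """Chance that every one of n nodes fails at once (total outage)."""
    return p ** n


# Hypothetical per-node failure probability of 1% per time window.
P_NODE = 0.01

for n in (1, 2, 4, 8):
    print(f"{n} node(s): some failure {any_node_fails(P_NODE, n):.4f}, "
          f"total outage {all_nodes_fail(P_NODE, n):.2e}")
```

Under those assumptions the first column climbs with every node you add while the second collapses toward zero, which is the point: redundancy buys survival at the price of a steadier stream of small problems to manage.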
In the meantime, spend money on good quality hardware, and use top-notch colo hosting. The cost of doing it right is actually significantly lower than doing it "on the cheap" so spend money where it counts (good quality infrastructure) and save where it matters. (EG: public opinion) It's almost odd - if you look for the very, very best colo, regardless of cost, you'll find that their monetary cost is probably one of the lower ones around. (head scratcher) I've found this to be rather consistent with several reviews under my belt.
Also, I find it best to use whitebox systems with midrange hardware. These are quality, high-performance hardware developed with everything but the name brand. In my case, I've standardized on 1U multicore X86/64 systems with hot-swap, high performance 15k SCSI drives put out by Tyan and SuperMicro. There are a large number of dealers of such systems, my current favorite is Aberdeen Inc. They can sell you an amazing amount of performance and reliability for around $2500.
This is the stuff that Sun will sell you for $8,000/pop. They will stand up to day-in, day-out heavy use for years, with hundreds or thousands of users every day, millions of website hits per day, etc. They are high performance. This is quality hardware. And with the money you save, you can have an immediate hot backup for less than the cost of the "premium support" of the big guys, and more redundancy in the meantime.
My $0.02. Since it's free advice, you're free to use it as you see fit!
I have no problem with your religion until you decide it's reason to deprive others of the truth.
Distributed computing of any kind is complex and not something to be undertaken with no experience or assistance. Hire someone who knows their stuff to help you out. Begin with a business case, and don't be surprised if running your own cloud turns out not to be the way to go.
These posts express my own personal views, not those of my employer
Work out how to build the servers as cheaply as possible. If peak load starts to get troublesome, add some more servers.
Cloud computing is a buzzword. Big server farms may be dull, but they're a tried and tested technology that works. Ask Google.
Data availability - Data replication across multiple sites. Not new.
Data portability - How is this new? And please don't try to pretend that the use of IMAP by GMail is in any way innovative.
Resource expandability/shrinkability - This just isn't available outside of the mainframe world. Unless you've written software for some sort of funky cluster thing (unlikely) then the cloud is something like VMWare ESX, and the maximum expansion you're ever going to get is to the size/capabilities of a single one of the racks. AFAIK it's not possible at this time to have a single OS image (of a standard OS you program normally for) across multiple x86 machines.
Have you tried managing racks worth of servers in many locations?
That said, most cloud services today ARE very expensive. EC2, for example, can be trivially beaten with managed hosting, and in some cases totally crushed by maintaining your own servers.
What cloud services give you (and you pay through the nose for) is the ability to scale quickly. Trouble is, most people never need to scale that quickly.
Using clouds for "overflow" from a cheaper base setup is not a new idea, and it's definitely a good one, particularly since it allows you to cut it a lot closer with your base setup. Without overflow capacity elsewhere, you need enough extra capacity in your base setup to handle reasonable growth plus any spikes. With overflow capacity using a cloud service, you only need to handle enough of your daily traffic that whatever you end up using of the overflow capacity is cheaper than adding more servers to your base. As soon as it isn't, you add more servers.
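The break-even rule above is simple arithmetic. Here is a minimal sketch, assuming the overflow you are paying for could be absorbed by one additional base server; the function name and all prices are hypothetical, picked for illustration and not taken from any provider's price list.

```python
def should_add_base_server(overflow_hours_per_month: float,
                           cloud_cost_per_hour: float,
                           base_server_cost_per_month: float) -> bool:
    """True once monthly overflow spend reaches the cost of one more base server."""
    overflow_cost = overflow_hours_per_month * cloud_cost_per_hour
    return overflow_cost >= base_server_cost_per_month


# Hypothetical figures: $0.40 per cloud instance-hour, and $150/month for
# one more owned or leased server (hardware amortization plus hosting).
CLOUD_RATE = 0.40
BASE_SERVER_MONTHLY = 150.0

print(should_add_base_server(100, CLOUD_RATE, BASE_SERVER_MONTHLY))  # occasional spikes
print(should_add_base_server(400, CLOUD_RATE, BASE_SERVER_MONTHLY))  # sustained overflow
```

With those made-up prices, 100 overflow hours a month costs $40 and stays in the cloud, while 400 hours costs $160 and it becomes cheaper to grow the base.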
I'm surprised /. is posting this without referring to the Stallman interview that was all over the nerd sites like reddit yesterday. It is very relevant. You missed it? Come on guys, you're not always the fastest and I don't care, but this is a fail.
I thought about this problem myself for a while, when playing around with the idea of trying out the "cloud". You could use Pound; a lot of its use for cloud computing has been discussed on the interwebs already. The biggest point of concern will be whether the load balancer keeps your SSL data encrypted.
molmod.com - computing tips from a molecular modeling
Why not go for dedicated servers for each app?
Does this help if the provider's servers are out of your control? As the submitter said, "If the provider goes down, we suffer."
Seems to me that if you can't afford someone else's servers to go down, you run your own or find someone you trust to do so on your behalf. Simple as that.
I call myth. Cloud providers benefit heavily from economies of scale, which is something that you as a little startup simply can't match. On Amazon you can run a "midrange" server (i.e. an EC2 large instance) with plenty of traffic and a few hundred gigs of persistent storage for roughly $350/month. That is pretty close to the amount that elsewhere you'll pay monthly for some empty rackspace and a bit of traffic, without a single server in it.
As a rule of thumb you may assume circa $300/month for each additional server on amazon. Without upfront hardware-costs, without maintenance costs (as in fixing and replacing stuff if it's your own hardware), and with a provisioning-latency that is really hard to beat.
Yes, amazon is not "cheap" by any means. But it cannot be "trivially beaten" either.
Wow, first time I have ever been accused of reading a manual...