EC2 Vs. App Engine Vs. GoGrid Vs. AppNexus
snydeq writes "InfoWorld's Peter Wayner delves into the ill-defined realm of 'cloud computing,' providing a deeper look at four shared services: Amazon EC2, Google App Engine, GoGrid, and AppNexus. Offering wildly divergent amounts of hand-holding at various layers in the stack, the services simplify your workload but force you into a set, 'ball-and-chain-computing' routine that you may not prefer. Sure, the services allow you to pull CPU cycles from thin air whenever you need to, but they can't solve the deepest problems that make it hard for applications to scale gracefully, Wayner writes. He describes these 'clouds' as an evolving experiment, rife with potential but 'far from clear winners over traditional shared Web hosting.' The sobering look at the trend includes a QuickTime tour of each service — EC2, App Engine, GoGrid, AppNexus (those links all .MOV)."
Even after reading the Wikipedia article on cloud computing, I still can't give a good definition of it. I know the general concept, but if a non-tech person asked me to describe it, I'd give a blank stare.
Every geek has some sort of website, programming or computer project. Here's mine: www.youtasteit.com . What's yours?
I'd choose Google App Engine. Since no one really knows what cloud computing is, and no one knows what google does, I think they make a good fit.
oh wait. I do know what google does - It makes the internet better... and it prints money (I guess...)
In ten years, corporate data centers will be like COBOL is today. There will still be a lot of legacy data centers manned by dinosaurs. The cool kids, young and old, will be in the cloud.
you mean the definition of cloud computing is still cloudy?
Finally, a burst of common sense on the latest hype. Hosted servers have offered many of the benefits you get out of "cloud" computing for years, without locking you into a particular vendor or platform. With virtualization, you should be able to build your own images and farm them out to hosting companies, using your technology and platform of choice. Clustered ESX and SANs already give us the resource scalability we need for most systems; partitioning finishes the job. You can just pay a hosted server company to host your VMware image on their ESX cluster and scale up your storage as needed on their SAN. The key is that YOU build a scalable design.
I highly doubt a majority of businesses are going to lock themselves into one hosting provider's specific development platform just to take advantage of hosted servers that push themselves into the next layer.
Don't worry, multiple experts agree that there is a clear consensus that there is no real consensus on what cloud computing is.
I hope I didn't brain my damage.
Comparisons are OK, but let's look at reliability. EC2 is not the same as S3, but the recent fiasco with S3 and SQS should give people pause before considering using any other Amazon cloud services. Two of my clients were hit with this over the weekend.
I don't know what kinds of volumes (traffic and hosting) Google AE is handling right now, but at this point I think I would trust Google more than Amazon. One of the issues with the S3 downtime for many people was the fact that Amazon itself (and all its properties) continued to run perfectly while all the sites that hosted images and other content with them failed. Does Google use its own infrastructure to host AE? I don't know, but if they do I'd trust them a hell of a lot more than AWS.
At this point I'm thinking I'm not going to recommend AWS anymore.
Web2.0: I love when people Flickr my cuil and digg my boingboing until my google is reddit and I start to yahoo
Just get the EC2 and S3 plugins for Firefox and it's really easy to fire up instances and manage them. Sure, there's a learning curve, but once you really get it, it's awesome.
I'll take a stab at it, though, someone is bound to try and correct me.
I would categorize cloud computing as a derivative of grid computing, if you will. You throw some crap at the beast, but unlike grid computing, there can be many independent cells working completely disconnected from the rest, possibly even unaware of them or unable to communicate with one another.
Like the clouds in the sky, they don't need to be connected or aware of each other for it to rain.
"When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
AFAICT, they aren't intended to. The deepest problems are software problems for which there is no general solution, only problem-specific solutions for each particular task; what they are intended to deal with is the hardware problem that having a scalable software solution is of limited value if you have a fixed pool of hardware and have to go through disruptive upgrades when you expand that pool of hardware (and deal with the associated capital costs.)
Cloud computing services are, largely, tools to help dynamically "right-size" hardware, changing it from a capital investment that requires predicting the future well in order to plan right into an operating cost that can be quickly adjusted based on changing needs. Complaining that they don't solve the fundamental problems of software scalability seems to be missing the point.
I run a small startup in the Boston area and have been using Amazon EC2 (plus S3, SQS, and the rest of the AWS family) for the last year. It's worked for us like a champ. A little downtime in the beginning plus some S3 outages, but with the right backup, failover, and restore procedures in place we've gotten reasonable uptime.
The big requirements for us were the following:
1. Ability to move our website (and code base) elsewhere if needed. Could be in-house, to another cloud provider, etc.
2. Minimize up-front cost and allow for massive scaling if needed
3. Cost competitive servers/computing over time
4. Cost competitive storage/disk over time
App Engine fails the first criterion, since (at least currently) you can't build a BigTable application on anything but Google App Engine. "Cloud computing" in general beat out traditional hosting on the second, third, and fourth points. I hadn't checked out GoGrid or AppNexus at the time, but other competitors (Sun, etc.) couldn't match Amazon's price-performance specs.
So, with all of those requirements, Amazon EC2 won out and I'm a happy customer.
I'm not familiar with all of them, but Amazon's service doesn't "spin up more servers to handle demand" by any stretch of the imagination (unlike what the name implies). You'd have to build an application that does this. Sure, it makes ordering and setting up new servers easy, but it still has to be done by your program. With Google's system, there is no need to even worry about scaling up, because it just looks like one system. Unfortunately, Google's system is way too limited for anything but customized, simple db apps. I can't wait for it to expand its feature set.
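To make the "it still has to be done by your program" point concrete, here is a minimal sketch of the scaling decision an application on EC2 would have to make for itself. Everything named here is hypothetical (`desired_instance_count`, `scaling_action`, the per-instance capacity, the min/max bounds), and the actual calls that would launch or terminate instances are deliberately left out.

```python
import math


def desired_instance_count(requests_per_sec, capacity_per_instance,
                           min_instances=1, max_instances=20):
    """Return how many instances we would want for the current load.

    All parameters are illustrative assumptions, not real EC2 settings.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    # Clamp to the configured floor and ceiling.
    return max(min_instances, min(max_instances, needed))


def scaling_action(current_instances, requests_per_sec, capacity_per_instance):
    """Positive result: launch that many. Negative: terminate that many."""
    target = desired_instance_count(requests_per_sec, capacity_per_instance)
    return target - current_instances
```

Your own monitoring loop would call something like `scaling_action` every so often and then ask the provider for more (or fewer) machines; the provider only makes the ordering part easy.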
-- these are only opinions and they might not be mine.
The cost analysis was really what did it versus our managed hosting plan (1/10th the cost per month). Auto scaling and healing of the application cluster was also a benefit. To scale with a traditional host meant getting locked into a contract for the added server(s).
One thing about ec2 is that it forces you to use best practices for disaster recovery. Instances don't commonly just "disappear" but you need to plan for it. Well tuned ec2 images can have your site up and restored from backup automatically within minutes.
ec2 / s3 is far from perfect and certainly won't meet everyone's needs. The downtime s3 has seen (like last weekend) would be devastating to some businesses. Of course, even with a traditional host you may have downtime due to a truck crash or some other random act.
Sometimes my arms bend back.
I'll get modded down by all the grammar nazi nazis, but it has to be said: Correct capitalization of the abbreviation "vs." would have made the title of this article much more readable. But that would have required editing.
So you're saying you fail Rowell's Extension to Einstein's Test of Comprehension? Sad, sad day.
Einstein: You do not really understand something unless you can explain it to your grandmother.
Rowell's Extension: You totally don't understand something if you can't even explain it to a bunch of geeks. Also, they'll probably laugh at you.
I see a ton of value in "cloud" computing ... but in some cases, I'm not 100% certain what the difference between a cloud and a classic farm or cluster really is.
I have a simple public-facing SOA call that needs to scale to hundreds of thousands of calls per second, with automatic failover and preferably automatic scaling. GAE gives me some of that; EC2 gives me almost none of that, without something like RightScale.
I've talked to the AppNexus people a bit ... not as cheap to get into, higher performance than GAE ...
GoGrid doesn't seem like anything special. Rackspace seems to be the same thing.
We have a SaaS-aaS startup, Apprenda, here in the Albany area ... curious to learn more about them, but they don't appear to reply to email inquiries.
There's a cloud computing panel next week, Thursday the 29th, in NYC, worth checking out if there are still openings.
I am, therefore you think.
"Cloud Computing" : Definition hazy, with a 30% chance of being understood.
...any Web site filled with an endless stream of mostly forgettable comments trolling for reactions from the rival fans
I can't think of any site to fit that description...
Need to type accents and special characters in Windows? Use FrKeys
I think the best way I've heard it explained is:
"When details of implementation are sufficiently hidden away that you no longer have to think about them, people often draw a 'cloud' around it, just like you do with the internet where (most of us) don't have to worry about all the wires and the protocols but it's just there, and it just works.."
Cloud computing is trying to draw the same cloud around.. computing (resources), you don't have to worry about connectivity, electricity, how to make db's and file systems scale across systems.. it's an abstract cloud that's just there without having to worry about it.
This is a new hype name for gridRPC.
I tried to set up an app on Google's AppEngine, and got an error saying that they're out of space. They'll email me when space is available.
That somewhat deflates the promise of great scalability, etc.
Well, after reading the wiki on "cloud computing", I think p2p is better because it isn't controlled.
It's something to do with selling books, right?
It's cumulus all the way down
Caesar si viveret, ad remum dareris.
First time I heard a vendor use it was a week ago.. Now it's everywhere.
---- Booth was a patriot ----
Until seeing this article I had no knowledge of the existence of either GoGrid or AppNexus. After attending 3 different talks/sessions about Amazon Web Services among TSSJS & JavaOne confs, I had begun to play with AWS. I really like EC2 & S3. I'm still trying to get my mind around the concept of "eventual consistency", which the speaker of 2 of these talks told me to look into. Presumably if you use AWS, you must architect your applications to tolerate eventual consistency. Once I understand the other AWS services (SimpleDB, SQS, etc.) well in the context of eventual consistency, I think I'd be able to see how to make good use of Amazon in real-world production. There was also a presentation by a company that uses AWS in production, covering what they use, how they do things, and the tradeoffs. The specific example had to do with news media and the need to encode a video clip into various quality, resolution, and format variations for distribution on the media website and in the footage archive. Naturally, lots of video encoding lends itself well to cloud computing, both for on-demand processing capability and for storage of all the generated artifacts. Pretty cool stuff!
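For anyone else trying to get their head around "eventual consistency", a toy model may help. This is only a sketch under loud assumptions: the class and function names are made up, there is a single primary and a single replica, and nothing here reflects how SimpleDB or S3 is actually implemented. The point is just that a read right after a write can return stale data, so the application has to tolerate that.

```python
class EventuallyConsistentStore:
    """Toy model: writes land on a primary and reach the replica later."""

    def __init__(self):
        self._primary = {}
        self._replica = {}
        self._pending = []  # writes not yet copied to the replica

    def write(self, key, value):
        self._primary[key] = value
        self._pending.append((key, value))

    def read(self, key):
        # Reads are served by the replica, so they may be stale.
        return self._replica.get(key)

    def replicate(self):
        # Stand-in for background propagation between nodes.
        for key, value in self._pending:
            self._replica[key] = value
        self._pending.clear()


def read_until(store, key, expected, max_attempts=5):
    """Retry a read, tolerating staleness; return the attempt that succeeded."""
    for attempt in range(1, max_attempts + 1):
        if store.read(key) == expected:
            return attempt
        # A real client would wait and retry; here we trigger propagation
        # by hand so the example is deterministic.
        store.replicate()
    return None
```

In the video-encoding example, that would mean a worker should not assume a freshly written "clip is encoded" record is visible to every reader immediately; it either retries or is designed so that a stale read is harmless.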
Well, the article was pretty superficial. In my opinion, the "new" part of cloud computing is the distributed storage interface that is completely transparent with regard to its implementation. Think memcached, but without having to list the servers. So, Google BigTable and Amazon's SimpleDB are what we are really talking about here. The rest is just hosting + dynamic provisioning + a few useful SOAs or APIs tossed in.
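To show what "having to list the servers" means, here is a simplified sketch of client-side key distribution in the style of a memcached setup. The function name `pick_server`, the host names, and the hash-modulo scheme are all assumptions for illustration; real memcached clients use their own (often more sophisticated) hashing.

```python
import hashlib


def pick_server(key, servers):
    """Map a key to one of the servers the client was configured with."""
    if not servers:
        raise ValueError("the client needs at least one server")
    # Hash the key and use the result to index into the server list.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


# Hypothetical server list that the client has to carry around itself.
SERVERS = ["cache1:11211", "cache2:11211", "cache3:11211"]
chosen = pick_server("user:42", SERVERS)
```

With something like BigTable or SimpleDB the equivalent of `SERVERS` never appears in your code; which machine holds which key is the provider's problem.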
And here is the cheat sheet the article should have included:
EC2:
1. You can use your existing LAMP stack.
2. Scales by explicitly adding servers
3. Load balancing is an explicit service
4. Clustered database store is BASE, not ACID compliant
5. Generally more flexible but leaves many problems to the developer
6. More appropriate for legacy applications
GAE:
1. You must write in python and store data in the GAE datastore
2. Scales transparently
3. Load balancing is transparent
4. Clustered database store is ACID compliant, but scope is limited to "Entity Groups"
5. Generally less flexible but with the tradeoff of saving work for the developer
6. More appropriate for new applications
On GoGrid... they bill you repeatedly for grid you have not used, then refuse to refund your money. They have a "pay as you go" plan: you pay, you pay, you pay, but nothing goes!!
"My immediate reaction is "WTF? What kind of moron doesn't make things 64-bit safe to begin with?" Linus
Seriously? Cloud computing is rife with potential?
I make my share of grammar mistakes, but I still cringe at some of the vocabulary abuse that makes it through on this site.
From the way cloud computing is implemented with EC2, my basic definition of cloud computing is that it is "On Demand" computing. Unlike the Internet, which is a series of tubes, cloud computing is like a bunch of trucks that you just dump stuff on. If your servers get too busy, just dump the work on more trucks.
For instance, it allows a small web site to survive the slashdot effect by starting up a dozen servers for a few hours or days, costing much less than having a dozen servers running 365.2425 days a year.
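A back-of-the-envelope version of that comparison, as a sketch. The function names are made up, and any hourly rate you plug in is your own assumption; check the provider's current price list before trusting the numbers.

```python
HOURS_PER_YEAR = 365.2425 * 24  # same year length as in the comment above


def always_on_cost(servers, hourly_rate):
    """Cost of keeping `servers` machines running for a whole year."""
    return servers * hourly_rate * HOURS_PER_YEAR


def burst_cost(baseline_servers, extra_servers, burst_hours, hourly_rate):
    """Cost of a small always-on baseline plus extra machines for a burst."""
    baseline = always_on_cost(baseline_servers, hourly_rate)
    return baseline + extra_servers * burst_hours * hourly_rate
```

For example, with a hypothetical rate of 0.10 per server-hour, compare `always_on_cost(12, 0.10)` against `burst_cost(1, 11, 72, 0.10)`: one server all year plus eleven extra for three days comes out at roughly a tenth of running a dozen year-round.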
Additional benefits are much higher availability and connectivity. Anyone can sign up for a cloud computing account and have a virtual server colocated at an Amazon or Google datacenter.
Beta indeed.
I decided to sign up for the GoGrid Public Beta $50 Trial.
The first page asks for your email address, a password, and a pre-filled promo code.
The next page asks for your personal information and CC details for billing past the free credit.
All this information was filled out correctly, but their 3rd party merchant biller failed to process the details and returned an error. This may have been a glitch, or it is possible that the biller does not support non-US transactions.
In a second attempt to sign up, I was told that my email address was already registered and was subsequently denied. So I tried signing in, which failed.
I look forward to trialling this service when these simple but show-stopping creases are ironed-out.
--
There is a subtle, but important, difference between peeing in the pool and peeing into the pool
[Rent This Space]
One thing I wonder about is whether these cloud services will suffer the same problems as other centralized infrastructure installations, such as the power grid.
Presumably Amazon has some very high but finite number of physical servers supporting EC2. What happens when (just for example) Christmas comes around and there are huge spikes in activity for specific hours of the day as people do last-minute Christmas shopping?
When across the board a large number of their customers suddenly allocate dozens more instances, are Amazon going to be able to meet that demand? Do they really have enough servers sitting idle to magically allocate enough to meet peak demand at any one time of the year?
It will be interesting to see if any such events arise.
One thing that is interesting to note is that Google, Amazon (and AppNexus, I think) do NOT offer Windows machines by the slice. Now, on the off chance that you are looking for a cloud solution that requires Windows tools, and don't want to go with Wine or a port, GoGrid might be your provider of choice, until MS has their own offering, or others step up.
you mean the definition of cloud computing is still cloudy?
My Magic 8-Ball(tm) says the outlook of cloud computing is cloudy.
The problem with Google app engine "quota" based free account:
[excerpted from longer article] ...
The problem is that they can't tell the difference between an infinite loop and a routine (in your code) that will eventually terminate (on its own.) So they give you a quota to use in a time window of a day or a month (however they've decided to define the quota window) and your app can burn as many cpu cycles at any given time, i.e. burst, up to the remaining quota for the given time window (hopefully, without depleting all your quota before the end of the time window.)
For a "massively scalable" cloud computing solution it is very dumb for Google to offer a quota-based "free" account (that maxes out and leaves you hanging) unless they are working on a way to examine your code for you (before you run it) for potential infinite loops and badly designed queries.
This may sound banal, but the reason for enforcing a quota for the free account is that they want to give you a limited (not infinite) number of free cpu cycles. The problem with the quota, however, is that once you deplete it, you will have to wait for the next quota window, e.g. 12 hours, a day or a month.
The way to handle this practically is for Google to do away with the quota model (e.g. bill you for all usage) or come up with sophisticated statistical techniques to examine your code for potential infinite loops or costly queries (remember that Turing already proved that you can't always deduce that logically) before they run it. Without such an ability, applied in a reliable manner, the only thing they can do when your app inadvertently hits the quota for the given time window is to disable your application until the start of the next quota window, which totally sucks.
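The quota behaviour described above can be modelled in a few lines. This is a toy sketch only: the class name `QuotaWindow`, the budget, and the window length are invented for illustration and are not App Engine's actual quota implementation or limits.

```python
class QuotaWindow:
    """Toy model of a per-window CPU budget; all numbers are made up."""

    def __init__(self, budget_cpu_seconds, window_seconds):
        self.budget = budget_cpu_seconds
        self.window = window_seconds
        self.window_start = 0.0
        self.used = 0.0

    def _roll_window(self, now):
        # Start a fresh window (and a fresh budget) once the old one expires.
        if now - self.window_start >= self.window:
            elapsed_windows = int((now - self.window_start) // self.window)
            self.window_start += elapsed_windows * self.window
            self.used = 0.0

    def try_consume(self, cpu_seconds, now):
        """Return True if the request may run, False if the app is over quota."""
        self._roll_window(now)
        if self.used + cpu_seconds > self.budget:
            return False
        self.used += cpu_seconds
        return True
```

The app can burst as hard as it likes early in the window, but once `try_consume` starts returning False it stays that way until the window rolls over, whether the cycles went to real traffic or to a runaway loop.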
**********
Is this an obvious Google Duh! moment?
More: http://evolvingtrends.wordpress.com/2008/07/23/google-app-engine-threat-or-opportunity/
If you have a lot of servers already, and would like the scalability of "cloud computing" where you easily add more cores/ram/disk, check out this demo by 3tera.com.
I'm sure it is expensive, and I've not talked to them yet, but it sure would be great to "draw out" a new Dev or QA environment when you need one. Then when the project is complete, you can recycle those resources back into the cloud. If your Production system needs more cores, simply add them.
[1] Amazon continues to have stability problems. An 8 hour outage for S3 - a service that's supposed to provide 99.99% uptime? It seems like Amazon is still working through some of the problems associated with scaling on a massive level. Google has already solved those problems.
[2] If your user base is not all in the US, I don't think Amazon can compete in any way with Google's network of data centers.
[3] Amazon AWS has a different pricing model and approach than App Engine, but I think Google will ultimately be cheaper. If you're only using EC2 then it might be harder to call. But certainly if you add in S3 + SimpleDB, Amazon looks to be more expensive.
[4] Amazon provides a platform that allows you to scale, but does not do all the scaling for you. If you just want to create an app and concentrate 100% on features, App Engine wins hands down.
[5] I personally prefer the App Engine environment - Python rocks, the console rocks, it's easy to write code, it's easy to deploy, it's easy to roll back to previous versions etc. I still find it hard to believe that Amazon doesn't have better tools for their services. There's a bunch of great tools created by the community for AWS. But it seems like at every level Google tried to make it as simple as possible to deploy massively scalable apps. Amazon provides some services and hopes the community does the rest.
All four services reviewed operate out of US data centers. That is a serious issue for overseas users of the cloud, both in terms of network latency/bandwidth and data jurisdiction (cf. Canada's position on information sent across borders).
EU users may want to consider ElasticHosts or FlexiScale - both of which are UK-based.
'the cloud' is old networking/telephony terminology. Describing interconnection of two sites, you'd diagram the systems at either end, and their local links, but once the links enter the network you don't know or care how the routing happens (generally). This part of the network was 'the cloud' (and was diagrammed as a cloud).
By inference, cloud computing would be where you know the computation is happening somewhere on the network, but you neither know nor care exactly where.
See this thread back in 1995 -
http://groups.google.co.uk/group/bit.listserv.techwr-l/browse_thread/thread/d6384bd640275c43/14da0963ed1c294a?hl=en%0Eda0963ed1c294a
Or the first diagram in RFC 1587 (1994):
http://rfc.dotsrc.org/rfc/rfc1587.html
I joined a telecoms company the year before that and the term was in use there, can't vouch for earlier.
From reading the article (sorry about that) I get the impression the author doesn't really understand Google's AppEngine offering.
Yes, these services let you pull more CPU cycles from thin air whenever demand appears, but they can't solve the deepest problems that make it hard for applications to scale gracefully
AppEngine does exactly that (or at least tries to). In order to do so, it takes away many features that you might consider essential, and forces you to organise your code and your database in very specific ways. But if you can accept all of these limitations, and learn to work with them rather than against them, your application automatically becomes unbelievably scalable.
This is all IMHO of course, as a developer who's got into playing with AppEngine in his spare time simply because the whole thing's so damned cool. I'm no expert yet, I'm not doing this as a job, and I've no experience of any of the other services mentioned, so take my opinions with a grain of salt.
Cloud computing? Lots of load on the server, better make it a beowulf cluster... Oh wait, why not get a bunch of PCs running Plan9? If you have gigabit ethernet, it'll automagically include a big honkin RAID array, and it'll be multicore. The separate CPUs don't have to be powerful.
I know full well that tobacco is bad for you, so I smoke weed with crack
While this may over-simplify the Cloud Computing definition, I tried to explain and categorize the different segments within with the idea of the "Cloud Pyramid." You essentially have 3 segments: Cloud Applications, Cloud Platforms and Cloud Infrastructure. It's all broken down here.
Technology Evangelist for GoGrid/ServePath Twitter: http://www.twitter.com/hightechdad
By that definition, EC2 is not cloud computing. It's not a "just works" kind of system. Amazon uses the word cloud as an alternative to group or set. I.e. they have a group of computer instances that you can use. If you need more power, you can request it. If you have more power than you need, you can release some back.
EC2 is a platform on which someone could build such an abstraction service (for example, someone implemented the Google App Engine API on EC2). EC2 itself does not provide that abstraction.