Slashdot Mirror


Amazon's Cloud May Provision 50,000 VMs a Day

Dan Jones writes "It has been estimated that Amazon Web Services is provisioning some 50,000 EC2 server instances per day, or more than 18 million per year. But that may not be entirely accurate. A single Amazon Machine Image (the virtual machine) may be launched multiple times as an EC2 instance, thereby indicating that the true number of individual Amazon servers may be lower, perhaps much lower, than 50,000 per day. So, even if it's out by a factor of 10, that's still 1.8 million VMs per year. Is that sustainable? By way of comparison, in February of this year, Amazon announced S3 contained 40 billion objects. By August, the number was 64 billion objects. This indicates a growth of 4 billion S3 objects per month, or about 133 million new S3 objects per day. How big can the cloud get before it starts to rain?"
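
A quick check of the arithmetic in the summary, as a sketch rather than anything Amazon has published. The 30-day month is an assumption made here to reproduce the "133 million per day" figure; every other number is taken from the text above.

```python
# Figures quoted in the summary above.
instances_per_day = 50_000
days_per_year = 365

instances_per_year = instances_per_day * days_per_year    # "more than 18 million"
# If the estimate is too high by a factor of 10:
discounted_per_year = instances_per_year // 10             # roughly 1.8 million

# S3 object counts: 40 billion in February, 64 billion in August.
s3_feb = 40_000_000_000
s3_aug = 64_000_000_000
months_elapsed = 6  # February to August

s3_growth_per_month = (s3_aug - s3_feb) // months_elapsed  # 4 billion
days_per_month = 30  # assumption: a flat 30-day month
s3_growth_per_day = s3_growth_per_month / days_per_month   # about 133 million

print(f"EC2 launches per year:       {instances_per_year:,}")
print(f"...if off by a factor of 10: {discounted_per_year:,}")
print(f"S3 growth per month:         {s3_growth_per_month:,}")
print(f"S3 growth per day:           {s3_growth_per_day:,.0f}")
```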

32 of 122 comments

  1. tag: Dumbquestion by drinkypoo · · Score: 3, Insightful

    How big can the cloud get before it starts to rain?

    Clouds don't work like that; they let go of their rain when they enter a pressure zone where they can no longer hold water.

    If Amazon is centrally dispatching, then they deserve to fail. If not, then there's no reason why getting larger would necessarily cause any particular problem.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    1. Re:tag: Dumbquestion by Enry · · Score: 2, Interesting

      This. Maybe instead of atmospheric clouds, they're talking about the Oort Cloud.

    2. Re:tag: Dumbquestion by Enry · · Score: 2, Funny

      If EC2 has the same uptime as bits of that cloud destroying life on earth, I think it'll be around for a while.

      And if one does hit us, I guess it won't matter anyway.

  2. Please stop... by broken_chaos · · Score: 4, Insightful

    Cloud is bad enough. Starting up bullshit analogies with clouds and rain just muddies whatever you're talking about far, far more than is necessary.

    1. Re:Please stop... by Josh04 · · Score: 5, Funny

      I agree, the rain does muddy the waters somewhat. Not to mention the flood of comments deriding it as such.

    2. Re:Please stop... by suso · · Score: 3, Funny

      Oh, stop raining on everyone's parade.

    3. Re:Please stop... by smaddox · · Score: 2, Funny

      I still remember when AOL signed up too many customers, and the result was a service that was slow and unresponsive.

      Yeah, I remember their grand opening, too.

    4. Re:Please stop... by moon3 · · Score: 5, Insightful

      Managers love this kind of terminology because, from their point of view, the Internet just 'happens' somehow. They do not have a real clue how, but the cloud fits perfectly into this kind of thinking. That is why cloud hosting is so popular: they just order a 4GB/100Mbit/s cloud and the hosting company creates one for them. They do not have to worry about setting up DNS, SQL databases, multiple servers, domains, or SMTP, or get schooled by some lowlife nerdy IT guys. They understand the dumbed-down cloud interface well enough themselves; they just interact with the web interface and are happy it is all working for them... somehow, somewhere, in the cloud.

    5. Re:Please stop... by Gilmoure · · Score: 2, Interesting

      Oh man, I was in art school in the early '90s. All those AOL CDs were great material for art projects and stuff.

      --
      I drank what? -- Socrates
    6. Re:Please stop... by slim · · Score: 5, Informative

      Managers love this kind of terminology, because from their point of view Internet just 'happens' somehow.

      And cloud computing makes them right. You pay some money, and the entity you're paying the money to makes it happen.

      Just like when I buy a tin of soup from a supermarket, I don't need to understand anything about the supply chain that got it there.

  3. How is using so many VMs more efficient? by Viol8 · · Score: 3, Insightful

    I've never really understood the fuss around VMs. Sure, they're useful if you want to test run an OS install or run a different OS on top of another. But otherwise what's the point? Instead of having app + OS you end up with app + VM + OS, so how exactly is that benefiting anyone other than the power company for the extra electricity used?

    1. Re:How is using so many VMs more efficient? by SappoMan · · Score: 5, Informative

      OK, you don't work in IT, right? At least not on the admin side.
      VMs are mainly about server consolidation. Since servers are usually underutilized, you can put quite a number of VMs on each core. For server workloads the number is usually around 2: 2 VMs per core * 4 cores * 2 CPUs (a typical blade) yields 16 VMs. You see, in the end the power company gets paid for only one physical server per 16 OS instances. Not bad.
      Server consolidation is not the only reason to use virtualization. Other issues you can solve are high availability and fault tolerance, quick deployment of new servers, hardware abstraction and many others.
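
      The consolidation ratio above can be sketched as follows. The figures (2 VMs per core, 4 cores, 2 CPUs) come from the comment; the `blades_needed` helper is a hypothetical name added here for illustration and assumes every blade is filled to that same ratio.

```python
import math

# Figures from the comment above.
vms_per_core = 2
cores_per_cpu = 4
cpus_per_blade = 2

vms_per_blade = vms_per_core * cores_per_cpu * cpus_per_blade  # 16


def blades_needed(os_instances: int, per_blade: int = vms_per_blade) -> int:
    """Hypothetical helper: physical blades needed for a given number of
    OS instances, assuming every blade hosts `per_blade` VMs."""
    return math.ceil(os_instances / per_blade)


print(f"VMs per blade: {vms_per_blade}")
print(f"Blades for 200 OS instances: {blades_needed(200)}")
```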

    2. Re:How is using so many VMs more efficient? by alen · · Score: 2, Informative

      We have used VMware for a few years. Our devs would write a Java app and it would require its own server, but it would use maybe 20% of the resources, if not less. Now we just provision a VM: less server clutter in the datacenter and smaller electricity bills. Also great for DR. We ship the entire VM to a DR site, so all we have to do is bring it up, change the IP and we're ready to go. Otherwise we would spend days trying to configure all the apps, find the source, etc.

      I have my own server that I used to test a SQL Server migration. We went from a single DB/web server for Reporting Services to a clustered DB and a scale-out web farm. I needed VMs to test a long list of things, since the cluster was already in production with another instance running on it.

      For production, physical hardware is still cheap for heavy-duty stuff. HP ProLiant servers are dirt cheap and they scale to 144GB for 1U models. Next year it will probably double. And with SQL and Oracle there is no need for a VM, since you can just create new instances and not worry about hypervisor performance issues.

    3. Re:How is using so many VMs more efficient? by reashlin · · Score: 5, Informative

      It's more than that.

      Most machines run at around 10% of all possible utilisation. Often web servers will run at less than this. In a datacenter you have two options: a) run hundreds of very slow, cheap machines, each running one instance of your webserver, or b) consolidate lots of machines onto one powerful box and run it at 70-80% utilisation.

      Option b) has the advantage that should a website get hit heavily (maybe because it's been linked to on /.) then you still have the beefy hardware to cope with it. You will also find heating bills go down. You'll usually even get the costs of the hardware down as well.

      If you're still not convinced, then look at the work by most VM software manufacturers, who are making it possible for a VM to move around on physical hardware. Now if your hardware fails, the VM and OS do not. The VM just moves off somewhere else and continues to operate with little or no drop in performance or uptime.

    4. Re:How is using so many VMs more efficient? by teshuvah · · Score: 5, Informative

      I've never really understood the fuss around VMs. Sure, they're useful if you want to test run an OS install or run a different OS on top of another. But otherwise what's the point? Instead of having app + OS you end up with app + VM + OS, so how exactly is that benefiting anyone other than the power company for the extra electricity used?

      Because for the most part, most servers don't run anywhere near full capacity (and if they do, then they are probably not good candidates for virtualization, except possibly for high availability purposes, which I will go over in the second paragraph). I forget the study, but I read once that on average a typical server sits at 5-15% utilization. So the idea behind products like VMware ESX is that if you need 5 unique servers, instead of buying 5 servers at $5,000 apiece, you buy 1 server for $5,000 plus one $5,000 VMware license, and run the 5 virtual servers on that. So you spend $10,000 instead of $25,000, and your footprint is 1/5th of what it was before, meaning fewer racks, less cooling, less power, etc. And the numbers I gave are very conservative. A lot of people do 10-20 VMs per server easily.

      So cost, power, and cooling issues aside, there are other issues. In a typical server environment, if a physical server suffers a catastrophic hardware failure, that server is down until someone can walk over and swap the hardware. With VMware, if a VM is running on a server and that server fails, the VM is cold booted on another ESX server automatically, and is typically up in 30-60 seconds. With the newest release of ESX server, called vSphere, they take it a step further. You can optionally choose to have a VM mirror itself onto another physical ESX server. So in the event of a hardware failure, the VM keeps running on the mirrored host. It then becomes the primary VM and sets itself up to mirror automatically on another ESX server. So you have ZERO downtime and the app re-mirrors itself. These are just some of the many useful features in VMware.

      And no, I do not work for VMware. I am a contractor for the Air Force and over the past 2 years I have converted almost 200 physical servers to VMs. We are a relatively small program, but our projections show that we will save $2,000,000 over 10 years just on the cost of servers (and yes, I have added the cost of VMware licenses and support into that equation), and that doesn't even account for power and cooling savings. We've gone from almost 200 physical servers distributed over 7 full racks down to 28 servers in 2 racks (2 racks only because they are in two separate facilities; each rack contains only a single HP c-Class chassis).

      I think the real question is, how can you NOT understand the fuss around VMs?
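
      The cost comparison in this comment works out as below. This is only a sketch of the commenter's own example: the $5,000 prices are his round numbers, and "almost 200" servers is treated here as exactly 200, so the consolidation ratio is an upper bound.

```python
# Round numbers from the comment above.
server_price = 5_000
vmware_license = 5_000
servers_needed = 5

physical_total = servers_needed * server_price   # 5 separate boxes
virtual_total = server_price + vmware_license    # 1 box + 1 license
savings = physical_total - virtual_total

# The Air Force consolidation from the same comment.
servers_before = 200  # "almost 200", so treat this as an upper bound
servers_after = 28
consolidation_ratio = servers_before / servers_after

print(f"Physical: ${physical_total:,}  Virtual: ${virtual_total:,}  Saved: ${savings:,}")
print(f"Consolidation ratio: about {consolidation_ratio:.1f} to 1")
```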

    5. Re:How is using so many VMs more efficient? by hodet · · Score: 3, Informative

      It makes perfect sense. His clients want a dedicated host for their server. 10 clients, 10 virtual servers on one powerful box instead of 10 servers running at minimum capacity. More profit for parent. Data Centers are using virtualization big time because it saves money. Very easy to move the guest OS around if needed, even geographically.

    6. Re:How is using so many VMs more efficient? by timeOday · · Score: 2, Insightful

      The point is that multi-tasking operating systems already support server consolidation by protecting processes from each other, so you can run multiple processes on a host safely. And they do it FAR more efficiently than VMs, which have an entire OS instance for every process, and memory partitioned statically between them.

      However, the OS doesn't quite finish the job. The need for VMs arises from design shortcomings at the OS level and above. Here are a few:

      1. You can't install an app and all its dependencies and configuration by simply copying from one host to another. On Linux especially, apps have an insane number of dependencies.
      2. Process migration
      3. Using certain port numbers for certain services (most services don't portmap, and firewalling relies heavily on port number assumptions)

      It would be nice to fix these at the OS level instead of just piling one protected memory mechanism atop another (java VM atop a virtual machine atop a protected memory CPU architecture and OS).

    7. Re:How is using so many VMs more efficient? by jcnnghm · · Score: 2, Informative

      Security/Separation of Duties.

      --
      You don't make the poor richer by making the rich poorer. - Winston Churchill
    8. Re:How is using so many VMs more efficient? by bertok · · Score: 4, Interesting

      I think you're missing my point - why have multiple OSes if they're all the same type of OS and the apps could all happily run on the same OS instance? As for deployment - have you never heard of a tarball? OS dies - take app tarball to new server, untar. How's that different to copying a VM machine file over?

      In the real world, people run apps like Exchange or Oracle, which take hours to install to a vanilla state, and that's not counting the potentially terabytes of data associated with them.

      Even the most primitive "tar ball" Linux app will have dependencies on the OS, and those can and will eventually break, unless you freeze your OS version forever. If you have enough apps and servers, that will become a nightmare to manage. Do I upgrade or not upgrade? Will this patch or that patch break one of the apps? This is how people end up running Linux 2.2, or 32-bit Windows on 64-bit platforms, because migrating 1 app is hard enough, but migrating a server with 20 apps on it is a recipe for disaster.

      Virtualization lets you quite literally drag & drop a running host OS from server to server. During maintenance time, that's like magic. No more 3am hardware replacement jobs for me! You can clone a machine while it's running, isolate the clone onto a virtual network, and test an upgrade without interrupting users. Sure, you can do that with most backup & restore tools, but VM platforms do it quicker, and with fewer admin steps. You don't even need spare hardware.

      I once replaced every single hardware component of a running VM farm, servers, cables, switches, even the SAN, while it was running. During the day. Zero outage, no packets lost, no TCP/IP connections closed or user sessions disconnected. We even had terminal server (Citrix) and console (SSH) users on. Not one user even noticed what was going on. I'd love to see you try that with 'tar'.

    9. Re:How is using so many VMs more efficient? by bertok · · Score: 3, Informative

      Sorry, that makes no sense. By definition you could do it on the same hardware without a VM unless your VM somehow magics processing power out of the ether.

      Except that unless you have a magic crystal ball, you'll never be able to predict application load ahead of time. Hence, some servers will be underutilized, and some will be sitting at 100% half the time. The only alternative is to install every application onto every server you have, and load balance everything - but that requires that every app is compatible with every other app, and that every app can operate as a cluster. In practice, that's impossible for typical businesses.

      What the latest virtualization platforms do is load balance, on the fly. A large VMware cluster will analyze the load pattern and redistribute virtual machines around the cluster to balance things out, so that each host is evenly loaded. I've seen clusters set to an average of 70% CPU load, and it was just fine. If one host starts heading towards 100%, a few VMs are shuffled around until the load is evened out again. Users can't really tell the difference between, say, 20% and 70% load. It's only at 90% or higher that you get contention and increases in response latency. It takes about 5 seconds to move a VM, but the actual outage is only a few milliseconds, if that, so users never notice.

      One thing I noticed with VM deployments is that most apps get faster on less hardware. This is counterintuitive, but I've seen it before in well designed Terminal Server / Citrix deployments. The basic concept is that you can afford much better hardware if you need less of it. You can buy beefier servers, 10Gb ethernet, SAN storage, etc... When 1 app needs lots of power, it gets it, and then it gives up its share when it doesn't to other apps that do.

      So yeah, in a sense, virtualization does magic processing power out of the ether, because it actually lets you use the processing power you paid Intel or AMD thousands of dollars for.
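
      The rebalancing idea described above can be illustrated with a toy greedy loop. To be clear, this is NOT VMware's algorithm, which is not described in this thread; the `rebalance` function, the host names and the integer percentage loads are all hypothetical, and the 70% threshold simply echoes the figure in the comment.

```python
def rebalance(hosts, threshold=70, max_moves=100):
    """Toy greedy rebalancer (an illustration, not VMware's algorithm).

    hosts maps host name -> {vm name: CPU load as a percentage of one host}.
    While the busiest host is above `threshold`, move one VM from it to the
    idlest host. Returns the list of (vm, source, destination) moves and
    mutates `hosts` in place.
    """
    moves = []
    for _ in range(max_moves):
        load = {name: sum(vms.values()) for name, vms in hosts.items()}
        src = max(load, key=load.get)
        dst = min(load, key=load.get)
        if load[src] <= threshold or src == dst:
            break
        gap = load[src] - load[dst]
        # Only consider VMs small enough that the move leaves the
        # destination less loaded than the source was.
        candidates = [(l, vm) for vm, l in hosts[src].items() if 0 < l < gap]
        if not candidates:
            break
        _, vm = max(candidates)
        hosts[dst][vm] = hosts[src].pop(vm)
        moves.append((vm, src, dst))
    return moves


# Hypothetical three-host cluster with one overloaded host.
cluster = {
    "esx1": {"a": 50, "b": 30, "c": 15},
    "esx2": {"d": 20},
    "esx3": {"e": 40},
}
cluster_moves = rebalance(cluster)
final_load = {name: sum(vms.values()) for name, vms in cluster.items()}
print(cluster_moves)
print(final_load)
```

      A real scheduler would also weigh memory, migration cost and affinity rules; the sketch only shows why shuffling a few VMs is enough to pull a hot host back under the target.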

    10. Re:How is using so many VMs more efficient? by alen · · Score: 2, Interesting

      Having one app conflict with another app. 10 years ago we had a few apps. Today there are too many to count, with constant point releases where minor functionality is added by user request or small bugs are fixed.

      And it's not just Java apps. There are WebLogic instances and other apps we might buy or code internally. Then there is QA, since they need everything production has. Moving QA to VMware was one of the first things we did when we bought it. The QA and Dev SQL servers are still physical, but a lot of their apps are now virtualized.

    11. Re:How is using so many VMs more efficient? by slim · · Score: 2, Insightful

      So use 1 server and have 10 client logins on it FFS.

      1 client wants RHEL 4.
      1 client wants RHEL 5.
      2 clients want Windows Server, both want a weekly reboot, but during different maintenance slots.
      2 clients want stable Debian, but one wants a weekly 'apt-get dist-upgrade', the other wants it monthly ... etc.

      Give each one a VM, and you can deliver all this on one physical machine very, very easily.

    12. Re:How is using so many VMs more efficient? by slim · · Score: 3, Insightful

      When did installing multiple apps on 1 server go out of fashion?

      When it became clear it's a management headache.

      "Hi it's ops. You know your foo server sits on the same box as the bar server? Yeah, well the bar guys have found out they need a kernel with a higher filehandle limit, so we're going to be rebooting that box. You'll need to tell your users about the outage. Oh, and you'd better have QA test the foo server with the new kernel too."

  4. Re:ok did a manager write this?! by RealityProphet · · Score: 4, Interesting

    Who cares how many potential VMs the "cloud" can host? It's methadone for most end users'/devs' real problem: inefficient code. The "just pitch machines at it until it runs fast!" mentality will catch up to us.

    That's not true. We use Amazon's cloud to host some of our servers. We do it for two main reasons. (1) We don't need to worry about equipment maintenance. Let me repeat that lest you think it's not a big deal: We don't need to worry about equipment maintenance! (That is a big deal when you leave your basement but don't necessarily have a dedicated IT staff.) (2) We are in a rapid growth phase. We cannot estimate well enough what our computing needs and our storage needs are going to be 1, 2, or 6 months down the road. We also don't have $50k to drop on equipment and storage that may be utilized 6 months from now, but we sure as hell know that if we bought it now it wouldn't be used immediately. Amazon's cloud makes it trivial to keep up with our growing demand without paying up front for it. Sure, we pay more to "rent" the stuff from Amazon, but it's simply the big-O argument: Amazon's pricing scales worse than the classic alternatives, but the constants out front are tiny.
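
    The "big-O" argument can be sketched as two cost lines: renting has no constant term but a steeper slope, buying has a large constant and a shallower slope. Only the $50k up-front figure comes from this comment. The $0.15/hour rental rate is borrowed from another comment in this thread, and the $0.05/hour running cost for owned hardware is a made-up placeholder, so the break-even point below is purely illustrative.

```python
# Work in cents to keep the arithmetic exact.
upfront_cents = 50_000 * 100   # the $50k from the comment
rent_cents_per_hour = 15       # assumption: rate quoted elsewhere in the thread
own_cents_per_hour = 5         # assumption: hypothetical power/cooling/admin cost


def rent_cost(instance_hours):
    """Cost of renting: no constant term, steeper slope."""
    return rent_cents_per_hour * instance_hours


def buy_cost(instance_hours):
    """Cost of owning: big constant out front, shallower slope."""
    return upfront_cents + own_cents_per_hour * instance_hours


# The lines cross where the extra per-hour rent has paid off the hardware.
break_even_hours = upfront_cents / (rent_cents_per_hour - own_cents_per_hour)

print(f"Break-even at {break_even_hours:,.0f} instance-hours")
for hours in (100_000, 1_000_000):
    cheaper = "rent" if rent_cost(hours) < buy_cost(hours) else "buy"
    print(f"{hours:>9,} instance-hours: cheaper to {cheaper}")
```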

  5. Re:What is the rain in this analogy? by stressclq · · Score: 2, Interesting

    It's too early to predict if the Amazon cloud will do anything meaningful or if it's going to be a spectacular failure.

    Considering 64 billion objects and counting, if the latter is to happen it's bound to give a whole new meaning to "when it rains, it pours".

  6. 50k VMs/day is not THAT much... by nweaver · · Score: 3, Insightful

    Let's assume a 12-hour lifespan, which gives about 25K VMs running at the same time.

    At 5 VMs/physical host (I suspect it is MUCH denser actually), that's only 5K servers. At 50 servers/rack, that's 100 racks.

    Or, in translation, not THAT much.
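
    The back-of-envelope estimate above, written out. All four inputs are the commenter's own guesses (12-hour lifespan, 5 VMs per host, 50 hosts per rack), not measured figures; the concurrency step is just arrival rate times average lifetime.

```python
# Inputs from the comment above (the commenter's guesses).
launches_per_day = 50_000
lifespan_hours = 12
vms_per_host = 5
hosts_per_rack = 50

# Average number of VMs alive at once = arrival rate * average lifetime.
concurrent_vms = launches_per_day * lifespan_hours // 24
hosts = concurrent_vms // vms_per_host
racks = hosts // hosts_per_rack

print(f"Concurrent VMs: {concurrent_vms:,}")
print(f"Physical hosts: {hosts:,}")
print(f"Racks:          {racks:,}")
```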

    --
    Test your net with Netalyzr
  7. Re:ok did a manager write this?! by commodore64_love · · Score: 3, Insightful

    So to use a car analogy (cough)

    - It's the same reason why people lease cars instead of buying them. It's cheaper in the short term, and easier to come up with $300 for rent than $20,000 for purchase. Plus adding extra cars as new employees join the company is trivially easy.

    --
    "I disapprove of what you say, but I will defend to the death your right to say it." - historian Evelyn Beatrice Hall
  8. I call shenanigans by Anonymous Coward · · Score: 3, Interesting

    My company tried to provision 10,000 Amazon instances to perform scalability testing of our software that runs on many computers. The math was simple - 10,000 servers * $0.15 / hour = $1,500 / hour for testing. We liked the multiple OSes & versions (Linux - Redhat, SLES, Windows - 2000, 2003, 2008?) and software stacks (mysql, apache, websphere, sql server, iis, etc...) that were all available out of the box.

    However, if you need more than 20 servers, you have to fill out a form. A sales rep and tech guy called to discuss our needs. It turns out that they could only handle around 1000 instance requests across all data centers unless we "reserve" the machines at $300 each, which blew the math - 10,000 servers * $300 = $3,000,000 to start.

    Looking at the article, it is likely that people are re-requesting the same machine be started & stopped multiple times per day - 50,000 is probably off by a factor of 10.
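
    The two price tags in this comment, side by side. The rates are the ones the commenter quotes; the final line (how many hours of on-demand testing the reservation fee would have bought) is an extra comparison added here, not something the commenter computed.

```python
# Rates quoted in the comment above; cents keep the arithmetic exact.
servers = 10_000
on_demand_cents_per_hour = 15       # $0.15 per server-hour
reservation_dollars_per_server = 300

on_demand_dollars_per_hour = servers * on_demand_cents_per_hour // 100
reservation_total_dollars = servers * reservation_dollars_per_server

# Added comparison: hours of full-scale on-demand testing that the
# reservation fee alone would have paid for.
equivalent_test_hours = reservation_total_dollars // on_demand_dollars_per_hour

print(f"On-demand:   ${on_demand_dollars_per_hour:,} per hour")
print(f"Reservation: ${reservation_total_dollars:,} up front")
print(f"Equivalent to {equivalent_test_hours:,} hours of on-demand testing")
```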

    1. Re:I call shenanigans by AlXtreme · · Score: 2, Insightful

      Even if it was $300/machine with 20VMs/machine it would be quite costly to reserve 500 machines.

      They raise the price because they can't scale that much on a dime. They probably have to add hundreds of machines a day in order to keep up with the demand for EC2 instances, you can't expect them to keep thousands of machines ready in case someone wants to figure out how high the cloud really scales. It would simply cost too much.

      No matter the cloud-hype, in the end Amazon and every other hosting supplier have to limit the amount a customer can provision. Want to go above that limit? No problem, but we'll have to hook up some additional machines in advance.

      The cloud is a leaky interface.

      --
      This sig is intentionally left blank
  9. Stock Exchange by MyDixieWrecked · · Score: 2, Interesting

    I went to an Amazon AWS talk in NYC a couple of months ago where they brought some start-ups in to talk about their projects, the cloud and how the cloud helped them build their applications faster and better. During the opening talk, the speaker showed some use cases, one including the New York Stock Exchange and how, at the closing bell, they provision over 3000 EC2 instances to crunch numbers overnight to be ready for the next morning.

    A guy from a startup that I was talking to before we were seated was talking about how his company keeps between 5 and 10 instances up all the time for their application (dynamically bringing them up and down to scale with demand) and how they frequently had 4 and 5 sets of these servers running on the side for testing (20-40 instances at a time). He was talking about the metrics they were using to keep track of their use and how it was flawed due to the fact that they had hundreds of instances a day going up and down all the time.

    Just because 50,000 instances are started per day doesn't mean that those 50,000 instances are running for any period of time. I frequently bring up an instance, tweak some things, create an image, then bring it down... or bring up an instance to test something for 20 minutes, then bring it down. EC2 has really benefitted my QA/Testing/Experimentation in that I really have an unlimited pool of resources to play with. It's a much more robust system than I have at home with VMware... VMware was a game changer for me, since before that I had 2 physical servers at home and stacks of 40GB and 60GB HDs with multiple versions of OSs on them.

    Of course AWS isn't for everyone. EC2 can be expensive for what they offer, and the biggest advantage of AWS's services is that they are on-demand and work really well with applications that need to scale up AND down in real time. If you've got an application that doesn't require to-the-minute scaling responses, it's less expensive to get a physical dedicated server with Xen on it and create your own virtual infrastructure... although if you don't have the skills or time to learn the tools, then AWS offers a much better learning curve.

    --
    ...spike
    Ewwwwww, coconut...
  10. Re:ok did a manager write this?! by afidel · · Score: 3, Insightful

    But it's even better than a car lease, because you can end the lease on the VM with no penalty. If you have a really big batch job that needs to run once a month, then you just spin up the VMs for the duration of the batch job, paying for your usage, and then deprovision them for the rest of the month.

    --
    There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  11. Define "Objects" by Dersaidin · · Score: 3, Informative

    Objects?

    "Objects" doesn't mean VMs, objects can be files, processes, etc.