Slashdot Mirror


Why Auto-Scaling In the Cloud Is a Bad Idea

George Reese writes "It seems a lot of people are mistaking the very valuable benefit that cloud computing enables — dynamically scaling your infrastructure — with the potentially dangerous ability to scale your infrastructure automatically in real-time based on actual demand. An O'Reilly blog entry discusses why auto-scaling is not as cool a feature as you might think."

124 comments

  1. I don't think so by Yvan256 · · Score: 5, Funny

    I think auto-scaling the clouds based on actual demand is a really great idea. I think farmers would really like that feature, in fact.

    Wait, what clouds?!

    1. Re:I don't think so by ZarathustraDK · · Score: 3, Funny

      Wait, what clouds?!

      Cumulo-mumbo-jumbo-nimbus clouds maybe?

      --
      If you quote this signature there'll be 72 copies of Windows ME waiting for you in Heaven.
    2. Re:I don't think so by larry+bagina · · Score: 3, Funny

      Could be a script that logs in and then posts anonymously. That's what I'd do.

      Disclaimer -- I didn't do that.

      --
      Do you even lift?

      These aren't the 'roids you're looking for.

    3. Re:I don't think so by mrhthepie · · Score: 1

      So you'd do anything for spam, but you wouldn't do that?

    4. Re:I don't think so by nobaloney · · Score: 1

      I was thinking that auto-scaling in the clouds could mean some mighty big autos ... much too big to be driven on the highways.

    5. Re:I don't think so by u38cg · · Score: 1

      I just find it funny that an article on how to deal with slashdotting is on slashdot. Sadly, it has failed to fall over, creating what would have been a fine piece of irony.

      --
      [FUCK BETA]
  2. Like cellphones by Tablizer · · Score: 5, Insightful

    Without a hard-limit, some people run up big cell-phone bills. If you are forced to stop and plan and budget when you exceed resources, then you have better control over them. Cloud companies will likely not make metering very easy or cheap because they *want* you to get carried away.

    1. Re:Like cellphones by Naturalis+Philosopho · · Score: 1

      Read GP's subject line. It's just like cell phone companies.

      So, both with Cell phones and with cloud resources, heck with life in general, it's best to plan ahead? Who'd a thunk it? ~

    2. Re:Like cellphones by Tablizer · · Score: 1

      Agreed, I didn't say that was a different aspect (although perhaps I could have worded it better in hindsight).

      Note that I am not against auto-scaling in general. It's just that it takes a different kind of discipline that one may not be ready for and may not quite live up to the vendor's hype.

      Older technologies are often well-understood by staff and they have a feel for using them. Often newer technologies take a while to know how to use and manage effectively. Even if they *are* an overall incremental improvement, this is only after you master them. It's rare that a new technology is so significantly better than the old that one should jump ship immediately. Ease into it gradually.
                 

    3. Re:Like cellphones by narcberry · · Score: 1

      Of course cloud companies want you to need to scale.

      Also, most successful companies want to scale too. Don't the customer and provider both win in that scenario? As long as growth = profit for the customer, I mean.

      --
      Modding me -1 troll doesn't make me wrong.
    4. Re:Like cellphones by Cylix · · Score: 2, Insightful

      Actually, metering is cheap and easy, simply because they *need* to meter your traffic. Companies with infrastructure requirements and not a great deal of dumb users will generally have to be honest to keep your business.

      Loyalty is based on performance and meeting customer expectations.

      Phone companies get away with this crap because they are either a monopoly or engage in lengthy customer lock in. It also doesn't help that it is pretty much the norm to nickel and dime the customer.

      Ec2 and other retail outlets you can simply walk away from if you are unhappy. I'm assuming other cloud operators work in a similar fashion.

      --
      "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
    5. Re:Like cellphones by lysergic.acid · · Score: 4, Interesting

      i think the author's point is that dynamic scaling should always be planned; partly because it results in better understanding of traffic patterns, and thus better long-term capacity planning, and partly because you need to be able to distinguish between valid traffic and DDoS attacks. still, i think the author is overstating it a bit. one of the main draws of cloud computing to smaller businesses is the ability to pool resources more efficiently through multitenancy, part of which is precisely due to auto-scaling. without the cloud being able to dynamically allocate resources to different applications as needed in real-time (i.e. without human intervention), there isn't much of an advantage to sharing a cloud infrastructure over leasing dedicated servers.

      for instance, let's say there are 10 different startups with similar hosting needs, and they can each afford to lease 10 application servers on their own. so using traditional hosting models they would each lease 10 servers and balance the load between them. but after a few months they realize that 75% of the time they only really need 5 servers, and 20% of the time they need all 10, but an occasional 5% of the time they need more than 10 servers to adequately handle their user traffic. this means that in their current arrangement, they're wasting money on more computing resources than they actually need most of the time, and yet they still have service availability issues during peak loads 5% of the time (that's over 2.5 weeks a year).

      all 10 of these startups share a common problem--they each have variable/fluctuating traffic loads severely reducing server utilization & efficiency. luckily, cloud computing allows them to pool their resources together. since the majority of the time each startup needs only 5 servers, the minimum number of virtual servers their cloud infrastructure needs is 50. and since each startup needs double that 20% of the time, 10 extra virtual servers are needed (shared through auto-scaling). but since each startup needs more than 10 servers for about 2.5 weeks each year, we'll add another 15 extra virtual servers. so all in total, the 10 startups are now sharing the equivalent of 75 servers in their cloud.

      by hosting their applications together on a cloud network, each startup not only has their hosting needs better met, but they also stand to save a lot of money because of better server utilization. and each startup now has access to up to 30 virtual servers when their application requires it. this kind of efficiency would not be possible without a cloud infrastructure and auto-scaling.
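The arithmetic in this comment can be checked with a short sketch. The figures are the ones given above; the variable names are illustrative only, and the sizing assumes (as the comment implicitly does) that the startups' peaks rarely coincide.

```python
# Figures from the comment above; names are illustrative.
STARTUPS = 10
DEDICATED_PER_STARTUP = 10  # servers each startup would lease on its own
BASELINE_PER_STARTUP = 5    # what each needs about 75% of the time
SHARED_FOR_DOUBLING = 10    # pooled extra for the 20% case
SHARED_FOR_PEAKS = 15       # pooled extra for the 5% case

# Total servers under the traditional model versus the pooled model.
dedicated_total = STARTUPS * DEDICATED_PER_STARTUP
pooled_total = (STARTUPS * BASELINE_PER_STARTUP
                + SHARED_FOR_DOUBLING + SHARED_FOR_PEAKS)

# Most one startup can get if nobody else is drawing on the shared pool.
max_burst_per_startup = (BASELINE_PER_STARTUP
                         + SHARED_FOR_DOUBLING + SHARED_FOR_PEAKS)

# 5% of a 52-week year.
peak_weeks_per_year = round(0.05 * 52, 1)

print(dedicated_total, pooled_total, max_burst_per_startup, peak_weeks_per_year)
# prints: 100 75 30 2.6
```

The 30-server figure holds only while the other nine tenants sit at their baseline; if several peak together, the shared 25 servers have to be split among them.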

    6. Re:Like cellphones by mabhatter654 · · Score: 1

      Along the same lines, the whole cloud computing thing got started really because companies like Amazon really went out and bought all the servers they needed to hit their peaks.. which is a lot of overkill 50% of the time. So they started renting it out to little people.

      The thing the article's author misses is that regular small-medium businesses don't want to be web experts and they don't have hundreds of thousands of dollars to pay for web server farms or analysis. They want to get some usage numbers for now, maybe add some sales functions and put the web site someplace they don't have to worry about it. Most sites are fairly low volume but would like to be bigger. Of course professional web architects think everything should be planned and measured within an inch of its life. But in reality the web is just another media source for most companies, as long as it's getting the message to the users, it's working.

      As far as slashdotting or DDOS attacks I would expect some planning for such things. First, DDOS attacks should be dealt with by the host, not the website. In a cloud there's no such thing as attacking only one site because that takes away from the others, I'd expect the cloud maintainer to deal with that. As far as slashdotting, I think cloud sites could have a way of dealing with that gracefully. His worry about losing 10 minutes really isn't founded because that's not what the slashdotting curve looks like when seen from other sites. It's a ramp up over an hour or two, and has a peak, then a second smaller peak a day later. This is exactly what cloud computing is designed to handle. I think cloud computing could handle it even better because only the slashdotted part of the site would need to be replicated which should be easy to figure out how to build the cluster in such a manner.. again, once Amazon figures that out the customer will upload the site in the way specified and not worry about details.

    7. Re:Like cellphones by enovikoff · · Score: 2, Interesting

      As a cloud computing provider, I actually have no interest in having my customers suddenly run up huge bills. The reason is that as the article said, something is most likely wrong somewhere, which means that as their services provider, I'll also be responsible for figuring it out :) I can't speak for Amazon, which has a more hands-off model, but my success is invested in the success of my customers, so I won't sit idly by while they waste their money. However, looking at my company's balance sheet, we make our money off of base load, not peaks. Unlike what one of the other posters said, we can't average the peak load across customers since most customers have peaks at the same time, so accommodating peak load is more of a money-losing proposition since the bulk of those servers lie idle much of the day. In a truly round-the-clock (geographically distributed) cloud operation, this might be true, but even Amazon, which makes you choose the continent you want to run your cloud servers in, still has to hold a lot of reserve capacity (which is built into their rates) to accommodate the usual twice-daily peak loads. For web sites, a peak load that is many times the base load usually indicates something is wrong with the business model as well as the software, since SaaS providers also can't make any money off short bursts of usage. In many cases, peaks that last less than the provisioning time of a new instance (which is typically no less than a few minutes because of the time to load the instance's memory from storage) have to be handled differently anyway, either with more base allocation or for example with a queue of work to be done and notifications to customers when that work is completed.

  3. Author makes some valid points, but... by Anonymous Coward · · Score: 4, Insightful

    The author states that one reason he doesn't like autoscaling is because it can take a while to take effect. That's bad technology, waiting for someone to come along and improve it.

    He also says he doesn't like autoscaling even with limiters. Autoscaling with limiters makes sense to me, especially if the limits are things along the line of 'don't spend more than XXX over time Y'.

    Finally, not using autoscaling because you might get DDoS'd is just stupid. You lose business/visitors. That's worse than paying more to avoid being taken down, because your reputation gets hurt AS WELL AS losing you business.
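A limiter of the "don't spend more than XXX over time Y" kind mentioned in this comment could look roughly like the sketch below. Everything here is hypothetical: the class name, the hourly granularity, and the flat per-instance price are assumptions made for illustration, not any provider's API.

```python
from dataclasses import dataclass, field


@dataclass
class SpendCappedScaler:
    """Grants instances hour by hour without exceeding a rolling budget."""
    hourly_rate: float   # assumed price of one instance-hour
    budget: float        # the "XXX" in the comment
    window_hours: int    # the "time Y" in the comment, in hours
    spend_log: list = field(default_factory=list)  # spend per past hour, newest last

    def grant(self, wanted: int) -> int:
        """Return how many instances may run for the coming hour."""
        # Spend already committed in the part of the window that is behind us.
        past_hours = self.window_hours - 1
        recent = sum(self.spend_log[-past_hours:]) if past_hours > 0 else 0.0
        remaining = max(self.budget - recent, 0.0)
        affordable = int(remaining // self.hourly_rate)
        granted = min(wanted, affordable)
        self.spend_log.append(granted * self.hourly_rate)
        return granted


# Ask for 4 instances every hour with a budget of 10 per rolling 3 hours.
scaler = SpendCappedScaler(hourly_rate=1.0, budget=10.0, window_hours=3)
grants = [scaler.grant(4) for _ in range(4)]
print(grants)  # [4, 4, 2, 4]
```

In the third hour only 2 instances are granted because 8 of the 10 budget units were already spent in the previous two hours. A real setup would presumably also keep a fixed base allocation outside the cap, so that an exhausted budget throttles growth rather than taking the site down.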

    1. Re:Author makes some valid points, but... by Stile+65 · · Score: 1

      There are better ways to deal with too much traffic than auto-scaling.

      One way is to use caching intelligently. This will allow you to use much less in the way of disk I/O resources, so your bottleneck will be one of {CPU, RAM, bandwidth}. CPU and RAM are very cheap for the amount you need to meet any reasonable demand, compared to I/O throughput. Bandwidth in a cloud (specifically EC2/S3) is virtually unlimited, though you'll pay for it. S3 has a CDN-like feature now too, so you can save money if you put your images on S3 and register that bucket for the CDN-style service.

      Another way to scale well - specifically when talking about DDOS - is to use firewalls intelligently. Again, assuming bandwidth is not an issue since we're on EC2, you can use pf instead of iptables to throttle individual IPs to a maximum number of HTTP connections before you drop SYNs from those IPs. You can do this either right on your webservers or put a dedicated load-balancer/firewall type VM out in front of the webservers. (Note: I'm not sure exactly how the LAN side of this stuff is configured, never having actually used EC2, but that's how I'd do it with physical or virtualized boxes on my own network.)
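The per-IP throttling idea above (cap the number of concurrent connections from one address, then refuse new ones) can be illustrated in a few lines. This is an application-level sketch, not pf or iptables configuration; a real firewall enforces the same rule in the kernel, and the class and method names are made up for the example.

```python
from collections import defaultdict


class PerIpConnectionLimiter:
    """Tracks open connections per source IP and refuses any beyond a cap."""

    def __init__(self, max_conn_per_ip: int):
        self.max_conn_per_ip = max_conn_per_ip
        self.open_conns = defaultdict(int)

    def try_open(self, ip: str) -> bool:
        """Return True if the connection is accepted, False if it is dropped."""
        if self.open_conns[ip] >= self.max_conn_per_ip:
            return False  # analogous to dropping the SYN from this address
        self.open_conns[ip] += 1
        return True

    def close(self, ip: str) -> None:
        """Release one connection slot for the given address."""
        if self.open_conns[ip] > 0:
            self.open_conns[ip] -= 1


limiter = PerIpConnectionLimiter(max_conn_per_ip=2)
results = [limiter.try_open("203.0.113.7") for _ in range(3)]
print(results)  # [True, True, False]
```

Other addresses are unaffected by one address hitting its cap, which is the point: a noisy source is throttled while ordinary visitors keep getting served.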

      --
      I claim first use of "Error No. 0B" - or "No. 0B error." It'll be the new ID 10T!
    2. Re:Author makes some valid points, but... by narcberry · · Score: 2, Insightful

      He complains that 10 minutes for a computer to scale is too slow, then states

      Auto-scaling cannot differentiate between valid traffic and non-sense. You can. If your environment is experiencing a sudden, unexpected spike in activity, the appropriate approach is to have minimal auto-scaling with governors in place, receive a notification from your cloud infrastructure management tools, then determinate what the best way to respond is going forward.

      It's 4pm on a Saturday, and your site is getting hit hard. Rally the troops, call a meeting, decide the proper action, call Fedex to ship you more infrastructure, deploy new hardware, profit from your new customers, all the while laughing at the fools who waited 10 minutes for their cloud to auto-scale.

      --
      Modding me -1 troll doesn't make me wrong.
    3. Re:Author makes some valid points, but... by smack.addict · · Score: 1

      It's 4pm on a Saturday and chances are that your site is being hit hard either because you were being an idiot or because someone is engaged in an attack on you.

      If you plan properly, there are no sudden 4pm on Saturday spikes in traffic.

    4. Re:Author makes some valid points, but... by narcberry · · Score: 2, Funny

      Oh right, Al Gore internet rule number 1. Internet closes on weekends. Only hackers can visit sites, and only with malicious intent.

      --
      Modding me -1 troll doesn't make me wrong.
    5. Re:Author makes some valid points, but... by Cylix · · Score: 2, Interesting

      His complaint with auto-scaling was that if the org is doing their proverbial homework and planning for additional capacity then they should not need it.

      There are times when traffic boosts come as a bit of a surprise. However, depending on size and free capacity some bumps should be able to smooth out.

      Another trick is to have the means to scale some functionality down to allow for additional traffic. Slashdot for instance used to flip to a static front page when traffic was insane.

      Personally, a very limited automatic scale to meet a few percentage points might not be a bad idea at least to create additional buffer for increased reaction time. Still, alarms should sound and I would think of this as a fall back option.

      All in all, I rather agree with his sentiment. Don't be lazy and don't waste cash.

      --
      "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
    6. Re:Author makes some valid points, but... by Anonymous Coward · · Score: 0

      It's 4pm on a Saturday, and your site is getting hit hard. Rally the troops, call a meeting, decide the proper action, call Fedex to ship you more infrastructure, deploy new hardware, profit from your new customers, all the while laughing at the fools who waited 10 minutes for their cloud to auto-scale.

      The options are not "auto-scale the fuck out of it" and "run your own servers and don't plan ahead at all". Really, did you just skim the article as fast as possible, looking for a way to jump in and say, "See, see, this guy is dumb! Look how smart I am!"

      How about, traffic always spikes on sunday evenings for reason X. Either for 10 minutes (or more) every sunday, your shit goes down while the cloud autoscales up, or you actually plan ahead and increase capacity a half-hour before it hits and everything works great.
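The "increase capacity a half-hour before it hits" approach amounts to a schedule lookup. A minimal sketch follows; the instance counts, the 18:00 start, and the three-hour duration are invented placeholders, and the peak is assumed to start and end on the same day.

```python
from datetime import datetime, timedelta

# All values below are hypothetical placeholders.
BASE_INSTANCES = 5
PEAK_INSTANCES = 10
PEAK_WEEKDAY = 6                     # Monday is 0, so 6 is Sunday
PEAK_START_HOUR = 18                 # assumed "Sunday evening"
PEAK_DURATION = timedelta(hours=3)
LEAD_TIME = timedelta(minutes=30)    # scale up this long before the peak


def planned_capacity(now: datetime) -> int:
    """Return the instance count the schedule calls for at time `now`."""
    # Locate the next (or current) occurrence of the peak weekday.
    days_ahead = (PEAK_WEEKDAY - now.weekday()) % 7
    peak_start = (now + timedelta(days=days_ahead)).replace(
        hour=PEAK_START_HOUR, minute=0, second=0, microsecond=0)
    scale_up_at = peak_start - LEAD_TIME
    scale_down_at = peak_start + PEAK_DURATION
    if scale_up_at <= now < scale_down_at:
        return PEAK_INSTANCES
    return BASE_INSTANCES


# 7 January 2024 was a Sunday.
print(planned_capacity(datetime(2024, 1, 7, 17, 45)))  # 10
print(planned_capacity(datetime(2024, 1, 7, 17, 0)))   # 5
```

A scheduler would call something like this every few minutes and adjust the pool to match, so the extra capacity is already up when the known spike arrives.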

    7. Re:Author makes some valid points, but... by SanityInAnarchy · · Score: 1

      There are better ways to deal with too much traffic than auto-scaling.
      One way is to use caching intelligently....

      Yes. You could also rewrite your app in C, etc... Point is, sooner or later, you're going to run into a problem which requires you to scale.

      And it would be pretty cool if, on being Slashdotted, you could have your auto-scaling tool kick in and have your site actually be live while you look for things to tweak (caching, etc) -- but not with the purpose of "getting the site back up", but rather, "saving some money".

      I suppose it depends what kind of business you're in -- whether you can afford to take that downtime.

      Now, if you can engineer it to where you never need auto-scaling, great. You can leave the auto-scaler running, and it will simply never kick in. The question is, which scenario is worse when you're wrong -- site down, or too much money?

      I'm not sure exactly how the LAN side of this stuff is configured, never having actually used EC2

      All instances are behind a giant NAT. And while all instances have a public IP address, there is also a user-controlled firewall in place. You can run iptables or whatever behind that, certainly.

      All instances are reachable from all other instances via their internal IPs -- for free, within the same availability zone -- subject, again, to the user-controlled firewall, IIRC. It's trivial to, for example, only allow connections from other instances you control to the LAN interface. (They are also reachable via their external IPs, but that's both through the firewall and considered cross-zone traffic, so it costs more.)

      So what you describe is certainly possible. The only question is whether your firewall instance would have enough bandwidth -- especially when talking about a DDOS.

      --
      Don't thank God, thank a doctor!
    8. Re:Author makes some valid points, but... by julesh · · Score: 1

      Finally, not using autoscaling because you might get DDoS'd is just stupid. You lose business/visitors. Thats worse than paying more to avoid being taken down, because your reputation gets hurt AS WELL AS losing you business.

      The solution to a DDoS is not to keep scaling up your hardware and hoping that the next server you add will be able to withstand it. It won't.

      The solution to a DDoS is to get on the phone to your ISP and ask them to step up the filtering on your server. They can limit the rate at which an IP can initiate new connections, which will only marginally inconvenience your real customers, and will really limit the amount of resources the DDoS absorbs.

      Of course, you can only do this if you know you're under attack, and if your infrastructure is set to autoscale, you probably won't know. Until you receive the bill.

    9. Re:Author makes some valid points, but... by julesh · · Score: 1

      It's 4pm on a Saturday, and your site is getting hit hard. Rally the troops, call a meeting, decide the proper action, call Fedex to ship you more infrastructure, deploy new hardware, profit from your new customers, all the while laughing at the fools who waited 10 minutes for their cloud to auto-scale.

      Or, receive an SMS on your phone telling you that capacity is nearly exhausted. Think about it for thirty seconds; call somebody who's likely to be sitting in front of a computer (surely your company's IT staff is geeky enough that at least _one_ of them will be) and ask if it looks like increased demand or a DDoS. Then call your service provider and ask them to take the appropriate action (either add additional resources or limit connection rates, depending on which it is). Probably takes about the same 10 minutes, but is less likely to end up with a bill you can't afford to pay.

    10. Re:Author makes some valid points, but... by cecil_turtle · · Score: 2, Interesting

      Of course, you can only do this if you know you're under attack, and if your infrastructure is set to autoscale, you probably won't know. Until you receive the bill.

      Yes because if you happen to use some sort of auto-scaling system, be it at the cloud level or your own management system, it's very likely that you never thought to put in the same monitoring and alerting systems that you already had on your non-cloud, non-autoscaling systems thus ensuring that you will be blindsided by the scenario you just laid out.

      Or, you have more than two brain cells to rub together and you already had all of that in place and just pointed it to the auto-scaling cloud system enabling you to react the same way, except without the downtime in the middle.

  4. Want to be hip /.? by Daimanta · · Score: 5, Funny

    The blogosphere has disagreed with the use of web2.0 in the cloud. Sure, we all know that data is king and that's why we use software as a service nowadays with the web as a platform using AJAX and RSS extensively. This has helped to solve the challenge of findability since lightweight companies helps to connect user needs. The fact is that the long tail is part of the paradigm of user as co-developers in server wiki-like sites. Unfortunately this brings up the problem of ownership of user generated content. But I think that perpetual betas help the architecture of participation to stimulate web2.0. Interaction does make the experience good.

    --
    Knowledge is power. Knowledge shared is power lost.
    1. Re:Want to be hip /.? by Atriqus · · Score: 3, Funny

      Bingo!

      --
      Hey, look! It's Bono's brother.
    2. Re:Want to be hip /.? by pdbaby · · Score: 2

      It'd be funny if it wasn't so true :-(

      --
      Global symbol "$deity" requires explicit package name at line 2. - If only $scripture started "use strict;"
    3. Re:Want to be hip /.? by martin-boundary · · Score: 1

      Yeah, but it's true that it is funny.

    4. Re:Want to be hip /.? by FredFredrickson · · Score: 2, Funny

      Perfect, you've written my next proposal for my boss. Woot! He'll love it.

      --
      Belief? Hope? Preference? The Existential Vortex
    5. Re:Want to be hip /.? by Belial6 · · Score: 1

      Where can I invest!?!?!

    6. Re:Want to be hip /.? by Anonymous Coward · · Score: 1, Funny

      You're repurposing the synergy, aren't you? :'(

  5. Painful by narfman0 · · Score: 0

    "then determinate what the best way to respond is going forward." Sometimes some things are bettered not left not unsaid.

  6. Get Off My Lawn! by chill · · Score: 5, Funny

    Someone get this guy a cane to shake at the whipper-snappers. "In my day, you learned proper capacity planning or you didn't enter the data center!"

    It can take up to 10 minutes for your EC2 instances to launch. That's 10 minutes between when your cloud infrastructure management tool detects the need for extra capacity and the time when that capacity is actually available. That's 10 minutes of impaired performance for your customers (or perhaps even 10 minutes of downtime).

    Like, you could do it so much faster than 10 minutes without auto-scaling. Bah! If you've read The Art of Capacity Planning you would've mailed in the coupon for the free crystal ball and seen this coming!

    Properly used, automation is a good thing. Blindly relying on it will get you burned, but to totally dismiss it out of hand is foolish.

    --
    Learning HOW to think is more important than learning WHAT to think.
    1. Re:Get Off My Lawn! by VoidEngineer · · Score: 4, Insightful

      Properly used, automation is a good thing. Blindly relying on it will get you burned, but to totally dismiss it out of hand is foolish.

      First Rule of Automation: Automation applied to an efficient task increases its efficiency; likewise, automation applied to an inefficient task will simply increase the problem until it's an all out clusterfuck.

      Second Rule of Automation: Automation applied to an effective task will be effective; likewise, automation applied to an ineffective task will still be a pointless waste of time.

      Or something like that. My eloquence appears to be -1 today.

    2. Re:Get Off My Lawn! by Aladrin · · Score: 2, Insightful

      And in addition, if that capacity is needed on my current servers (which aren't all cloud-y), how long does it take to scale up? I have to order a new server, install an OS, configure it, install all the software I need, test it, carefully roll it out.

      Can I do that in 10 minutes? Not a chance! If I did that in 10 hours it would be a miracle. 10 days is a lot closer to reality, for a true rush job.

      --
      "If you make people think they're thinking, they'll love you; But if you really make them think, they'll hate you." - DM
    3. Re:Get Off My Lawn! by TubeSteak · · Score: 2, Interesting

      First Rule of Automation: Automation applied to an efficient task increases its efficiency; likewise, automation applied to an inefficient task will simply increase the problem until it's an all out clusterfuck.

      Last time I checked, most sites that get slashdotted are either some shiatty shared hosting or a dynamic page.

      Static pages & CoralCDN would keep a lot of websites from getting hammered off the internet.

      --
      [Fuck Beta]
      o0t!
    4. Re:Get Off My Lawn! by nine-times · · Score: 2, Interesting

      Yeah, it seems like his argument really comes down to a couple points:

      • Auto-scaling isn't fast enough: Apparently EC2 doesn't react quickly enough. To me, this seems to be a technical question as to whether auto-scaling can be designed to be reactive enough to be practical, and not necessarily an insurmountable problem with the concept of auto-scaling.
      • Auto-scaling might incur unexpected costs: The basic idea here is that, if you're paying a certain amount per measurement of capacity and it scales automatically, then your costs scale automatically too. This seems more like a contractual issue with your "cloud" service provider than an insurmountable problem with the concept of auto-scaling.

      So if someone offered a service where auto-scaling was fast, and there was some kind of limits on what you could be charged under what sorts of situations, would he still have a problem with auto-scaling? I was expecting something a little more absolute, like "there's a definite trade-off between security and accessibility", but it seems like he's saying something more like, "Right now there's no service that is offering auto-scaling services that are good enough."

    5. Re:Get Off My Lawn! by Cylix · · Score: 1

      The scaling logic is in your software. The cloud service shouldn't know best. In theory, a management and monitoring agent would dispatch an additional node and add that node to the pool.

      Since images can be templated it's really a matter of automating the deployment.

      Ec2 systems take time to transfer the image and initiate an instance. I do wonder about the 10 minute portion though, since the last time I spun up a virt it was ready in about 5.

      I suspect this is his concept of instantiate a virt, deploy packages from s3 and add in the configuration magic. None the less, it appears to be based on an imaginary number since he doesn't use auto-scaling ;)
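The arrangement described at the top of this comment, where a monitoring agent in your own software dispatches a node and adds it to the pool, might be sketched as below. The `launch_node` callable is a hypothetical stand-in for whatever call actually starts an instance from a templated image; no real cloud API is used here.

```python
import math
from typing import Callable, List


def nodes_needed(total_load: float, capacity_per_node: float) -> int:
    """Smallest node count that covers the load, never fewer than one."""
    return max(1, math.ceil(total_load / capacity_per_node))


def reconcile(pool: List[str], total_load: float, capacity_per_node: float,
              launch_node: Callable[[], str]) -> List[str]:
    """Launch nodes until the pool can carry the observed load."""
    needed = nodes_needed(total_load, capacity_per_node)
    while len(pool) < needed:
        pool.append(launch_node())  # in practice this step takes minutes
    return pool


# Hypothetical launcher that just hands out sequential names.
counter = iter(range(1, 100))
pool = reconcile(["node-0"], total_load=260, capacity_per_node=100,
                 launch_node=lambda: f"node-{next(counter)}")
print(pool)  # ['node-0', 'node-1', 'node-2']
```

The sketch only ever grows the pool; scaling back down, health checks, and registering the new node with a load balancer are left out.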

      --
      "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
    6. Re:Get Off My Lawn! by Anonymous Coward · · Score: 1, Informative

      Properly used, automation is a good thing. Blindly relying on it will get you burned

      Which is *exactly* what TFA says.

    7. Re:Get Off My Lawn! by SanityInAnarchy · · Score: 1

      So if someone offered a service where auto-scaling was fast, and there was some kind of limits on what you could be charged under what sorts of situations, would he still have a problem with auto-scaling?

      Probably. At the very least, he does mention the possibility of limits, and claims it doesn't address the core issue -- which is that if it's an unexpected spike, a human should look at that traffic to see if it's legitimate before spending money on it.

      I'd say, for most sites, it's probably worth it to auto-scale first, and then page the human. If it's not legitimate traffic, you can override it. If it was legitimate after all, 10 minutes to boot an EC2 instance is much faster than 10 minutes plus the time for you to answer your pager, look at the logs, and determine that it's legitimate.

      --
      Don't thank God, thank a doctor!
    8. Re:Get Off My Lawn! by 7+digits · · Score: 1

      > George Reese is the founder of [...] enStratus Networks LLC (maker of high-end cloud infrastructure management tools)

      So, it is just that the guy doesn't want to be pushed out of business. Of course, auto-scaling is good, unless you are an infrastructure management tools vendor...

    9. Re:Get Off My Lawn! by initialE · · Score: 3, Funny

      The second rule of automation is you do not talk about automation.

      --
      Starbucks, Harbuckle of Breath.
    10. Re:Get Off My Lawn! by julesh · · Score: 1

      Static pages & CoralCDN would keep a lot of websites from getting hammered off the internet.

      Unfortunately, most interesting sites need dynamic pages. If you're producing different page content for each user (based on preferences, past browsing history, geographic location, or anything else you can think of), you can't really do static pages. Unless you generate a lot of them. Coral can't help with this either, because it would result in each user seeing pages that were intended for another user.

    11. Re:Get Off My Lawn! by nine-times · · Score: 1

      The scaling logic is in your software. The cloud service shouldn't know best.

      In my mind that would depend to some degree-- whichever was a better solution given your needs. If the scaling logic of the cloud was much better than I could come up with without significant investment, and I weren't in a position to make a significant investment on that logic... well...

    12. Re:Get Off My Lawn! by nine-times · · Score: 1

      I'd say, for most sites, it's probably worth it to auto-scale first, and then page the human.

      That sounds reasonable enough to me. Sometimes you just have to analyze, "Given the risk of [event A] happening and the money I stand to lose if it does, and given the cost of doing what it takes to prevent [event A] from happening, is it worth investing in a system to prevent [event A] from happening?" And often you can't outright prevent Event A from happening, but you're just trying to make it more unlikely, or reduce the costs associated with that risk.

      So I think the question is, how much is "proper" capacity planning going to cost me, and how does that compare with the risks associated with scaling. So on the one hand, capacity planning might mean having an expert in capacity on your staff, paying their salary to do that and paying the costs of implementing their recommendations, and losing money whenever they make a mistake and plan for the wrong capacity. On the other hand, you have the risks of auto-scaling, including the risk that the auto-scaling won't work or that it will scale to meet the needs of illegitimate traffic.

      So my question would be, what's the balance there for your specific application, and what can you do to either lower costs in the first case or lower risks in the second? So in terms of auto-scaling, I'd want to know whether anything can be done in the cloud to detect DDoS attacks and prevent that from driving up costs. I'd probably like to have something like your idea, where there would be various triggers of "if capacity exceeds A in time frame B, someone gets emailed/paged and is given the opportunity to override." I might even want to set a high upper-limit that says, "If capacity exceeds X in time frame Y, the service scales back or shuts down for timeframe Z." Stuff like that.

      I'm no expert in web services and auto-scaling, but the issues being brought up just seem like implementation issues that need to be figured out.

    13. Re:Get Off My Lawn! by smack.addict · · Score: 1

      Huh? enStratus and just about every other infrastructure management tool performs auto-scaling. It's a baseline feature, and you need tools like enStratus to do auto-scaling for you since Amazon does not (currently) support auto-scaling.

    14. Re:Get Off My Lawn! by SanityInAnarchy · · Score: 2, Interesting

      there would be various triggers of "if capacity exceeds A in time frame B, someone gets emailed/paged and is given the opportunity to override."

      Point is, the overriding should probably happen after the system has attempted to auto-scale.

      For instance, if I got Slashdotted, I'd probably want to scale to handle the load. If I have to be called in to make a decision before any scaling happens, I've probably missed an opportunity. On the other hand, if I've set reasonable limits, I then have the choice to relax some of those limits, or to decide I can't afford surviving Slashdot this time (or maybe realize it's a DDoS and not Slashdot) and pull the plug -- but an hour's worth of extra capacity shouldn't kill me.

      Of course, that all depends on what kind of site you're running. Some sites might rather be taken completely down by a Slashdotting than spend too much on hosting.

      --
      Don't thank God, thank a doctor!
    15. Re:Get Off My Lawn! by nine-times · · Score: 1

      Yes, I understood that, and I was agreeing.

  7. Auto-rooting? by Gothmolly · · Score: 3, Funny

    So I hand over my business logic and data to a third party, who may or may not meet a promised SLA, and whose security I cannot verify? Does this mean I can be rooted and lose my customer data faster, and at a rate proportional to the hack attempts? Cool!

    --
    I want to delete my account but Slashdot doesn't allow it.
    1. Re:Auto-rooting? by smack.addict · · Score: 1

      That's a very poorly thought out view of a cloud infrastructure.

      With Amazon in particular, you do have SLAs and you can easily design an infrastructure that will be very secure for most organizational needs and exceed the SLAs offered by Amazon.

    2. Re:Auto-rooting? by Eskarel · · Score: 3, Interesting

      Well yes, you could also look at it from the point of view of: "I have a really clever idea, which will probably take off, and which if it does take off will require a lot of resources. I don't have a lot of money, but I can scrape together the cash for a small cloud investment and if my idea takes off I can afford as many servers as I want. I could buy a couple of regular servers and be unable to meet demand for several weeks while I order new equipment and possibly lose my start because people got sick of my site not being up, I could sell my idea to some venture capital people who, if they invest at all, will take half my profits, or I can use the cloud, expand in ten minutes, and maybe make a lot of money without having to give it all to someone else".

      That's the strength of the cloud my friend, being able to start an idea without having to promise 90% of it to someone else to get funding.

    3. Re:Auto-rooting? by lysergic.acid · · Score: 1

      because we all know that self-hosted servers never get hacked or suffer down-time. and i'm sure a small business can afford better network & server management/equipment than Amazon, Google, or Microsoft.

      do you also keep all of your savings (which is no doubt in gold bullions) in a safe at home that you stand guard over yourself with a 12-gauge shotgun?

  8. Auto scaling by shaark78 · · Score: 0

    It's apparent that the blog author doesn't like auto scaling, but I hope he planned for the Slashdot effect that is gonna happen to his site now.

    1. Re:Auto scaling by Anonymous Coward · · Score: 0

      Yes, I think oreilly.com can handle it.

  9. Capacity planning isn't that hard...for us by HangingChad · · Score: 3, Interesting

    While a content site might run the risk of getting slashdotted or Dugg, that isn't necessarily a big risk for applications. And your platform choice makes a big difference. We do our business applications on a LAMP stack. If we need capacity, we can stand it up for the cost of hardware. Nice thing about LAMP is at least the AMP part is OS portable, so we can rent capacity wherever it's cheap. So far we haven't needed to do that but it's nice to have the ability.

    To date we haven't run into any problems. If we're expecting a surge of new customers, we have a pretty good idea of expected traffic per customer. We can stand up the capacity well in advance. Hardware is cheap and can be repurposed if we end up not needing all the extra capacity.

    Our platform choice gives us a tremendous amount of flexibility. You don't get that with Windows. Any increase in capacity has a significant price tag in license fees associated with it. Once you build the capacity there are fairly significant ongoing expenses to maintain it. You can take it offline if you need to scale down but you don't get your money back on the licenses. There's a whole new set of problems outsourcing your hosting.

    I like our setup. The flexibility, the scalability, the peace of mind of not struggling with capacity issues, negotiating license agreements with MS or one of their solution providers and not being limited to their development environment. We can build out a lot of excess capacity and just leave it sit in the rack. If we need more just push a button and light it up. I'm not sure an Amazon or anyone else could do it cheap enough to justify moving it. And I really like having the extra cash. Cash is good. Peace of mind and extra money...what's not to like? Keep your cloud.

    --
    That's our life, the big wheel of shit. - The Fat Man, Blue Tango Salvage
    1. Re:Capacity planning isn't that hard...for us by Holistic+Missile · · Score: 1

      ...Nice thing about LAMP is at least the AMP part is OS portable...

      Careful with that - some nuances will turn up that will bite you on the ass. I found out last year that Apache's MD5 module creates different hashes(!) on Windows than it does on UNIX.

      I finally convinced my employer to use Subversion to provide version control on our Pro/E CAD files by bringing in my BSD server and doing a demo for the bosses. It was a beautiful setup, including ViewVC so the gals in Customer Service have access to drawings and visibility into what has been completed. Our IT guy is a standard Windows guy, so I had to set it up on a Windows box for him. I fought that, but politics won! I included a web page to change the Apache passwords, which worked fine on Linux and UNIX using Apache's MD5 module. That was when I found out that the MD5 results were different - none of our passwords would work until we re-created all of them using htpasswd(.exe) on the Windows box. Eventually, I wound up just authenticating users off of the AD server, which was a more elegant solution anyway.

      --
      When you're dead, you don't know you're dead. It only affects the people around you. Same thing when you're stupid.
    2. Re:Capacity planning isn't that hard...for us by Anonymous Coward · · Score: 0

      If you think windows can't scale cheaply, you haven't heard of mosso.

    3. Re:Capacity planning isn't that hard...for us by Wonko · · Score: 3, Insightful

      Careful with that - some nuances will turn up that will bite you on the ass. I found out last year that Apache's MD5 module creates different hashes(!) on Windows than it does on UNIX.

      If that is true then at least one of them isn't actually generating an MD5 hash.

      I'm just guessing, but I bet you were also encoding the line ending characters. That would be encoded differently on Windows and UNIX, so you'd actually be hashing two strings that differed by at least one byte.
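
      The line-ending guess is easy to demonstrate in general terms. A minimal sketch in Python (the password string is made up, and nothing here shows that this was the actual cause in the Apache setup described above):

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of the exact bytes given."""
    return hashlib.md5(data).hexdigest()


# The same made-up password as it might be read from a line of text
# on UNIX (LF) and on Windows (CRLF).
unix_line = b"secret\n"
windows_line = b"secret\r\n"

# Hashing the raw lines gives two different digests, because the inputs
# differ by one byte (the carriage return).
differ_raw = md5_hex(unix_line) != md5_hex(windows_line)

# Stripping the line terminator first makes the digests agree.
same_after_strip = (
    md5_hex(unix_line.rstrip(b"\r\n")) == md5_hex(windows_line.rstrip(b"\r\n"))
)

print(differ_raw, same_after_strip)
```

      MD5 itself is deterministic across platforms; only the bytes fed into it changed.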

    4. Re:Capacity planning isn't that hard...for us by Holistic+Missile · · Score: 1

      That's brilliant! I didn't even think of that stupid '^M' business... That is the exact kind of nuance I was referring to in my previous post.

      At the time, I found myself with a totally unexpected problem in the process of rolling out the system. The pressure that you feel when there's a problem, after telling your boss that it will run on Windows with no problem, just sucks! I was trying to spare somebody that pain.

      --
      When you're dead, you don't know you're dead. It only affects the people around you. Same thing when you're stupid.
    5. Re:Capacity planning isn't that hard...for us by julesh · · Score: 1

      Nice thing about LAMP is at least the AMP part is OS portable, so we can rent capacity where ever it's cheap. So far we haven't needed to do that but it's nice to have the ability.

      Only if you're very careful. I'm going to assume that by LAMP you mean Linux/Apache/MySQL/PHP as that's the most common meaning these days. Some of what I say also applies to Perl and/or Python, which also sometimes end up in the same acronym.

      The first thing to be aware of is that you're likely using Apache with the prefork MPM under Linux. This is the way that works best in that environment; unfortunately, its performance totally sucks in Windows, so if you switch you will want a multithreaded MPM (mpm_winnt is the one Apache provides on Windows). PHP carries a warning in this case: a lot of its standard libraries are not thread safe. They've done a lot of work on improving the situation, and a lot of PHP sites will work flawlessly, but it's not 100% reliable yet.

      Secondly, be aware that file naming conventions differ. This may be obvious, but if your scripts assume they can stick a file in /tmp, they're going to be very surprised when you run them on Windows. Fortunately MS have been smart enough to make most of their APIs recognise '/' as a path separator character, but be aware that if you need to pass filenames to an externally exec'd program this may or may not work.

      Thirdly, consider file line ending differences. These won't affect most applications, but there are a few that might blow up.
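
      The second and third points can be sketched as below. The example is in Python rather than PHP, purely for illustration, and the helper names `scratch_path` and `normalize_newlines` are invented here:

```python
import os
import tempfile


def scratch_path(name: str) -> str:
    """Build a temp-file path without assuming /tmp exists.

    tempfile.gettempdir() returns the platform's temp directory, and
    os.path.join uses the platform's own path separator.
    """
    return os.path.join(tempfile.gettempdir(), name)


def normalize_newlines(text: str) -> str:
    """Convert CRLF and bare CR line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


print(scratch_path("upload.tmp"))
print(repr(normalize_newlines("one\r\ntwo\rthree\n")))
```

      Asking the platform for its temp directory and separator, instead of hard-coding them, also avoids the problem of passing '/'-style paths to externally exec'd programs.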

    6. Re:Capacity planning isn't that hard...for us by Slashdot+Parent · · Score: 1

      We can build out a lot of excess capacity and just leave it sit in the rack. If we need more just push a button and light it up. I'm not sure an Amazon or anyone else could do it cheap enough to justify moving it.

      With EC2, I can have a server fully configured and operational in 90 seconds at the cost of $0.10. How quickly can you get a server up and running, and at what cost?

      That being said, EC2 is not for everyone, and it may not be for you. The whole point of the Elastic Compute Cloud is that you bring up and shut down instances as needed. If your computing needs are static, and it sounds like yours are, then EC2 starts to get expensive. Their smallest server costs $72/mo+bandwidth if you leave it running 24/7.

      But if your requirements are elastic (load spikes, nightly/monthly processing, cold spare, etc.), then EC2 is a godsend. I'm guessing that your current provider can't give you a server for an hour of processing for $0.10. And that $0.10 is really $0.10. There are no minimums, start-up costs, contracts, commitments, or anything. You pay only for what you use. Incredible!

      P.S. If you need Windows instead of Linux or OpenSolaris, it will cost you $0.125/hr because you need to pay for Windows licensing.
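
      The arithmetic behind those figures, as a small Python sketch. The hourly rates are the ones quoted above; the 30-day month and the one-hour-per-night batch job are assumptions made for the example, and bandwidth is left out:

```python
from decimal import Decimal

HOURS_PER_MONTH = 24 * 30  # assumes a 30-day month, which matches the $72/mo figure


def monthly_cost(rate_per_hour: Decimal, hours: int = HOURS_PER_MONTH) -> Decimal:
    """Cost of one instance for the given number of billed hours."""
    return rate_per_hour * hours


linux_always_on = monthly_cost(Decimal("0.10"))      # smallest instance, 24/7
windows_always_on = monthly_cost(Decimal("0.125"))   # same, with Windows licensing
nightly_batch = monthly_cost(Decimal("0.10"), hours=30)  # hypothetical: one hour per night

print(linux_always_on, windows_always_on, nightly_batch)
```

      That works out to $72 and $90 a month for always-on Linux and Windows instances, against $3 a month for the hypothetical nightly job, which is the static-versus-elastic gap in a nutshell.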

      --
      They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
  10. I agree by sleeponthemic · · Score: 1

    When in doubt, use a ladder. Elevators cannot be trusted.

    --
    I record my sleeptalking
  11. He assumes too much by tpwch · · Score: 3, Insightful

    He seems to be assuming that you only want to run a website on this service. I don't think hosting websites on this kind of service is a good idea at all. There are many other types of application you can run on cloud computing infrastructure, which make much more sense, and negate almost all of his claims.

    Consider for example a rendering farm. One day you may have two items to render. Another day 10 items. The next day 5 items. Should you really scale up and down manually each day, when you could just as easily just start the amount of servers you need based on how many jobs have been submitted for that day, and how large the jobs are?
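
    A rough sketch of that sizing rule in Python. The function name, the per-job hour estimates, the deadline, and the server cap are all assumptions for illustration, and the sketch treats render work as freely divisible across servers:

```python
import math


def servers_needed(job_hours, deadline_hours, max_servers):
    """Number of servers to start for today's submitted jobs.

    job_hours: estimated single-server hours for each submitted job.
    deadline_hours: how long the whole batch may take.
    max_servers: hard cap, so a bad estimate cannot start an unbounded fleet.
    """
    total = sum(job_hours)
    if total == 0:
        return 0
    return min(max_servers, math.ceil(total / deadline_hours))


print(servers_needed([4, 4], deadline_hours=8, max_servers=20))
print(servers_needed([4] * 10, deadline_hours=8, max_servers=20))
```

    The cap is the one piece of manual planning left: it bounds the bill when the job estimates are wrong.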

    There are many other examples. Websites are not the only thing you run on these services.

    --
    Posted by a Debian GNU/Linux user
    1. Re:He assumes too much by Cylix · · Score: 2, Interesting

      What if someone posts a bad batch or accidentally malforms some package in such a way as to chew through 10x the resources?

      I think there are many great uses for cloud environments, but people have to be careful when it is pay for play.

      It's a bit different than tying up all the resources on the web server. Sure, there is cost in time, but rarely does anyone get billed for those man hours.

      --
      "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
    2. Re:He assumes too much by Animats · · Score: 2, Informative

      Consider for example a rendering farm.

      Such as ResPower. They've been around for a while, from before the "grid" era (remember the "grid" era?). This is a good example of a service which successfully scales up the number of machines applied to your job based on available resources and load. Unlike a web service, though, ResPower normally runs fully loaded, and charges a daily rate with variable turnaround, rather than charging for each render. (They do offer a metered service, but it's not that popular.)

      It's worth looking at ResPower because, unlike most of the "grid" or "cloud" services, they have an established customer base and make money.

  12. Smart Auto-scaling by omnilynx · · Score: 1

    This guy makes a good case against "dumb" auto-scaling; that is, doing a simple "more traffic = scale up" calculation. However, it should be trivial to create more sophisticated algorithms that eliminate or at least reduce the problems he gives. For example, a module that can "recognize" DoS attacks versus slashdotting in most cases and either block or scale based on the results shouldn't be hard.

    --
    ceci n'est pas une .sig
    1. Re:Smart Auto-scaling by Cylix · · Score: 1

      Not all DoS attacks are simple ping floods. Those are in fact the weakest of the breed and easy to clean out at the upstream provider.

      An attack designed to chew up your instances would perform valid page requests. Thus, your application would believe that more hits equals more traffic and it should accommodate.

      I'm all for a temporary buffer with severe limits and the big red light going off. The cost associated with a temporary reprieve in order to react to a situation would be well worth it.

      It's like putting up the shields on the Enterprise. Obviously, sitting back and relaxing isn't what the crew on the bridge are doing.

      "Scotty, send us up a few more bottles of chilled wine. I think these guys are going to tire out fairly quickly."

      "I'm givin her all she's got captain, but the ice machine just can't take any more."

      --
      "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
  13. You don't even actually save money by using cloud by Skal+Tura · · Score: 2, Insightful

    Yeap, that's right. With over 7yrs of solid hosting industry experience, it's very easy to see.

    At least Amazon's service is WAY overpriced for long term use. Sure, if you only ever need it for a few hours it's all good, but for 24/7 hosting it ain't, none of them.

    It's cheaper to get regular servers, even from a very high quality provider, than to use Amazon's services.

    Best of all: You can still use their service to autoscale up if you prepare right, and yet have low baseline cost.

    If it's only a filehosting service you need, the BW prices Amazon offers are outrageous. Take a bunch of cheap-end shared accounts, and you'll get way better ROI, and still, for the most part, do not sacrifice any reliability at all. Cost: Greater setup time, depending on several contingency factors.

    Case examples: you can get from bluehost, dreamhost etc. plenty of HDD & Bandwidth for few $ a month. Don't even try to run any regular website on it, they'll cut you off (CPU & Ram usage), but for filehosting, it's great bang for buck :)

    Scared of reliability? Automatically edit DNS zone according to locations availability and have low(ish) TTL. Every added location increases reliability.

  14. smugmug uses autoscaling for image processing by adpowers · · Score: 1

    I posted this as a comment on the blog post, but I'm copying it here as well:

    http://blogs.smugmug.com/don/2008/06/03/skynet-lives-aka-ec2-smugmug/

    Outside of one instance where it launched 250 XL nodes, it seems to be performing pretty well. Their software takes into account a large number of data points (30-50) when deciding to scale up or down. It also takes into account the average launch time of instances, so it can be ahead of the curve, while at the same time not launching more than it needs.
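
    The "ahead of the curve" part can be illustrated with a toy calculation in Python. This is not SmugMug's algorithm (the post above mentions 30-50 data points); it is a deliberately simple sketch with invented parameter names that projects the queue forward by the average launch time and sizes the fleet for that projection:

```python
import math


def instances_to_launch(queue_now, queue_growth_per_min, launch_time_min,
                        jobs_per_instance, running):
    """How many extra instances to start right now.

    The queue is extrapolated linearly to the moment a newly launched
    instance would actually be ready, so capacity arrives when it is
    needed rather than launch_time_min minutes late.
    """
    projected = queue_now + queue_growth_per_min * launch_time_min
    needed = math.ceil(max(projected, 0) / jobs_per_instance)
    return max(needed - running, 0)


print(instances_to_launch(100, 10, 5, 50, 2))
```

    A shrinking queue yields zero launches here, which is the "not launching more than it needs" half of the behaviour.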

  15. Re:You don't even actually save money by using clo by Cylix · · Score: 1

    Those are horrible examples.

    Cheaper environments can be shared resources, have poor SLAs and not provide service guarantees. Sure, you can run cheap and it won't cost you until it breaks.

    DNS isn't exactly a real time solution when those entries are cached. I have encountered a large number of providers who flat out ignored cache time out settings.

    Again, a business can run on the cheap, but the idea is the servers are generally generating revenue when they are in use. Some places don't like down time.

    --
    "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
  16. One of several anti-cloud arguments by mattbee · · Score: 2, Informative

    I did some rough cost comparisons for a high-traffic web site in my similarly cynical article a few weeks ago (disclaimer: I run a hosting company flogging unfashionable servers, and am not a cloud fan yet :) ).

    --
    Matthew @ Bytemark Hosting
    1. Re:One of several anti-cloud arguments by symbolset · · Score: 1

      Cloud is a nice, nebulous term. I would count the people who really know what it means to be less than one hundred in all the world.

      I'm guessing from your post and its related link that you're not one of those.

      The cloud is a symbolic abstraction that separates the served from the servers in the same way that client/server architecture does, except that it adds an additional layer of abstraction for "server" that allows for servers to be hosted anywhere redundantly and transparently. It assumes a number of things, including ubiquitous networking, failover from geographically disparate servers, and the persistent reliability of service providers. That last can be avoided by designing the architecture yourself and becoming your own service provider.

      Now I'm going to cover the idea of command and control. The cloud becomes a useful element of your command and control iff (if and only if) you own all of the disparate parts of it. Any other answer adds a lack of control to your architecture you cannot control if you must persist for longer than the solitary existence of your service provider.

      This doesn't eliminate the cloud's utility - it just limits its scope. It's a grand idea and you should use it to the extent that you maintain control of your own cloud. To do anything else is not responsible. You must design your cloud so that the servers are owned by you and the connection service is provided by redundant providers. That's what botnet spammers do and you owe your customers no less.

      --
      Help stamp out iliturcy.
    2. Re:One of several anti-cloud arguments by tcopeland · · Score: 1

      > I run a hosting company flogging unfashionable servers

      And you provide a RubyForge mirror - many thanks for that!

    3. Re:One of several anti-cloud arguments by mattbee · · Score: 1

      Cloud is a nice, nebulous term. I would count the people who really know what it means to be less than one hundred in all the world.

      I'm guessing from your post and its related link that you're not one of those.

      Thank you for your insight, professor :) There are a lot of disparate hosting offerings out there marketed with the word 'cloud'. I understand the abstraction but the commercial reality is the interesting debate: how it develops, and how useful it is.

      --
      Matthew @ Bytemark Hosting
    4. Re:One of several anti-cloud arguments by julesh · · Score: 1

      Good points. I think you're right: cloud services have a long way to go in terms of cost, and I'm not sure that'll happen in the near future. And that scalability isn't relevant to most people, anyway. The number of sites that can't be managed by a single commodity server is small, and that can be scaled right down to a virtual host on a machine shared with 30 other similar sites for the low end. Virtual machine software (e.g. Xen) makes it easy to migrate to a host with more capacity as and when it becomes necessary, and I believe this can even be achieved without downtime. So why would most of us need the cloud, when a competent traditional hosting provider ought to be able to do that anyway?

    5. Re:One of several anti-cloud arguments by symbolset · · Score: 1

      Good lord I'm a dumbass sometimes. I needed that.

      --
      Help stamp out iliturcy.
    6. Re:One of several anti-cloud arguments by symbolset · · Score: 1

      Note to self: no posting on slashdot on party night.

      --
      Help stamp out iliturcy.
  17. Re:You don't even actually save money by using clo by chrb · · Score: 1

    bluehost, dreamhost etc. plenty of HDD & Bandwidth for few $ a month. Don't even try to run any regular website on it, they'll cut you off (CPU & Ram usage)

    Not having used the providers in question, I have to ask, why shouldn't I try to run a regular website on them? Isn't that exactly what they do - web hosting, of regular web sites? There's no reason why a regular web site should use excessive amounts of CPU or RAM.

    but for filehosting, it's great bang for buck :)

    I just skimmed the DreamHost TOS and saw that they explicitly ban "File upload / sharing / archive / backup / mirroring / distribution sites." Maybe not that great for file hosting after all...

  18. Re:You don't even actually save money by using clo by dubl-u · · Score: 1

    Take a bunch of cheap-end shared accounts, and you'll get way better ROI, and still, for the most part, do not sacrifice any reliability at all. Cost: Greater setup time, depending on several contingency factors.

    Are you seriously proposing this as a way to run a business? That strikes me as seriously retarded. I know a lot of people who run a lot of sites, and depending on their bandwidth draw and other needs, they'll rent servers, they'll rent a cabinet and buy bandwidth, or they'll use one of the reasonably priced CDNs. But I've never heard of anybody doing this unless they're running something semi-legal and want to dodge MAFIAA threat letters.

    Swapping your shit around between a bunch of cheap hosting accounts strikes me as a) very Fisher-Price, and b) totally pointless. Good sysadmins cost real money, and for all the time spent on jiggery-pokery, it seems like a much better deal to just get a discount CDN or a cheap colo'd pipe, and let the sysadmin spend their time on something useful.

  19. An odd argument by chrb · · Score: 3, Insightful

    His argument basically boils down to "Auto-scaling is a bad idea because you might implement it badly and then it will do the wrong thing". Isn't that true of everything? The flip side is that if you implement it well, then auto-scaling would be a great idea!

    It's like saying that dynamically sized logical partitions are a bad idea, because you should just anticipate your needs in advance and use statically sized partitions. Or dynamically changing CPU clock frequencies are a bad idea, because you should just anticipate your CPU needs and set your clock frequency in advance. Or dynamically changing process counts that adapt to different multi-core/CPU availability factors are a bad idea... you get the picture.

    The idea that some computational factor can be automatically dynamically adjusted isn't necessarily a bad idea, it's just the implementation that might be.

    1. Re:An odd argument by smack.addict · · Score: 1

      No, the argument is that auto-scaling, upon close examination, has very few benefits that are not actually better realized through other mechanisms.

    2. Re:An odd argument by Anonymous Coward · · Score: 0

      Yes, his argument boils down to that, but it makes sense in the current context. People new to auto-scaling always think that it is something they can just plug-in and never have to worry about scaling to cover demand again.
      So his argument is, "understood like *that*, it's a bad idea". He's telling people they have to rethink their idea of what auto-scaling is and how it should be used.
      After a couple years, people will have gotten used to the concept and will frame auto-scaling correctly, just like all those other mature technologies you mention.

  20. "Tragedy of the Commons" by garyebickford · · Score: 1

    Another risk, at least in theory, is a kind of very short term "Tragedy of the Commons".

    In the long term (a function of the Amazon accounting timeframe - maybe minutes, hours, or days) it may not be a problem because rational customers whose systems work correctly will voluntarily limit their usage in a predictable manner.

    However, a very fast DDoS of several autoscaling systems, for example, could cause a system-wide failure before the Amazon accounting system and customer strategies kicked in.

    Because the cloud is by nature distributed, there is no central algorithm that can prevent this (other than providing more computer capacity than can be consumed by the maximum incoming traffic on the pipes). However it is possible to construct an 'immune response' type of corrective strategy, analogously to the way our bodies respond to a sudden stress.

    --
    It's easier to be a result of the past, but more fun to be a cause of the future! http://www.spacefinancegroup.com/
  21. Clouds are good by Anonymous Coward · · Score: 0

    Clouds are a good thing, especially at this time of year. It's a clear sky outside right now, and the temperature is down to 254 Kelvin.

  22. Re:You don't even actually save money by using clo by Anonymous Coward · · Score: 0

    Yes, Amazon's not the best for 24/7 hosting - the best idea is to have an infrastructure which can run partially on Amazon and partially on dedicated boxes. Your dedicated boxes handle your standard load so you only need to scale to Amazon occasionally.
    That said, the completely virtual infrastructure lets you avoid dealing with datacentres, buying loads of servers, etc.
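
    A minimal sketch of that baseline-plus-burst split in Python. The function name and the numbers are hypothetical, and it assumes a cloud instance handles the same request rate as one dedicated box:

```python
import math


def cloud_instances_needed(demand_rps, dedicated_servers, rps_per_server):
    """Cloud instances required on top of the dedicated baseline.

    The dedicated boxes absorb the standard load; only the overflow
    beyond their combined capacity is sent to rented instances.
    """
    baseline_capacity = dedicated_servers * rps_per_server
    overflow = max(demand_rps - baseline_capacity, 0)
    return math.ceil(overflow / rps_per_server)


print(cloud_instances_needed(900, dedicated_servers=4, rps_per_server=250))
print(cloud_instances_needed(1600, dedicated_servers=4, rps_per_server=250))
```

    On a normal day the result is zero, so the hourly cloud bill only exists during spikes.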

  23. this guy contradicts himself by gravisan · · Score: 1

    "The dynamic scaling to plan can also be automated" massive retardation here.

  24. Stupid by Free+the+Cowards · · Score: 2, Insightful

    I can summarize this article in one sentence:

    "X is only useful for those who are too lazy to do Y."

    It's been said about assembly language, high-level languages, garbage collection, plug-n-play, and practically any other technology you can name. It is not actually a valid criticism.

    --
    If you mod me Overrated, you are admitting that you have no penis.
  25. Autoscaling is a ticking time bomb by upuv · · Score: 2, Interesting

    On the surface auto-scaling is obviously a great thing. But it doesn't take much thought to start punching holes in it.

    Let's first look at the data center that provides such a glorious capability.
    1. It is in their own best interest for you to scale up: processing, disk, bandwidth or whatever, for the simple reason that it's more money. Since you signed the contract you will probably be scaled well and truly before you know it. Usually you only find out when the bill comes in.
    2. The data center has very little incentive to make sure you are notified in a timely manner of autoscaling. As a matter of fact this feature is usually crippled or even broken. I don't care what the contract says. The data center rarely honors this part of the contract to anyone's satisfaction.

    Now let's look at the client and the horrible things that can go wrong. By no means even remotely a complete list.

    The new version of the app list.
    1. Bob the developer forgets to index that new DB table. Database goes nuts trying to do a simple select. BAM autoscaling of DB CPU resources goes through the roof.
    2. New AJAX call is not properly tested. For some reason it now triggers div refreshes as the mouse moves. App server is now flooded. BAM bandwidth and CPU autoscale through the roof.
    3. App no longer properly caches that all important query. BAM again DB and APP CPU skyrockets.
    4. The genius in dev decides to make the jsession stateful. Works fine on the desktop. Works fine when load test hammers only 10 users. Oh Oh real world kicks in, in Prod. We have 10k users. Everything goes through the roof.
    5. The list of new version issues goes on.
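
    The first item on that list is the kind of thing a pre-release check can catch before autoscaling ever sees it. A small sketch using Python's built-in sqlite3 module; the `orders` table, its columns, and the index name are invented for the example, and a production database would have its own equivalent of EXPLAIN QUERY PLAN:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)


def plan(sql):
    """Return SQLite's query plan for sql as one string.

    The human-readable detail is the last column of each plan row.
    """
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)


query = "SELECT total FROM orders WHERE customer_id = 42"

before = plan(query)  # no index yet: the plan is a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(query)   # with the index: the plan searches via idx_orders_customer

print(before)
print(after)
```

    A scan on a hot query costs CPU on every request, which is exactly the load an autoscaler would happily pay for.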

    The bad guys come a knocking.
    OK, so now you're a hot property on the net and you sign up for autoscaling so that you don't have to worry about capacity planning. You are focused on that cash machine that is your cool app.
    1. You didn't know about that monster hole in the app. The bad guys inject a phishing site onto your Uber site. The phishing site is wildly successful. Oh crap we just paid for the biggest fraud site on the net.
    2. The dev team leaves that back door on the site so they can maintain it remotely. Oh Oh all of a sudden you notice port 25 traffic is off the charts from the site. OMG we just uploaded 25 TB in the last 24 hours. You have just joined the ranks of the largest SPAM generators on the planet. You have a monster bandwidth bill and a very expensive legal bill.
    3. What are these very large globs in the database all of a sudden. OH crap we left a hole and are vulnerable to SQL injection. OH crap it's all encrypted kiddie porn. Bills for bandwidth, disk and legal come a knocking.

    I do have experience with this sort of thing. And it always goes sour at some point. The techies are always overruled by the marketing and business types on this. As the deal is always so great on paper. At some point something will go wrong. Software is never perfect. Between defects and bad guys you are a sitting duck for the big man carrying the bill to your door. It's only ever a matter of time.

    Oh, and lastly: guess what, sometimes the autoscaling fails. Make that a lot of the time. And you are then off the air.

    The best situation for you as a customer of scaling is to have a close relationship with the supplier. Once you start to reach certain predefined levels of usage they should contact you and give you the option of an upgrade. Make the scaling feature by human choice. Never let the supplier decide that for you.

    1. Re:Autoscaling is a ticking time bomb by Cylix · · Score: 1

      The vendor should never be responsible for resource scaling. There is no better judge of resource allocation than your own organization. The good news (or bad news) is that if the entity is incapable of self governing then it will not be an issue in the long term. Infrastructure will eventually topple under its own inability to sustain itself.

      --
      "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
  26. Re:You don't even actually save money by using clo by lysergic.acid · · Score: 1

    well, they actually provide an online storage service with at least some of their web hosting packages. you just can't use it for public data storage.

    so if you yourself want to backup a few hundred gigs of personal files that only you will have access to, you can (as long as it's not pirated material). though if you create a dreamhost account just so you can dump your company's 200 TB data warehouse onto their servers and exploit their "unlimited" storage offer, then you'll probably run into some trouble.

  27. It really depends on your business model by PornMaster · · Score: 2, Insightful

    When your revenues scale with the services rendered, it *does* make business sense to auto-scale. Auto-scaling is a technical solution, not a business one. Being Slashdotted isn't typically associated with more commercial activity, it's associated with "hit-and-run" visitors. The same with social networks. Does Twitter even have a business model? But wherever there's a business model where margins are relatively stable but activity rises and falls, auto-scaling makes you money rather than costing you severely. Like many things, it's a tool which should be used wisely, where not paying attention can leave you missing fingers.

    1. Re:It really depends on your business model by Cylix · · Score: 1

      While he mentioned "slashdot" very rhythmically there are other instances which can chew through resources quite aggressively.

      Denial of service attacks
      Malformed software (via bad code push or bug)
      References or page views which do not translate to customers.
      Poor design choices. (intentional, but bad)

      There are several permutations of the core issues regarding resource utilization, but the end result is the possibility of auto-scaling to compensate. Unlike traditional home-owned infrastructure, there will be more to the cost than overtime or extra coffee.

      I don't agree wholly with his mantra of never, but everyone seems to be missing the point.

      --
      "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
  28. Hunh? by symbolset · · Score: 1

    You lost me at x arguments were true, therefore x arguments are not true. Could you please start over again with more steps?

    --
    Help stamp out iliturcy.
    1. Re:Hunh? by shivamib · · Score: 0

      You lost me at x arguments were true, therefore x arguments are not true. Could you please start over again with more steps?

      Bonus points for flying car analogies

  29. Re:You don't even actually save money by using clo by bendodge · · Score: 1

    I use Bluehost and feel like I get my money's worth. They put tons of sites on fairly beefy Linux servers, which is fine most of the time. The problem is that every now and then someone else's site or script runs out of control or the whole box gets DOSed (seriously, they attack the box's CPanel IP instead of a specific domain). There are also CPU limits (idk about RAM). I've only run into the CPU limits when batch resizing images with a photo gallery. Your account goes offline for 5 mins when that happens.

    I run several mostly static domains and a moderately active forum, and I'm generally satisfied. The #1 problem is the server load bouncing around because there are so many sites on one box. The load usually hangs around under 8, but sometimes it will bounce from ~20 to 60 for a few hours for no apparent reason. My sites will be noticeably slower, but it's still responsive until the load goes over 80 or so.

    The main reasons I keep them instead of switching to someone like NearlyFreeSpeech are CPanel 11 (with lots of drop-in scripts), the cheap storage (I host a lot of files), and the email hosting, which is pretty reliable and offers fast IMAP. Think of it as a nice apartment with lousy tenants.

    --
    The government can't save you.
  30. Re:You don't even actually save money by using clo by Cylix · · Score: 1

    That's fairly awful since you put it that way.

    Modern virtualization allows for limits per instance.

    The reason it's cheap is you are only getting an apache vhost. I don't think it matters what address they are attacking.

    It's not a fair comparison to say this shared host provider is cheaper than X cloud provider. Perhaps looking at the cost of leasing a virt would be a better comparison.

    In the end, you get what you pay for and that is a very inexpensive setup.

    --
    "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
  31. You can keep it! by Anonymous Coward · · Score: 0

    Sorry folks...the cloud sounds like an all-around bad idea to me. I will continue to run apps on my own computer, and keep my files on my own hard drive. NOTHING that you put out there on the net (or in the cloud) is safe. You all can keep your cloud!

  32. Ever heard of "SLA"? by mcrbids · · Score: 2, Interesting

    I have. My company lives (or dies) by the !@# SLA.

    Our agreements require no less than 99.9% uptime, about 8 hours of downtime per year. We've never gotten close to that - our worst year was about 2.5 hours of downtime because of a power failure at our "fully redundant" hosting facility.
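
    The arithmetic behind that, as a quick sketch (the function names are just for illustration, and a 365-day year is assumed):

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(uptime_pct):
    """Yearly downtime budget implied by an uptime percentage."""
    return (1 - uptime_pct / 100.0) * HOURS_PER_YEAR


def achieved_uptime_pct(downtime_hours):
    """Uptime percentage implied by the hours of downtime in a year."""
    return 100.0 * (1 - downtime_hours / HOURS_PER_YEAR)
```

    A 99.9% SLA works out to about 8.76 hours a year, and a year with 2.5 hours of downtime is roughly 99.97% uptime.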

    In this world, where I have up to 8 hours per year, 10 minute response would be a godsend. We've just spent *a lot* of money revamping our primary cluster so that we now operate with 100% full redundancy on everything. Redundant network feeds. Redundant logic servers. Redundant load balancers. Redundant database servers. All with automatic failover and dynamic routing with DNS (which is, itself, very failure tolerant).

    But an application has to be constructed in a very particular way in order to scale, particularly if data integrity is important (e.g., ACID-compliant SQL). This is often counter-intuitive and non-obvious, and porting an existing application to such an environment is not a quick investment. It's very typical to give up raw performance for performance scalability. We've devoted approximately 6 man-months over the past year to take full advantage of clustered, redundant computing in order to try for under 1 hour of downtime over the next year, along with near-linear scalability.

    It's not just about capacity - it's about keeping all those !@# servers organized and coordinated!

    Bottom line? Take a look at your SLA.

    In our case, if we suffered a few hours of downtime every year or so, it would be an inconvenience to our users and clients. In any event, our uptime is best-of-breed in our niche-ish industry, but I'd put our uptime as mid range for hosted products overall, when you include companies that are much bigger than our still-somewhat-small rapidly-maturing startup.

    Spend money where it counts. This requires an understanding of your economic base. If somebody slashdots your site, is that your golden opportunity, or is that an annoyance? In our case, a few hours of downtime if we got slashdotted wouldn't cause any particular long-term problem if it brought us down. If you have a few hundred customers paying $10/month for some cheap-o websites, a few hours of downtime every year or two won't cause much of a problem.

    --
    I have no problem with your religion until you decide it's reason to deprive others of the truth.
    1. Re:Ever heard of "SLA"? by julesh · · Score: 1

      I have. My company lives (or dies) by the !@# SLA.

      Our agreements require no less than 99.9% uptime, about 8 hours of downtime per year. We've never gotten close to that - our worst year was about 2.5 hours of downtime because of a power failure at our "fully redundant" hosting facility.

      Hmm. When my last hosting provider had a power failure at their hosting facility, it took them more like 2.5 days to get all their clients back up and running. Turns out some of the machines hadn't been rebooted for years, and when they tried to reboot them they wouldn't come up again.

      SLA said 99.9%. Compensation limited to one month's hosting costs. Always check the limit...

      Needless to say, I'm on a different provider now.

  33. Questionable Motives by Anonymous Coward · · Score: 0

    A number of readers have correctly noted the poor logic in George Reese's article. Additionally there is some question as to his motives.

    George Reese runs an early stage start-up called enStratus that doesn't offer auto-scaling capabilities. His competitors such as Right Scale offer auto-scaling as part of their applications.

    I can certainly understand why someone whose company is missing a major feature, relative to the competition, would argue that the missing feature is not important.

    1. Re:Questionable Motives by julesh · · Score: 1

      A number of readers have correctly noted the poor logic in George Reese's article. Additionally there is some question as to his motives.

      George Reese runs an early stage start-up called enStratus that doesn't offer auto-scaling capabilities. His competitors such as Right Scale offer auto-scaling as part of their applications.

      I can certainly understand why someone whose company is missing a major feature, relative to the competition, would argue that the missing feature is not important.

      It's also very understandable why someone who believes a feature is unimportant might not include it in their product.

    2. Re:Questionable Motives by smack.addict · · Score: 1

      enStratus has that feature.

  34. You're both wrong... by SanityInAnarchy · · Score: 1

    It's 4pm on a Saturday, and your site is getting hit hard. Rally the troops, call a meeting, decide the proper action, call Fedex to ship you more infrastructure, deploy new hardware, profit from your new customers, all the while laughing at the fools who waited 10 minutes for their cloud to auto-scale.

    RTFA. The author specifically makes the case for dynamic scaling, just not auto-scaling.

    That is, you rally the troops, call a meeting, decide the proper action, and have someone do an 'ec2-run-instances' command.

    It's 4pm on a Saturday and chances are that your site is being hit hard either because you were being an idiot or because someone is engaged in an attack on you.

    Or you got Slashdotted.

    If you plan properly, there are no sudden 4pm on Saturday spikes in traffic.

    If you plan properly, you are prepared for the typical 4pm-on-Saturday spikes, if those are typical for you.

    Which does nothing if you then get Slashdotted at 7 AM on a Sunday. Or whenever.

    As to which is better, the question you have to ask is, what is the cost of not responding to that sudden spike in traffic?

    --
    Don't thank God, thank a doctor!
  35. Heh by Johnno74 · · Score: 1

    +1 Irony to the author of TFA, if the article becomes slashdotted....

  36. Re:You don't even actually save money by using clo by Skal+Tura · · Score: 1

    Because with any degree of higher traffic (think 100k visitors a week) you get suspended. That's why running a regular website on it sucks, unless you have a very low-traffic website. Never mind that their CPUs & RAM are quite damn busy anyway -> slow page views.

    File hosting: e.g., the installation file of your application is not covered by that. While technically distribution, it is not distribution in the sense of the TOS, which, as interpreted, means sites where you have the latest game demos for download.

  37. Re:You don't even actually save money by using clo by Skal+Tura · · Score: 1

    No jiggery is needed, ie. swapping hosts etc.

    Done right, there are no problems at all. Just because something is CHEAP doesn't mean one couldn't utilize it ;)

    Everything has its own place and time. What you are saying is like saying Mini-ITX setups should be banned and never used because they are so cheap and don't offer performance.

    Set up once, then forget. You get to run at a cost of, say, $40 a month with 4 locations, versus $100-250 a month with 1 location and 1/10th the practical usable bandwidth.

    Besides, once the setup is done, if you need to keep swapping places, you've done something wrong, or are trying to achieve too high an ROI. Also, after the initial setup is done, almost any monkey can set up new locations should the need arise.

  38. Re:You don't even actually save money by using clo by Skal+Tura · · Score: 1

    When you have, say, 10 different locations set up, one of them being down for 10 minutes still leaves you 90% uptime over that period.

    However, that being said, there are mission-critical applications, and I never said this is the perfect solution for everyone.

    Also, there are other means to load balance: say you want to host a single file, i.e. your application, on each of these. On your main website you have a download page, which chooses the mirror according to availability.
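
    A minimal sketch of that download page logic, assuming a simple HTTP HEAD probe counts as "available". The names http_is_up and pick_mirror and the mirror URLs are hypothetical, and a real page would probably cache the probe results rather than check on every request.

```python
import urllib.request


def http_is_up(url, timeout=3.0):
    """Assumed health check: True if a HEAD request answers without an error."""
    try:
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, ValueError):
        # URLError/HTTPError are OSError subclasses; ValueError covers bad URLs.
        return False


def pick_mirror(mirrors, is_up=http_is_up):
    """Return the first mirror that passes the health check, or None."""
    for url in mirrors:
        if is_up(url):
            return url
    return None
```

    Ordering the list by preference (cheapest or fastest host first) means the fallback hosts only see traffic when the earlier ones are down.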

  39. Classical automated control problem by Anonymous Coward · · Score: 0

    This is a classical automated control problem. A good analysis of this topic should be done by a control expert.

    Should the problem be solved by some type of feed-forward controller based on some assumptions, as the author favours, or should the controller be a feed-back controller, which the author criticizes, or should it be a combination?

    It would be interesting to see an analysis by someone who knows the topic of automatic control well.
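
    Not that analysis, but a toy sketch of the three options so the terms are concrete. All names and numbers (rps_per_server, the 50% target utilization, the gain) are assumptions for illustration, and the feedback part is a plain proportional controller, not anything the article describes.

```python
import math


def feed_forward(forecast_rps, rps_per_server):
    """Size the fleet from a planned/forecast load, without looking at measurements."""
    return max(1, math.ceil(forecast_rps / rps_per_server))


def feedback(current_servers, measured_util, target_util=0.5, gain=0.5):
    """Proportional controller: move part of the way toward the server count
    that would bring measured utilization back to the target."""
    ideal = current_servers * measured_util / target_util
    step = gain * (ideal - current_servers)
    # Round up so the controller errs on the side of capacity.
    return max(1, math.ceil(current_servers + step))


def combined(forecast_rps, rps_per_server, current_servers, measured_util):
    """Plan from the forecast, but let measurements add capacity when the plan is wrong."""
    return max(
        feed_forward(forecast_rps, rps_per_server),
        feedback(current_servers, measured_util),
    )
```

    The gain below 1 is the usual guard against oscillation: with a 10 minute startup lag, a controller that corrects the full error every cycle tends to overshoot.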

  40. Think bigger (Was Re:Get Off My Lawn!) by An+dochasac · · Score: 1

    The biggest flaw I see in autoscaling isn't that it isn't fast enough or might cost too much (in both cases it beats the current "scramble out a new server" or "continuous overcapacity" solutions). The biggest flaw is that it doesn't go far enough. I see it as only the first baby step towards Transcontinental Demand Load Balancing.

  41. Also auto-budgeting (;-)) by davecb · · Score: 2, Interesting

    This reminds me of a large company which outsourced enthusiastically, until at one point they discovered they'd outsourced decisions about maintenance... causing the outsourcer to have control over the maintenance budget.

    As you might expect, after it ballooned, they started in-sourcing!

    Giving others control over financial decisions is almost always unwise, even if doing so is the newest, coolest idea of the week.

    --dave

    --
    davecb@spamcop.net
  42. Re:You don't even actually save money by using clo by dubl-u · · Score: 1

    So you're just talking static sites? Using cheap hosting plans as a dodgy CDN? If so, I've got no issue with that. But $60 a month pays for little sysadmin time, and not much more monkey time.

    If people are having that kind of traffic, it's worth starting to think about how to make their project sustainable. Things that are pure cost tend to disappear. Figuring out how to match revenues with costs means the project is much more likely to last.

    People should also be a little afraid of hosting companies when doing this. Hosting companies offer their low-end packages with the expectation that most people will use almost nothing. And that those who do will eventually swap up to higher packages. If you balance your traffic so you're fully using a bunch of low-end packages, you will probably cost the hosting company more than they are making on you. Even if their AUP currently allows it, they may find reasons to give you poor service or close your account entirely.

    Business relationships are only sustainable when both sides are getting good value.

  43. Is someone trying to justify their job? by AmericanBlarney · · Score: 1

    Personally I am not opposed to some degree of capacity planning, but the very example used repeatedly in this article undermines the premise. Who ever knows when they're about to get \.ed? How can the tech guys know the exact impact that the sales team's latest promotion is going to have on traffic? And 10 minutes startup time isn't all that bad if you have software that looks at the traffic trends rather than waiting for there to be an actual capacity shortage. It might not be perfect, but it's better than trying to plan (and by plan, I mean guess) exactly how much capacity you will need at any given time.
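
    To make "looks at the traffic trends" concrete, here is one simple way to do it: fit a straight line to recent load samples and extrapolate by the startup time. This is only a sketch under that assumption (real traffic is rarely linear), and forecast_load and should_start_server are made-up names.

```python
def forecast_load(samples, lead_time_s):
    """Least-squares line through (time_s, load) samples, extrapolated
    lead_time_s seconds past the most recent sample."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples need distinct timestamps")
    slope = sum((t - mean_t) * (y - mean_y) for t, y in samples) / var_t
    last_t = max(t for t, _ in samples)
    return mean_y + slope * (last_t + lead_time_s - mean_t)


def should_start_server(samples, capacity, lead_time_s=600):
    """Start a server now if the trend says we hit capacity within the startup time."""
    return forecast_load(samples, lead_time_s) >= capacity
```

    With load climbing from 100 to 190 requests/s over three minutes, the 10 minute forecast is about 490, so a site with capacity for 400 would start a server while current load is still under half of capacity.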

  44. Obvious.... by Anonymous Coward · · Score: 0

    so obvious it hurts...

  45. Scale First, Ask Questions Later by Slashdot+Parent · · Score: 1
    1. It does not take 10 minutes to launch a server in EC2, even if you don't know what the hell you're doing. My servers launch in about 90 seconds, but I've taken the time to make my own custom images that are optimized to boot quickly.

      A web head that just uses one of the stock EC2 images, and then uses the distro's package manager to install apache, etc., is going to take, at most, 5 minutes to come online. Yes, that's right, you can specify a boot-up script for your images, and the boot-up script can call 'apt-get install' to your heart's content.

    2. Unless you get permission from Amazon, you can run a maximum of 20 instances. (Permission is not hard to get, but you do have to fill out a form and tell them why you need so many instances.) The cost of running 20 of their highest-cost images? $18.00 per hour. So as long as your auto-scaling solution pings you somehow when it scales up above a threshold that you're comfortable with, you can always get yourself to a computer and override what was done automatically. $18.00 is not going to break the bank, hopefully.

    Summary: Yeah, you should do capacity planning, but auto-scaling is important to handle unexpected traffic surges. If your application is slow for 90 seconds, the world is not going to end.

    --
    They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
  46. There Are Hard Limits by Slashdot+Parent · · Score: 1

    Cloud companies will likely not make metering very easy or cheap because they *want* you to get carried away.

    I've only used Amazon EC2, but I can tell you for a fact that they make it very easy for you to know where you stand. And yes, they also have hard limits.

    With EC2, you are limited to 20 concurrent instances unless you request more. The cost of running 20 of their highest-priced servers is $18.00/hr. So as long as your auto-scaling system pings you when your resources go over your comfort threshold, you should be able to get yourself to a computer, cellphone, whatever, and override what your auto-scaler did.
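
    Spelled out as a sketch: the 20-instance cap and the $18.00/hr figure are the ones quoted above, which works out to $0.90/hr per instance at the top price. The function names and the comfort threshold are assumptions for illustration.

```python
MAX_INSTANCES = 20        # default cap quoted above
MAX_HOURLY_COST = 18.00   # 20 of the highest-priced instances, per hour
COST_PER_INSTANCE = MAX_HOURLY_COST / MAX_INSTANCES  # $0.90/hr


def hourly_cost(instances):
    """Hourly bill if every instance is the highest-priced type."""
    if not 0 <= instances <= MAX_INSTANCES:
        raise ValueError("instance count outside the account limit")
    return instances * COST_PER_INSTANCE


def worst_case_spend(hours_until_override):
    """Most the auto-scaler can spend before a human steps in."""
    return MAX_HOURLY_COST * hours_until_override


def needs_alert(instances, comfort_usd_per_hour):
    """True when the auto-scaler should ping a human."""
    return hourly_cost(instances) > comfort_usd_per_hour
```

    Even if the alert goes unanswered overnight for 8 hours, the worst case under the default cap is $144.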

    Also, with EC2 you can always log into your account and get an up-to-the-second, detailed account activity listing. There are no surprises. They even provide a detailed calculator so you can forecast what your AWS bill will be.

    EC2 is highly transparent. If you can spare $0.10, give it a try sometime. It's pretty neat.

    --
    They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
  47. You Miss The Point by Slashdot+Parent · · Score: 1

    You are saying that static compute requirements are better met by static computing platforms. Well... duh.

    The whole point of the Elastic Compute Cloud is that it is for elastic computing use cases (usage spikes, nightly/monthly/periodic heavy processing, cold spare, etc.) It's not supposed to be cheaper than a dedicated server.

    Let me tell you one way that I use EC2, and you tell me if you can give me what I want for cheaper. I own several apartment buildings, and I run my business website on a lousy, inexpensive, totally-inappropriate-for-business-use webhost (dreamhost). My website at dreamhost goes down every few months for minutes, hours sometimes, but I don't mind.

    I have a process running that tries my business website every 15 seconds, and if I get 4 straight failures, the website is automatically failed-over to EC2, and DNS is automatically remapped. The entire process from first detected failure to my website's return to operation takes about 2.5 minutes, but obviously any client who has the wrong IP in his or her cache will take a while to access the site on EC2. This is a level of downtime that I am more than willing to live with.
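
    The detection half of that is tiny. A sketch, with the caveat that FailoverMonitor is an illustrative name and the actual instance launch and DNS remapping are left to the caller, since those details aren't described here:

```python
class FailoverMonitor:
    """Counts consecutive failed probes and signals failover once."""

    def __init__(self, failures_needed=4):
        self.failures_needed = failures_needed
        self.consecutive_failures = 0
        self.failed_over = False

    def record(self, check_ok):
        """Feed one probe result; returns True exactly when failover should fire."""
        if check_ok:
            # Any success resets the streak, so a single blip never triggers failover.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failures_needed and not self.failed_over:
            self.failed_over = True
            return True
        return False
```

    Call record() once per 15-second probe. Four straight failures span 45 seconds from the first to the fourth, so most of the 2.5 minutes is presumably spent bringing up the instance and switching DNS.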

    The cost of this service? $0.10/hr, but only when I am using it. Can you provide me this service for less? My EC2 bill (for that usage, anyhow) runs about $1.50/year. Can you provide that service for $1.50/year or less?

    --
    They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
    1. Re:You Miss The Point by mattbee · · Score: 1

      We don't do any hosting for $1.50/yr :-)

      However, I'm not sure the use case you talk about is in any way a typical hosting task, probably because it gives you less overall uptime rather than more. If you have any clients on big hosts that assume 1-day or 1-week DNS, and they pick up the Amazon IP which is valid for maybe a couple of hours, your site will be down for far longer than the actual Dreamhost outage (though if Dreamhost are down for days at a time it might be a win).

      --
      Matthew @ Bytemark Hosting
    2. Re:You Miss The Point by Slashdot+Parent · · Score: 1

      If you have any clients on big hosts that assume 1-day or 1-week DNS,

      I've used two behemoth ISPs, and neither of them cache DNS entries for an entire day. It's more like an hour.

      Again, this website is not mission-critical. No applicant is going to care if my website is down for a few minutes or even an hour. Most of them aren't that great with computers, anyway (or don't even own one). I know this, because I do rent-to-own computers, furniture, appliances, etc. for them as a side business.

      My only point in responding to you was that there is plenty of room in the hosting biz for both full-time and on-demand services. Amazon is not going to eat your lunch, but you'll never eat theirs, either. ;)

      --
      They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock
  48. Re:You don't even actually save money by using clo by bhanson · · Score: 1

    This post is completely retarded on so many levels, let me explain.

    Yes, Amazon's prices are slightly more expensive on a point-to-point basis, but they're completely different models and cannot be compared directly. Amazon costs more because the technology is far more advanced than something traditionally used. This requires more engineering to design and run, therefore increasing the cost. The savings come from services that have irregular usage patterns. You're only paying for what you use, whereas with a more traditional hosting model, you're paying for your maximum commit, or rather the amount of resources you need for your peak traffic. The rest of the time you're paying for resources you're not using.

    If you know exactly how much resources your service will consume then sure, something not-cloud will be cheaper. But if your service is fairly dynamic by nature then suddenly it gets a whole lot more competitive. If your service levels vary widely, then Amazon quickly becomes the most fiscally responsible choice.
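
    A toy comparison to show where the crossover comes from. The prices and the demand profile are invented for illustration and are not Amazon's or anyone else's actual rates.

```python
def dedicated_cost(hourly_demand, price_per_server_hour):
    """Traditional model: pay for peak capacity during every hour."""
    return max(hourly_demand) * price_per_server_hour * len(hourly_demand)


def on_demand_cost(hourly_demand, price_per_server_hour):
    """Pay-per-use model: pay only for the servers running each hour."""
    return sum(hourly_demand) * price_per_server_hour


# Hypothetical day: 2 servers for 20 hours, a 4-hour spike needing 10.
spiky_day = [2] * 20 + [10] * 4
flat_day = [10] * 24
```

    With the dedicated server-hour at $0.05 and the on-demand one at twice that, the spiky day costs $12 dedicated versus $8 on demand, while the flat day costs $12 dedicated versus $24 on demand.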

    For file hosting, the suggestion of using shared hosts has to be the absolute worst solution ever proposed. In fact, this solution is so bad I question your "7 years of solid hosting experience" and also your common sense as a human being.

    Shared hosts all have a few things in common:

    1. Disallow file hosting. Practically every shared host explicitly forbids file hosting on their shared plans. Doing so will get your account suspended.
    2. They're unreliable, you share a server with hundreds of other customers and just 1 could destroy the speed and reliability of the connections to your content.
    3. Their HTTPd services are configured to run websites, not serve files. The memory footprint for serving many simultaneous files is drastically larger than it should be on one of these hosts.

    You seem to have no concept of the difference between the "bandwidth" you buy from a shared host and the "bandwidth" you buy from a real solution.

    "OmG, Dreamhost offers unlimited storage and transfer, why doesn't Sourceforge move their entire website to them for like $6 bucks a month?!?!"

    Is not too far of a stretch of what you sound like to anyone with half a clue.

    To address your specific example, here is a quote from Dreamhost ToS:

    What's not allowed in "Unlimited"? Basically, sites whose essential purpose is to use disk or bandwidth.

    And here's a quote from Bluehost:

    Please note, however, that the BlueHost.Com service is designed to host websites . BlueHost.Com does NOT provide unlimited space for online storage, backups, or archiving of electronic files, documents, log files, etc., and any such prohibited use of the Services will result in the termination of Subscriber's account, with or without notice.

    Yup, they both disallow file hosting. Never read a ToS before? Try running one of these nasties by your lawyer before signing up next time, you'll be surprised at what they say.

    Any company with the sole purpose of having an Internet presence should be using a CDN to distribute their content, period. There's no excuse for even the smallest of startups now with the drastically falling prices.

    It's not even all about the price. Using a proper file serving service instead of "cheap shared accounts" will increase the speed and reliability of the connections. Try pumping out tons of throughput from a shared package, and then from a CDN... it's not even a competition.

    Cliffnotes: Poster is drastically misinformed.

    (notes: my apologies if this post sounds condescending or offensive, it's not meant to do anything but expose the truth and to stop the spreading of bad information)