Slashdot Mirror


Amazon Forced To Reboot EC2 To Patch Bug In Xen

Bismillah writes AWS is currently emailing EC2 customers that it will need to reboot their instances for maintenance over the next few days. The email doesn't explain why the reboots are being done, but they are most likely needed to patch the embargoed XSA-108 bug in Xen. ZDNet takes this as a spur to remind everyone that the cloud is not magical. Also at The Register.

5 of 94 comments

  1. Compared to Azure by Anonymous Coward · · Score: 3, Informative

    It's funny for me to read that Amazon is notifying its users of an impending reboot.

    I've been suffering with Azure for over a year now, and the only thing that's constant is rebooting....

    My personal favorite Azure feature is that SQL Azure randomly drops database connections by design.

    Let that sink in for a while. You are actually required to program your application to expect failed database calls.

    I've never seen such a horrible platform, or a less reliable database server...

    1. Re:Compared to Azure by CodeReign · · Score: 5, Insightful

      You are actually required to program your application to expect failed database calls.

      Yes, of course you are. Only an idiot would expect 100% of db calls to be successful.

    2. Re:Compared to Azure by Aaden42 · · Score: 4, Insightful

      Be sure to thank Microsoft for teaching you the value of robust error checking. Assume any other host you need to talk to was nuked from orbit five seconds ago. Write your code to bounce back from that to the degree possible.

      At the very least, DB *connections* should be assumed to have evaporated since the last time you accessed them. Use some sort of pooling library that can deal with that transparently if you like, or just catch & retry if necessary.

      Seriously though, sounds like the environments you’ve worked in have been simple enough with low enough transaction volume that you got lucky & everything just worked. DB & app server on the same box maybe? Dealing with temporarily unavailable external hosts is just part of writing multi-tier code.
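The "catch & retry" advice in the comment above can be sketched roughly as follows. This is a minimal illustration, not any particular driver's API: `TransientConnectionError`, `call_with_retry`, and `make_flaky_query` are hypothetical names invented for the example, and a real application would catch whatever exception its database library raises for a dropped connection.

```python
import time


class TransientConnectionError(Exception):
    """Hypothetical stand-in for a driver's dropped-connection error."""


def call_with_retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation(); on a transient failure, back off and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientConnectionError:
            if attempt == attempts:
                raise  # out of retries: let the caller see the failure
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))


def make_flaky_query(failures):
    """Build a fake query that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def query():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientConnectionError("connection dropped")
        return "row"

    return query
```

Calling `call_with_retry(make_flaky_query(2))` survives two simulated drops and returns on the third attempt. A pooling library that validates connections before handing them out achieves much the same thing with less application code.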

    3. Re:Compared to Azure by Shados · · Score: 4, Insightful

      If you're in a transaction and it fails, you can just redo it. That's the whole damn point.
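A rough sketch of "just redo it", using Python's built-in `sqlite3` module as a stand-in for a remote database. The `run_transaction` and `transfer` helpers, the `accounts` table, and the simulated failure are all assumptions made for illustration. A real dropped connection would also require reconnecting, and the work must be safe to run again from the start.

```python
import sqlite3


def run_transaction(conn, work, attempts=3):
    """Run work(conn) atomically; roll back and redo it on OperationalError."""
    for attempt in range(1, attempts + 1):
        try:
            with conn:  # commits on success, rolls back on an exception
                return work(conn)
        except sqlite3.OperationalError:
            if attempt == attempts:
                raise  # give up after the last attempt


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('a', 100), ('b', 0)")
conn.commit()

calls = {"n": 0}


def transfer(conn):
    """Move 10 from 'a' to 'b'; the first call fails halfway (simulated)."""
    calls["n"] += 1
    conn.execute("UPDATE accounts SET balance = balance - 10 WHERE name = 'a'")
    if calls["n"] == 1:
        raise sqlite3.OperationalError("simulated dropped connection")
    conn.execute("UPDATE accounts SET balance = balance + 10 WHERE name = 'b'")


run_transaction(conn, transfer)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

Because the half-finished first attempt is rolled back, the debit from `'a'` is not applied twice when the transfer is redone.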

    4. Re:Compared to Azure by Just Some Guy · · Score: 3, Informative

      When hosting your app in the cloud, regardless of provider, it is considered best practice to design for failure.

      Netflix goes so far as to randomly kill services throughout the day. Their idea is that it's better to find systems that aren't auto-healing correctly by testing recovery during routine operations than to be surprised by it at 3AM. It's successful to the point that you generally don't know that the streaming server you were connected to has been killed and a peer took over for it. That is how you make reliable cloud services.

      --
      Dewey, what part of this looks like authorities should be involved?
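The kill-and-recover idea described in the last comment can be illustrated with a toy model. This is a sketch only and is not Netflix's actual tooling: `Replica`, `kill_random`, and `serve_with_failover` are hypothetical names, and the "servers" are plain in-memory objects.

```python
import random


class Replica:
    """Toy stand-in for one instance of a service."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def serve(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


def kill_random(replicas, rng):
    """Deliberately take down one live replica, chosen at random."""
    victim = rng.choice([r for r in replicas if r.alive])
    victim.alive = False
    return victim


def serve_with_failover(replicas, request):
    """Try each replica in turn; a dead one is skipped, not fatal."""
    for replica in replicas:
        try:
            return replica.serve(request)
        except ConnectionError:
            continue  # a peer takes over
    raise ConnectionError("no replica available")
```

With three replicas, killing any two at random still leaves `serve_with_failover` able to answer. Running the kill step during routine operation is what exposes a client that lacks the failover loop.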