Slashdot Mirror


AWS Load Balancer Sends 2 Million Netflix API Reqs To Wrong Customer

rsk writes "Amazon Web Services' Elastic Load Balancer is a dynamic load balancer managed by Amazon. Load balancers are regularly swapped around with each other, which can lead to surprising results, like getting millions of requests meant for a different AWS customer. Using ELBs can result in AWS unintentionally introducing a man-in-the-middle (attack) into your application environment. Most AWS users do not realize this can happen and have not secured against it."
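
One way a receiving service might guard against traffic meant for someone else is to reject any request whose Host header does not name one of its own hostnames. The sketch below assumes requests arrive as a plain dict of headers; the function name `is_expected_host` and the hostname `api.example.com` are hypothetical, and IPv6 literals are not handled. It only protects the side receiving the stray requests; clients still need to verify TLS certificates to avoid talking to the wrong server.

```python
def is_expected_host(headers, allowed_hosts):
    """Return True only if the Host header names one of our own hostnames."""
    host = headers.get("Host", "")
    # Drop an optional ":port" suffix and normalise case before comparing.
    hostname = host.split(":", 1)[0].strip().lower()
    return hostname in allowed_hosts


# Hypothetical allowlist for illustration only.
ALLOWED_HOSTS = {"api.example.com"}
```

A request handler would call this first and answer with an error (for example HTTP 421 or 400) when it returns False, rather than processing a stranger's request.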

9 of 58 comments

  1. TTL value by SharkLaser · · Score: 2

    It looks more like some clients aren't respecting the DNS TTL value, so technically it's not Amazon's fault. You should stick to standards: if the TTL says 60 seconds, then it's 60 seconds.
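
To make "respecting the TTL" concrete, here is a minimal sketch of a cache that forgets a record once its TTL has elapsed. The class name `TtlCache` and the injectable `clock` parameter are assumptions made for illustration; real resolvers are considerably more involved.

```python
import time


class TtlCache:
    """Tiny name -> address cache that honours each record's TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the expiry logic can be tested
        self._entries = {}       # name -> (address, expiry timestamp)

    def put(self, name, address, ttl_seconds):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            # TTL elapsed: drop the record so the caller has to look it up again.
            del self._entries[name]
            return None
        return address
```

A client that keeps using an address after `get` would have returned `None` is the kind of client this comment is complaining about.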

    1. Re:TTL value by Florian+Weimer · · Score: 2

      Browsers are sometimes forced to disregard TTL values to prevent certain types of attacks that involve quickly changing DNS records.

    2. Re:TTL value by girlintraining · · Score: 2

      It looks more like some clients aren't respecting the DNS TTL value, so technically it's not Amazon's fault.

      "Technically", no. But two people pointing a finger at each other and saying "He did it!" doesn't solve anything, and all the customer gets is the finger.

      --
      #fuckbeta #iamslashdot #dicemustdie
    3. Re:TTL value by JWSmythe · · Score: 4, Interesting

          From what I've seen, it's frequently the client's DNS servers, not the client itself.

          I've used a short TTL (5m) for quite a while. It's intentional, because I've needed to switch things rather quickly in the past, and it's better for it to "just work", rather than waiting hours for everyone to pick up the change.

          I used to work for a place that had a huge traffic load. Our slow days were still millions of unique visitors. When we took a machine out of DNS (DNS round robin between 15+ machines), we'd see the traffic drop significantly in the first 5 minutes. When AOL finally saw our change, it would drop more. There would still be lingering people for about an hour, and then it would finally be idle.

          That was a pretty regular thing for us to do. We scaled our traffic to our various datacenters this way. We'd also load test lines and individual servers with it. If it looked like we were running into a bandwidth limitation, I'd throw a few hundred Mb/s down the line, and see how it performed. If it really was, we'd then switch everything away from it to other datacenters until the provider fixed it.

          In all those circumstances, in 5 minutes most (but not all) of the traffic moved. An hour from the change, the remainder had moved.

          I've seen this with my home provider. I let them handle DNS for my home machine, rather than doing it myself. When I make changes, their servers still haven't picked them up after 30 minutes. Within about an hour, the new DNS records show properly.

          Google's public DNS servers seem to do pretty well in that respect. Our changes are reflected properly there in just a few minutes. AOL, TimeWarner/RoadRunner, and a few others are pretty bad. I know why they do it (reducing load on their DNS servers), but it becomes a pain in the ass for places that need to make changes quickly.
         

      --
      Serious? Seriousness is well above my pay grade.
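
The DNS round robin setup described in the comment above can be sketched roughly as follows. The class name `RoundRobinPool` and the 192.0.2.x addresses are hypothetical; this only models the authoritative side handing out records in rotating order.

```python
from collections import deque


class RoundRobinPool:
    """Hands out the same address records in a rotating order."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        """Return the current record order, then rotate for the next query."""
        records = list(self._addresses)
        self._addresses.rotate(-1)
        return records

    def remove(self, address):
        """Take a machine out of DNS; only future answers are affected."""
        self._addresses.remove(address)
```

Removing a machine changes only the answers given from then on. Resolvers that already hold an older answer keep sending traffic to it until they drop it, which fits the pattern of most traffic moving within the TTL and a remainder lingering for about an hour.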
    4. Re:TTL value by hedwards · · Score: 2

      If the customer's getting the finger, wouldn't that make it more of an Erotic Load Balancer?

    5. Re:TTL value by Kattare · · Score: 2

      The problem with any of these scenarios is that, according to the AWS forum post, he's been getting rogue Netflix traffic for 4 days. No DNS server or mainstream client is going to keep a 60-second TTL record for 4 days. It's either an issue at AWS completely unrelated to DNS, or an issue in Netflix clients. With it being in TVs, Blu-ray players, Xboxes, iOS, Wiis, etc., who knows what client the issue could be in. I wonder if the forum poster could capture the browser string (the User-Agent header) and help debug?
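
Capturing the browser string could look something like the sketch below, which counts User-Agent values on requests whose Host header is not one of ours. It assumes each request is available as a dict of headers; the function name `tally_user_agents` and any hostnames passed in are hypothetical.

```python
from collections import Counter


def tally_user_agents(requests, our_hosts):
    """Count User-Agent strings on requests addressed to someone else's host."""
    counts = Counter()
    for headers in requests:
        # Strip an optional ":port" suffix before comparing hostnames.
        host = headers.get("Host", "").split(":", 1)[0].lower()
        if host not in our_hosts:
            counts[headers.get("User-Agent", "<missing>")] += 1
    return counts
```

Calling `most_common()` on the result would show which kind of client sends most of the stray traffic, which is the clue being asked for here.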

  2. DNS caches for 4 days. by Kattare · · Score: 2

    No DNS server (or mainstream browser) caches something for 4 days when given a low TTL. I've seen some that cache for a few hours, maybe up to a day, but 4 days? Really? Something else is going on. I kind of wonder about the Netflix clients built into all those TVs, mobile phones, and DVD players.

  3. Re:Why no proxy? by cript2000 · · Score: 2

    F5 supports that functionality. EC2 is not built on any commercial LB vendor.

  4. Security is NOT an issue with The Cloud. by Anonymous Coward · · Score: 2, Funny

    Wait a minute. I'm a manager, and I've been reading a lot of case studies and watching a lot of webcasts about The Cloud. Based on all of this glorious marketing literature, I, as a manager, have absolutely no reason to doubt the safety of any data put in The Cloud.

    The case studies all use words like "secure", "MD5", "RSS feeds" and "encryption" to describe the security of The Cloud. I don't know about you, but that sounds damn secure to me! Some Clouds even use SSL and HTTP. That's rock solid in my book.

    And don't forget that you have to use Web Services to access The Cloud. Nothing is more secure than SOA and Web Services, with the exception of perhaps SaaS. But I think that Cloud Services 2.0 will combine the tiers into an MVC-compliant stack that uses SaaS to increase the security and partitioning of the data.

    My main concern isn't with the security of The Cloud, but rather with getting my Indian team to learn all about it so we can deploy some first-generation The Cloud applications and Web Services to provide the ultimate platform upon which we can layer our business intelligence and reporting, because there are still a few verticals that we need to leverage before we can move to The Cloud 2.0.