Slashdot Mirror


Inside Amazon's Cloud Computing Infrastructure

1sockchuck writes: As Sunday's outage demonstrates, the Amazon Web Services cloud is critical to many of its more than 1 million customers. Data Center Frontier looks at Amazon's cloud infrastructure, and how it builds its data centers. The company's global network includes at least 30 data centers, each typically housing 50,000 to 80,000 servers. "We really like to keep the size to less than 100,000 servers per data center," said Amazon CTO Werner Vogels. Like Google and Facebook, Amazon also builds its own custom server, storage and networking hardware, working with Intel to produce processors that can run at higher clockrates than off-the-shelf gear.

14 of 76 comments

  1. What Does This Mean by Frosty Piss · · Score: 4, Interesting

    working with Intel to produce processors that can run at higher clockrates than off-the-shelf gear.

    What does this mean? They have custom chips? Custom mods at the chip fab level? Or are they taking advantage of designed-in features that are locked out for normal chip users? Are they simply over-clocking? Or are there features that can be unlocked with money?

    --
    If you want news from today, you have to come back tomorrow.
    1. Re:What Does This Mean by Anonymous Coward · · Score: 3, Interesting

      They must get chips that have been tested for overclocking.

    2. Re:What Does This Mean by Spy Handler · · Score: 4, Insightful

      Probably means they buy in bulk, so they get to pick the more overclock-able chips.

      Say the Core i7 xxxx runs at 3.0 GHz and the i7 yyyy runs at 3.4 GHz. They make a batch of i7s and test them at 3.4 GHz. Some barely pass QC and are sold as retail i7 yyyy. Some fail at 3.4 GHz, so they're marked as i7 xxxx at 3.0 GHz. Some pass at 3.4 GHz with flying colors; these are the ones overclockers want the most. Retail buyers like us don't get to pick which ones we get when we buy the i7 yyyy, but Amazon might.
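The binning described above can be sketched in a few lines. This is only an illustration of the comment's three-way split, not Intel's actual test flow: the `Die` type, the `bin_die` function, the 0.3 GHz "flying colors" margin, and the sample batch are all made-up assumptions, and the SKU names are the placeholders from the comment.

```python
from dataclasses import dataclass


@dataclass
class Die:
    serial: str
    max_stable_ghz: float  # highest clock at which this die passed testing (hypothetical)


def bin_die(die, high_ghz=3.4, low_ghz=3.0, margin_ghz=0.3):
    """Label a die using the comment's scheme; all thresholds are illustrative."""
    if die.max_stable_ghz >= high_ghz + margin_ghz:
        return "i7 yyyy (passes with margin)"  # the parts a bulk buyer might ask for
    if die.max_stable_ghz >= high_ghz:
        return "i7 yyyy (barely passes)"
    if die.max_stable_ghz >= low_ghz:
        return "i7 xxxx"  # failed at 3.4 GHz, sold as the 3.0 GHz part
    return "reject"


# Made-up batch, purely for demonstration.
batch = [Die("A", 3.9), Die("B", 3.45), Die("C", 3.1), Die("D", 2.7)]
labels = {die.serial: bin_die(die) for die in batch}

for serial, label in labels.items():
    print(serial, label)
```

A retail buyer of the i7 yyyy could receive either die A or die B; the speculation in the comment is that a large customer might get to ask for dies like A only.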

    3. Re:What Does This Mean by bobbied · · Score: 4, Insightful

      They are building custom hardware and a lot of it so they get a bit of special treatment from Intel.

      You engineer the thermal paths and better control how you get rid of heat. You tweak the board layout for the best performance of the chipset and CPU, and run closer tolerances on voltages and clock frequencies while keeping it small. Buying in bulk also lets you customize the chipset and CPU packaging to get better performance/watt and higher density by eliminating all the "fluff" you really don't want on a cloud machine. In a high-density server farm, who needs all those USB controllers, PCIe buses, and sound cards you find in your average server chassis, which just take up space and suck power? Just give me a couple of NICs, a SATA connection, a serial console, and a way to reset an individual system, and I have what I need to stand up an OS and grant somebody external access to it.

      --
      "File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
    4. Re:What Does This Mean by lgw · · Score: 2

      But of course these are all Xeon processors. Those normally have a lower clock rate the more cores the chip has, to limit heat density. The 10-core processors run a bit more than half the speed of the 2-core (IIRC, but I could be way off). You don't need to overclock these the way you do enthusiast parts when they're underclocked to begin with. You do need prodigious cooling.

      --
      Socialism: a lie told by totalitarians and believed by fools.
  2. I has more better questions by s.petry · · Score: 2, Informative

    “Every day, Amazon enough new server capacity to support all of Amazon’s global infrastructure when it was a $7 billion annual revenue enterprise,” said James Hamilton, Distinguished Engineer at Amazon, who described the AWS infrastructure at the Re:Invent conference last fall. “There’s a lot of scale. That volume allows us to reinvest deeply into the platform and keep innovating.”

    Did they use AWS for translation on this paragraph? How do you have "a lot of scale"? One can scale up or down, but is this like a computer hokey pokey? Scale is a verb!

    Really, I skimmed this one pretty lightly. It looks like a marketing article, not a technical article. Buzzwords aplenty, so I'm guessing your question is answered by "marketing".

    --

    -The wise argue that there are few absolutes, the fool argues that there are no probabilities.

    1. Re:I has more better questions by ShanghaiBill · · Score: 4, Funny

      Did they use AWS for translation on this paragraph? How do you have "a lot of scale"? One can scale up or down, but is this like a computer hokey pokey? Scale is a verb!

      Any verb can be nouned.

    2. Re:I has more better questions by greenreaper · · Score: 2

      I learnt that just yesterday! It's called nominalization.

    3. Re:I has more better questions by lgw · · Score: 5, Funny

      Scale is a verb!

      As I weigh this fish scale on my scale, before cleaning the scale off my kettle, while listening to my neighbor play scales, I wonder about the scale of your intoxication: on a scale of one to potato, how high are you right now? Oh well, I'm off to work: I was hoping for better, but it pays scale.

      --
      Socialism: a lie told by totalitarians and believed by fools.
  3. Re:AWS' problem is not the infrastructure... by turbidostato · · Score: 3, Insightful

    "It's the fact that they only focus on infrastructure"

    You are not looking carefully enough.

    "In that sense, Microsoft is far, far ahead of the others"

    You know what happens to the ones too far, far ahead of others? In the future, people raise statues honoring them, but they usually die poor and/or too young.

    It's quite funny you talk about Microsoft since, back in the day, it was Novell that was far, far ahead of Microsoft on PC-based client/server deployments. And know what? Microsoft not only didn't give a damn but they mocked Novell as too complex. And they were right: most people weren't ready for Novell forests and inherited/nested permissions, and Windows for Workgroups was everything they could cope with. Then they grew up to "classic" domains, still a tad simpler than Novell while still being "good enough" for their customer base (in fact, being not only "good enough" but "top notch", since for most of them it was all they knew; in practical terms it was Microsoft itself "educating" them).

    Eventually, Novell died and, who could have thought it!? the very next day Microsoft came up with their new and shiny Active Directory domains, which were basically what Novell had been doing for ten years: now, somehow, that wasn't "too complex" anymore but the one true way.

    I'd say Amazon is on exactly the same track today: on one hand, most people, as you say, are not ready yet for higher abstraction levels like PaaS; IaaS is good enough and growing strongly. On the other hand, the PaaS market is far from mature enough: code written against any public API today is guaranteed to be rewritten even before the provider gets to declare it non-beta.

    And there's even more: it's said that in the gold rush, the only ones consistently making money were the shovel shops, not the miners. Nowadays, the "hardware store" is Amazon, and it is the people building on top of AWS who take the real risks of doing business. And Amazon is not just watching time go by: a few years back they offered pretty simple virtual machines; now they offer quite a complex landscape with databases, routing, DNS, load balancing, tiered persistent storage... They are the Microsoft of today, mocking the ones too far, far ahead while, at the same time, cultivating their own customer base to make them ready for their future products and services.

  4. What AWS outage demonstrates .. by nickweller · · Score: 2

    "As Sunday's outage demonstrates, the Amazon Web Services cloud is critical to many of its more than 1 million customers"

    I thought the outage demonstrated the relative unreliability of Amazon cloud services. What are the legally binding terms of service that AWS provides in relation to uptime?

    1. Re:What AWS outage demonstrates .. by Anonymous Coward · · Score: 2, Insightful

      I thought the outage demonstrated the relative unreliability of Amazon cloud Services.

      Incorrect. What was demonstrated was the inability of AWS customers to design fault-tolerant systems. Any system that cannot tolerate any downtime should be multi-region.
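To put rough numbers on the multi-region argument, here is a back-of-the-envelope sketch. It assumes regions fail independently and that failover between them is perfect, which real deployments only approximate; the 99.9% per-region figure is a made-up example, not a published AWS number, and both function names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(per_region, regions):
    """Chance that at least one region is up, assuming independent failures."""
    return 1.0 - (1.0 - per_region) ** regions


def downtime_hours(availability):
    """Expected hours of downtime per year at a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


single = 0.999  # hypothetical availability of one region
dual = combined_availability(single, 2)

print(f"one region:  {downtime_hours(single):.2f} h/year down")
print(f"two regions: {downtime_hours(dual) * 60:.2f} min/year down")
```

Under those assumptions the second region turns hours of expected yearly downtime into well under a minute, which is the point the comment is making. Correlated failures (a bad deploy, a shared dependency) would make the real figure worse.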

    2. Re:What AWS outage demonstrates .. by lgw · · Score: 4, Insightful

      Well, it's their second major outage in the ~10 years of AWS. Far better than any in-house IT department I've ever seen.

      --
      Socialism: a lie told by totalitarians and believed by fools.
  5. Re:When will AWS get IPv6 ability? by Coren22 · · Score: 2

    I like your comment, it is quite funny, but to address the question:

    The packets are larger (more bits), so they take longer to transmit and need more memory to store. Also, ASICs built for IPv4 don't work for IPv6, so much of the IPv6 traffic is handled in the CPU rather than in ASICs, which is less power-efficient.

    I doubt the power difference is terribly high, but at an Amazon level, it would likely be noticeable.
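The packet-size part of the argument is easy to quantify. The sketch below compares only IP headers: 20 bytes is the minimum IPv4 header (no options) and 40 bytes is the fixed IPv6 header (no extension headers). Link-layer framing and transport headers are ignored, the function name is made up, and the payload sizes are arbitrary examples.

```python
IPV4_MIN_HEADER = 20    # bytes, IPv4 header without options
IPV6_FIXED_HEADER = 40  # bytes, IPv6 fixed header without extension headers


def ipv6_size_increase(payload_bytes):
    """Relative growth of the IP packet when the same payload moves from IPv4 to IPv6."""
    v4_packet = payload_bytes + IPV4_MIN_HEADER
    v6_packet = payload_bytes + IPV6_FIXED_HEADER
    return (v6_packet - v4_packet) / v4_packet


# Arbitrary example payload sizes, in bytes.
for payload in (20, 512, 1460):
    print(f"{payload:>5} B payload: +{ipv6_size_increase(payload):.1%}")
```

For large packets the extra 20 bytes is a low single-digit percentage, so the transmit-time penalty is small unless traffic is dominated by tiny packets. This says nothing about the ASIC-versus-CPU point, which would need real hardware data.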

    --
    APK likes to ask for responses to the same things over and over. Maybe he just likes the responses?