Slashdot Mirror


Inside Facebook's Infrastructure

miller60 writes "Facebook served up 690 billion page views to its 540 million users in August, according to data from Google's DoubleClick. How does it manage that massive amount of traffic? Data Center Knowledge has put together a guide to the infrastructure powering Facebook, with details on the size and location of its data centers, its use of open source software, and its dispute with Greenpeace over energy sourcing for its newest server farm. There are also links to technical presentations by Facebook staff, including a 2009 technical presentation on memcached by CEO Mark Zuckerberg."

23 of 77 comments

  1. Environmentalist by AnonymousClown · · Score: 3, Interesting
    I support environmental causes (Sierra Club and others), but I for one will not support Greenpeace and I don't think they are credible. They use violence to get their message out and their founder is now a corporate consultant that shows them how to get around environmental laws and pollute.

    That's all.

    --
    RIP America

    July 4, 1776 - September 11, 2001

    1. Re:Environmentalist by bsDaemon · · Score: 2, Funny

      Yeah, but if they're against Facebook, they can't be all bad. Sort of like the Mafia vs Castro, right?

  2. Facebook ID by Thanshin · · Score: 3, Funny

    It's time to invent the Facebook Identity card.

    You can't remember your passport number? No worries, your Facebook Identity card will say who you are. And how many friends you've got. And the name of your pet. And whether you went to the bathroom at your usual time that morning. And what kind of men you find attractive.

    Semper Facebook Identity!

    1. Re:Facebook ID by rtaylor · · Score: 3, Interesting

      Facebook knows anything about you that 3rd parties (friends, family, etc.) might tell them, too.

      I didn't create an account or provide any information to facebook; yet there are bits and pieces of information on it about me.

      --
      Rod Taylor
    2. Re:Facebook ID by tophermeyer · · Score: 2, Interesting

      One of the reasons keeping me from deleting my facebook account is that having it active allows me to untag myself from all the pictures that I wish my friends would stop making public. If I didn't have an account they could link to, my name would just sit on the picture for anyone to see.

  3. Slashdotted by devjoe · · Score: 2, Informative

    Maybe Data Center Knowledge should put some of that knowledge to work, as the article is slashdotted after only 5 comments.

  4. Freaking SEOs... by netsharc · · Score: 2, Insightful

    Facebook is... Facebook has... fucking SEO monkeys must be at work making sure the company isn't referred to as "it", because that ruins the google-ability of the article, and they'd rather have SEO ratings than text that reads like it's been written by a fucking 3rd grader.

    SEO-experts... even worse than lawyers.

    --
    What time is it/will be over there? Check with my iPhone app!
  5. Mark Zuckerberg's presentation link is wrong by francium+de+neobie · · Score: 2, Interesting

    It links to Facebook's "wrong browser" page. The real link may be here: http://www.facebook.com/video/video.php?v=631826881803

  6. I'm sticking with antisocial networking by Average_Joe_Sixpack · · Score: 2, Funny

    USENET and /. (RIP Digg)

  7. Cache by minus9 · · Score: 2, Informative
  8. Yawn.. move along by uncledrax · · Score: 2, Informative

    The article isn't worth reading IMO, not unless you're curious as to how much electricity some of the FB datacenters use. Otherwise it's light on the tech details.

    --
    ----- The internet has given everyone the ability to have their voice heard equally as loud.. even if they shouldn't be
    1. Re:Yawn.. move along by drsmithy · · Score: 2, Informative

      The article isn't worth reading IMO, not unless you're curious as to how much electricity some of the FB datacenters use. Otherwise it's light on the tech details.

      Indeed. "All you wanted to know about FaceBook's infrastructure" and little more than a passing mention about their storage ? That's vastly more interesting information than where their datacenters might physically be.

  9. Call me dense, but... by mlts · · Score: 4, Interesting

    Call me dense, but with all the racks of 1U x86 equipment FB uses, wouldn't they be far better served by machines built from the ground up to handle the TPM and I/O needs?

    Instead of trying to get so many x86 machines working, why not go with upper end Oracle or IBM hardware like a pSeries 795 or even zSeries hardware? FB's needs are exactly what mainframes are built to accomplish (random database access, high I/O levels) and do the task 24/7/365 with five 9s uptime.

    To boot, the latest EMC, Oracle and IBM product lines are good at energy saving. The EMC SANs will automatically move data and spin down drives not in use to save power. The CPUs on the top-of-the-line equipment not only power down parts that are not in use, but wise use of LPARs or LDoms would also help with energy costs simply by requiring fewer machines.

    1. Re:Call me dense, but... by njko · · Score: 5, Insightful

      The purpose of server farms with commodity hardware is just to avoid vendor lock-in. If you have a good business but you are tied to a vendor, the vendor has a better business than you: they can charge you whatever they want.

      --
      \n.\n
    2. Re:Call me dense, but... by mlts · · Score: 3, Insightful

      That is a good point, but to use a car analogy, isn't it like strapping a ton of motorcycles together with duct tape and having people on staff to keep them all maintained so the contrivance can pull an 18-wheeler load? Why not just buy an 18-wheeler which is designed and built from the ground up for this exact task?

      Yes, you have to use the 18-wheeler's shipping crates (to continue the analogy), but even with the vendor lock-in, it might be a lot better to do this as opposed to trying to cobble a suboptimal solution that does work, but takes a lot more man-hours, electricity, and hardware maintaining as opposed to something built from the factory for the task at hand.

      Plus, zSeries machines and pSeries boxes happily run Linux LPARs. That is as open as you can get. It isn't like it would be moving the backend to CICS.

    3. Re:Call me dense, but... by Cylix · · Score: 2, Interesting

      The latest x86 architecture lines are moving far more in the direction of mainframe type units in terms of density and bandwidth. This is a hardware type from several years back and would not really compare to the denser offerings being explored today. However, the reasoning behind commodity hardware is not just the ability to switch from one platform to another, but rather that it keeps costs down through vendor competition. One design can be produced by multiple vendors, each competing to win with the lowest bid. There are several other advantages as well with a commodity or generic based design.

      With commodity hardware that is not designed for five nines there is an expectation the application can fail away. The need for the application to fail away gracefully is actually more fundamental than at the server level. When considering application resiliency you want to target the datacenter level so that you are not locked to a specific region. To build something as large as facebook they are no longer load balancing at the router, but at the datacenter level itself. With this concept the datacenter becomes a bucket entity with the ability to service X traffic, and if it should fail you simply move services away. With a sufficiently advanced version of this very generic and very hardware abstracted model it is now possible to distribute load to third party farms via cloud infrastructures.
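      The "bucket" idea can be sketched as follows. This is only an illustration: the datacenter names, the capacities, and the proportional-split rule are assumptions, not a description of how Facebook actually routes traffic.

```python
def distribute(load: float, capacities: dict[str, float], healthy: set[str]) -> dict[str, float]:
    """Split `load` across healthy datacenters in proportion to their capacity.

    Each datacenter is treated as a bucket that can serve up to its capacity.
    A failed datacenter is simply left out, so its share moves to the others.
    """
    live = {name: cap for name, cap in capacities.items() if name in healthy}
    total = sum(live.values())
    if load > total:
        raise RuntimeError("not enough healthy capacity to serve the load")
    return {name: load * cap / total for name, cap in live.items()}


if __name__ == "__main__":
    # Hypothetical datacenters and capacities (arbitrary traffic units).
    capacities = {"dc-east": 100.0, "dc-west": 100.0, "dc-central": 200.0}
    print(distribute(200.0, capacities, {"dc-east", "dc-west", "dc-central"}))
    # dc-central fails: its traffic moves to the remaining buckets.
    print(distribute(200.0, capacities, {"dc-east", "dc-west"}))
```

      A proportional split is the simplest possible policy; a real system would presumably also weigh latency and geography.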

      Still, the world is not black and white and even within these models there will be small clusters of special purpose hardware for things like data warehousing and reporting. I find the larger systems far more typical in industries where there can be no possible downtime or where loss of data cannot occur.

      --
      "You should always go to other people's funerals; otherwise, they won't come to yours." -- Yogi Berra
    4. Re:Call me dense, but... by mlts · · Score: 2, Interesting

      Actually neither. It's just that to an observer like me, FB is trying to reinvent the wheel on a problem that already has been solved.

      Obviously, IBM is not cheap. Nor is Oracle/Sun hardware. However, the time and money spent developing a large scale framework on the application layer is not a trivial expense either. It might be that the time FB puts in trying to deploy something uncharted like this may cost them more in the long run.

    5. Re:Call me dense, but... by RajivSLK · · Score: 3, Interesting

      Well we do the same thing as facebook but on a much smaller scale... Our "commodity hardware" (mostly supermicro motherboards with generic cases, memory etc) has pretty much the same uptime and performance as vendor servers. For example we have a Quad CPU database server that has been up for 3 years. If I remember correctly it cost about 1/2 as much as a server with equivalent specs from a vendor.

      The system basically works like this. Buy 5 or so (or 500 if you are facebook) servers at once with identical specs and hardware. If a server fails (not very often) there are basically 4 common reasons:

      1) Power supply or fan failure -- very easy to identify.
          Solution: Leave server down until maintenance day (or whenever you have a chance) swap for a new power supply (total time 15min [less time than calling the vendor tech support]).

      2) Hard drive failure -- usually easy to identify
          Solution: Leave server down until maintenance day (or whenever you have a chance) swap for a new hard drive (total time 15min [less time than calling the vendor tech support]). When the server reboots it will automatically be set up by various autoconfig methods (BOOTP, whatever). I suspect that facebook doesn't even have HDs in most servers.

      3) RAM failure -- can be hard to identify
          Solution: Leave server down until maintenance day (or whenever you have a chance) swap for new RAM (total time 15min [less time than calling the vendor tech support]).

      4) Motherboard failure (almost never happens) -- can be hard to identify
          Solution: Replace entire server -- keep old server for spare parts (RAM, power supply, whatever)
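
      The four cases above amount to a small lookup table. A minimal sketch, assuming hypothetical failure labels and field names (none of this is anyone's actual tooling):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repair:
    swap: str            # what gets replaced
    keeps_chassis: bool  # False when the whole server is replaced


# Hypothetical failure labels; the repairs mirror the four cases listed above.
PLAYBOOK = {
    "power_supply_or_fan": Repair(swap="power supply", keeps_chassis=True),
    "hard_drive": Repair(swap="hard drive", keeps_chassis=True),
    "ram": Repair(swap="RAM", keeps_chassis=True),
    "motherboard": Repair(swap="entire server", keeps_chassis=False),
}


def triage(failure: str) -> Repair:
    """Look up the repair for a diagnosed failure type."""
    try:
        return PLAYBOOK[failure]
    except KeyError:
        raise ValueError(f"unknown failure type: {failure!r}") from None


if __name__ == "__main__":
    for failure in PLAYBOOK:
        repair = triage(failure)
        print(f"{failure}: swap {repair.swap} (keeps chassis: {repair.keeps_chassis})")
```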

      I don't really see what a vendor adds besides inefficiency. If you have to call a telephone agent who then has to call a tech guy from the vendor who then has to drive across town at a moment's notice to spend 10 minutes swapping out your RAM, it's going to cost you. At a place like facebook why not just hire your own guy?

    6. Re:Call me dense, but... by TheSunborn · · Score: 2, Interesting

      The problem is that for any specific budget* the x86-64 solution will give you more aggregate I/O and more processor hardware than the mainframe. The argument for the mainframe is then that the software might be easier to write, but there doesn't exist any mainframe which can serve even 1/10 of Facebook, so you need to cluster them anyway. And if you need special cluster magic you might as well have x86-64.

      And IBM will not promise you 99.999% uptime if you buy a single mainframe. If you need that kind of uptime you need to buy multiple mainframes and cluster them.

      *Counting in either rackspace used or money paid for hardware.
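
      To put numbers on the uptime point, here is a rough sketch. It assumes a 365-day year and, for the cluster case, that nodes fail independently and any single node can carry the load; both are idealizations, and the function names are illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # non-leap year


def allowed_downtime_minutes(availability: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def combined_availability(single: float, nodes: int) -> float:
    """Availability of `nodes` redundant machines.

    Assumes independent failures and that one surviving machine is enough,
    so the cluster is down only when every node is down at once.
    """
    return 1.0 - (1.0 - single) ** nodes


if __name__ == "__main__":
    print(f"five nines allows {allowed_downtime_minutes(0.99999):.2f} min/year")
    print(f"two 99.9% nodes give {combined_availability(0.999, 2):.6f}")
```

      Under those assumptions five nines works out to a little over five minutes of downtime a year, which is why a single box, however good, rarely carries that guarantee on its own.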

  10. How many times a day do people check Facebook? by Comboman · · Score: 2, Interesting

    "690 billion page views to its 540 million users in August"? Good lord, that's 1278 page views PER USER in just one month! That's (on average) 41 page views per user, per day, every single day! The mind boggles.

    --
    Support Right To Repair Legislation.
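
    As a quick check of the arithmetic in the comment above, a minimal sketch. The totals are the ones quoted in the summary; the 31-day length of August is the only added assumption, and the function names are illustrative.

```python
PAGE_VIEWS = 690_000_000_000  # page views in August, as quoted
USERS = 540_000_000           # users, as quoted
DAYS_IN_AUGUST = 31


def views_per_user(page_views: int, users: int) -> float:
    """Average page views per user over the whole month."""
    return page_views / users


def views_per_user_per_day(page_views: int, users: int, days: int) -> float:
    """Average page views per user per day."""
    return page_views / users / days


if __name__ == "__main__":
    monthly = views_per_user(PAGE_VIEWS, USERS)
    daily = views_per_user_per_day(PAGE_VIEWS, USERS, DAYS_IN_AUGUST)
    print(f"{monthly:.0f} views per user per month, {daily:.1f} per user per day")
```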
    1. Re:How many times a day do people check Facebook? by Overzeetop · · Score: 2, Interesting

      Have you seen how often Facebook crashes/has problems? You have to constantly reload the thing to get anything done. Thank goodness Google Calendar doesn't have that problem or I'd probably have a thousand hits a day to my calendar page alone.

      Also, FB pages tend to be pretty content-sparse. It's not uncommon for me to hit a dozen pages in 2-3 minutes if I check facebook.

      --
      Is it just my observation, or are there way too many stupid people in the world?
  11. infrastructure secrecy versus openness by peter303 · · Score: 2, Interesting

    It's interesting how FB is open about their data server infrastructure while some places like Google and Microsoft are very secretive. It is a competitive advantage for Google to shave every tenth of a second off of a search that they can through clever software and hardware. They are an "on ramp" to the Information Super Highway, not a destination like FB. And because Google is one of the largest data servers on the planet, even small efficiency increases translate into mega-million-dollar savings.

  12. data servers = industrial engines of 21st century by peter303 · · Score: 2, Interesting

    When these data centers start showing up as measurable consumers of the national power grid and components of the GDP, you might consider them metaphorically as power-plants of the information industry.

    In his book on the modern energy industry "The Bottomless Well", author Peter Huber places commodity computing near the top of his "energy pyramid". Peter's thesis is that modern technology has transformed energy into ever more sophisticated and useful forms. He calls this "energy refining". At the base of his pyramid are relatively raw energy sources like biomass and coal. Then come electricity, computing, optical, etc. I think it's interesting to view computing as a refined form of energy.