Slashdot Mirror


The Computer Science Behind Facebook's 1 Billion Users

pacopico writes "Much has been made about Facebook hitting 1 billion users. But Businessweek has the inside story detailing how the site actually copes with this many people and the software Facebook has invented that pushes the limits of computer science. The story quotes database guru Mike Stonebraker saying, 'I think Facebook has the hardest information technology problem on the planet.' To keep Facebook moving fast, Mark Zuckerberg apparently instituted a program called Boot Camp in which engineers spend six weeks learning every bit of Facebook's code."

113 comments

  1. Oh bullshit. by Anonymous Coward · · Score: 5, Insightful

    The story quotes database guru Mike Stonebraker saying, 'I think Facebook has the hardest information technology problem on the planet.'

    Really? You think keeping track of some people's dinner plans is the hardest IT problem on the planet? How about YouTube storing and serving truly ludicrous amounts of video. Web search? Watson?

    Facebook is utterly trivial compared to many problems out there.

    1. Re:Oh bullshit. by AdamWill · · Score: 2

      Indeed. Handily proven by "To keep Facebook moving fast, Mark Zuckerberg apparently instituted a program called Boot Camp in which engineers spend six weeks learning every bit of Facebook's code."

      a) that's a terrible idea, and b) the fact that it's even possible (if it is, sounds like business magazine bs to me) speaks volumes. I only work for Red Hat, we're pretty cool but we're hardly the biggest fish out there, and you can imagine the chaos if we tried that...I'm sure others can apply it to their companies with similar results.

    2. Re:Oh bullshit. by Anonymous Coward · · Score: 0

      Yeah, building Facebook is a piece of cake. http://www.quora.com/Facebook-Engineering/What-is-Facebooks-architecture

    3. Re:Oh bullshit. by stephanruby · · Score: 4, Informative

      "Mark Zuckerberg apparently instituted a program called Boot Camp in which engineers spend six-weeks learning every bit of Facebook's code."

      Ah, that's Zuckerberg's secret sauce apparently: plenty of overtime for six weeks so that a new engineer can learn every bit of Facebook's code. This way, they can push the limits of computer science (or disregard them completely) and ignore the lessons from The Mythical Man-Month.

      I cringe to think that many business people will actually take Businessweek's article seriously.

    4. Re:Oh bullshit. by damn_registrars · · Score: 1

      Really? You think keeping track of some people's dinner plans is the hardest IT problem on the planet? How about YouTube storing and serving truly ludicrous amounts of video. Web search? Watson?

      Facebook is utterly trivial compared to many problems out there.

      While I happen to agree with you, none of those other problems get daily front-page attention on slashdot. While facebook is one of the least interesting problems in computer science, it has been a staple of slashdot discussion ever since facebook became a staple of everyday conversation (or perhaps ever since the creator of facebook surpassed cmdrtaco in net worth).

      --
      Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
    5. Re:Oh bullshit. by Anonymous Coward · · Score: 1

      After reading the details, I'm actually less impressed. varnish, apache, hadoop, php.. bfd

    6. Re:Oh bullshit. by Anonymous Coward · · Score: 0

      whoops.. not apache.. but varnish + memcache + hadoop + mysql.. bfd

    7. Re:Oh bullshit. by Dan+East · · Score: 5, Informative

      Actually, Facebook's problem isn't trivial in any sense of the word. The complexity and joins of various database tables must be insane. With YouTube it's all about raw bandwidth, which actually is a fairly easy problem to solve especially since 99% of that data is static. You just physically distribute it and throw money / resources at the problem. As far as database structure, any CS student should be able to reproduce the bulk of it in a single day. You have videos associated with users, and comments associated with videos, etc. The gist of it is straightforward.

      Now let's talk about Facebook. There is no compartmentalization of the data. You've heard the "six degrees of separation", whereby any two people on the planet can be socially connected to one another in at most 6 steps. Well, with Facebook, the average degree of separation between any two people is 3.74. What that means is everyone is very closely networked, all the data is dynamic (or more specifically, the data the users really care about is the dynamic and most recent data), and since many people (myself included) open up their information to "friends of friends", there is a tremendous amount of data that any one person can potentially have access to. Even Google searches don't have this problem, because the bulk of the common search terms can be preprocessed for easy retrieval, and having data that's an hour or two old isn't a huge issue.
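
      The "degrees of separation" idea above can be sketched as a breadth-first search over a friendship graph. This is only a toy illustration: the `FRIENDS` graph and both function names are made up for the example, and nothing here reflects how Facebook actually computes its 3.74 figure.

```python
from collections import deque

# Hypothetical toy friendship graph (undirected, stored in both directions).
FRIENDS = {
    "ann": {"bob"},
    "bob": {"ann", "cat"},
    "cat": {"bob", "dan"},
    "dan": {"cat"},
    "eve": set(),
}


def degrees_of_separation(graph, start, goal):
    """Return the number of friendship hops from start to goal, or None."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        for friend in graph.get(person, ()):
            if friend == goal:
                return hops + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return None  # the two people are not connected at all


def friends_of_friends(graph, person):
    """Everyone within two hops of person, excluding person."""
    direct = graph.get(person, set())
    reach = set(direct)  # copy so the graph itself is not mutated
    for friend in direct:
        reach |= graph.get(friend, set())
    reach.discard(person)
    return reach
```

      With an average of a few hundred friends each, the two-hop set grows roughly with the square of that number, which is why "friends of friends" visibility exposes so much data to any one person.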

      So you have this massive database (1 billion users, each with many different types of associated data - posts, images, videos, things they've liked, things they've shared, etc, etc), and each of those 1 billion users has an entirely different set of friends from which recent (basically real-time) data must be polled - over, and over, and over again, all day long. Now, throw in the very complex privacy rules, as to which types of posts can be seen by which types of friends, groups, block lists, etc, and the problem becomes very, very complex. Sure, most of us could bang out something with that core functionality without too much difficulty, but to make it work nearly real-time for 1 billion users at once? That's an incredible undertaking.
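
      A minimal sketch of the problem described above: every viewer has a different friend set, and every post has to pass a privacy check before it is shown. The `Post` fields, the three audience levels and the block-list shape are assumptions made up for this example, and the pull-on-read approach is just the simplest one to show, not a claim about Facebook's design.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    author: str
    timestamp: int
    text: str
    audience: str = "friends"  # hypothetical levels: "public", "friends", "only_me"


def can_see(viewer, post, friends, blocked):
    """Apply the (simplified, assumed) privacy rules for one post."""
    if post.author == viewer:
        return True
    if viewer in blocked.get(post.author, set()):
        return False
    if post.audience == "public":
        return True
    if post.audience == "friends":
        return viewer in friends.get(post.author, set())
    return False  # "only_me" or an unknown audience


def build_feed(viewer, friends, posts_by_author, blocked, limit=10):
    """Pull recent posts from the viewer and their friends, newest first."""
    candidates = []
    for author in friends.get(viewer, set()) | {viewer}:
        for post in posts_by_author.get(author, []):
            if can_see(viewer, post, friends, blocked):
                candidates.append(post)
    return heapq.nlargest(limit, candidates, key=lambda p: p.timestamp)


friends = {"ann": {"bob", "cat"}, "bob": {"ann"}, "cat": {"ann"}}
posts_by_author = {
    "ann": [Post("ann", 1, "hi")],
    "bob": [Post("bob", 5, "lunch"), Post("bob", 9, "secret", "only_me")],
    "cat": [Post("cat", 7, "photo")],
}
blocked = {"cat": {"ann"}}  # cat has blocked ann

feed = build_feed("ann", friends, posts_by_author, blocked)
```

      Even in this toy version the work per page view is proportional to the number of friends times their recent posts, and the result is different for every viewer, so it cannot simply be cached once and served to everyone.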

      --
      Better known as 318230.
    8. Re:Oh bullshit. by RightSaidFred99 · · Score: 1

      Agreed. And the low threshold for acceptable eventual consistency and the lack of importance of the data (overall) make it less complex than it would otherwise seem.

      Wall Street's types of issues make Facebook look like "Hello World."

    9. Re:Oh bullshit. by toastking · · Score: 2

      Facebook isn't just about status updates. They have a whole robust API they use to interact with apps and other websites. It hosts music, events, photos, videos, app data, along with tons of user data with timeline. You can share anything from a cat video to a milestone of you losing weight. Serving up all that data in a quick and well-presented manner to millions of people around the world is very difficult.

    10. Re:Oh bullshit. by russotto · · Score: 4, Funny

      Hard to believe it takes so long to learn Facebook's code. I work at Google, and I learned every bit of Google's code in one day.

      I don't think I'm giving away the store when I tell you the bits were '0' and '1'.

    11. Re:Oh bullshit. by jittles · · Score: 2

      Oh yeah. I worked on an embedded project that had custom kernel code as well as over 2 million lines in system libraries. No one could possibly know every single line of that. The project I was in charge of there had maybe 200,000 lines of code, and I often had to rely on comments to remember what goes where! I had the misfortune of being on the only team working on an embedded processor, and had to fix cross-platform issues with the system libraries too. It was a lot of work.

    12. Re:Oh bullshit. by kestasjk · · Score: 2, Funny

      Pff. Apache + hadoop + mysql + varnish. Easy.

      The other day I had to write a red-black tree in my CS152 class, now that's a tough problem!

      --
      // MD_Update(&m,buf,j);
    13. Re:Oh bullshit. by jittles · · Score: 4, Interesting

      Except that I don't believe they have 1 billion real users. They probably have 100m users and another 900m users in fake accounts people use to play Farmville, etc.

    14. Re:Oh bullshit. by rtaylor · · Score: 5, Interesting

      It's made infinitely easier by being asynchronous and 99% reads. There are no timing issues. If a post is delayed to someone's screen by a minute or two, nobody dies.

      It's not terribly difficult to make numerous (near infinite) read-only replicas of a database which are within tens of milliseconds of the primary; so that takes care of 99% of their problems.

      Handling their write load is harder but keep in mind the vast majority of their accounts are idle; and again asynchronous writes make it much much easier. They can shove everything through a message queue and put heavy-weight sharding of the data behind that.

      I think handling 100 Million banking customers in 2000 was infinitely harder than Facebook has it from a technical standpoint.
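
      The design described above (asynchronous writes through a message queue, sharded primaries, reads served from replicas) can be sketched in a single process. Everything here is a stand-in: `ShardedStore`, `NUM_SHARDS` and the method names are invented for the example, plain dicts play the role of databases, and replication is an explicit call rather than a background stream.

```python
import queue
import zlib

NUM_SHARDS = 4  # arbitrary number chosen for the example


def shard_for(user_id: str) -> int:
    """Stable hash so the same user always lands on the same shard."""
    return zlib.crc32(user_id.encode("utf-8")) % NUM_SHARDS


class ShardedStore:
    def __init__(self):
        self.primaries = [dict() for _ in range(NUM_SHARDS)]
        self.replicas = [dict() for _ in range(NUM_SHARDS)]
        self.pending = queue.Queue()  # stands in for the message queue

    def write(self, user_id, status):
        """Accept the write and return immediately; it is applied later."""
        self.pending.put((user_id, status))

    def drain(self):
        """Apply queued writes to the primary shards; return how many."""
        applied = 0
        while True:
            try:
                user_id, status = self.pending.get_nowait()
            except queue.Empty:
                break
            self.primaries[shard_for(user_id)][user_id] = status
            applied += 1
        return applied

    def replicate(self):
        """Copy primary state to the read-only replicas."""
        for primary, replica in zip(self.primaries, self.replicas):
            replica.update(primary)

    def read(self, user_id):
        """Reads only ever touch replicas, so they may be slightly stale."""
        return self.replicas[shard_for(user_id)].get(user_id)
```

      A reader who calls `read` between `write` and `replicate` simply sees the old value, which is the "nobody dies if a post is a minute late" trade-off in miniature.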

      --
      Rod Taylor
    15. Re:Oh bullshit. by Anonymous Coward · · Score: 0

      ... but it doesn't have to be anywhere close to complete, accurate, timely, or right..

    16. Re:Oh bullshit. by bonehead · · Score: 1

      I know they have at least one account that isn't "used", but simply sits there for that one time every 18 months or so a family member posts a picture that I actually have an interest in looking at.

      I'm sure I'm not the only one out of those billion that doesn't give a crap about information being transmitted to me in real time.

    17. Re:Oh bullshit. by Anonymous Coward · · Score: 0

      While 1 billion users is non-trivial to manage, FB is far from the "hardest", so I agree. Take for example MMOGs like Eve Online or Vendetta Online. Both of these boast a single, unsharded game world, and the last time I checked, Eve Online holds the world record for the largest number of SIMULTANEOUS connections in a SINGLE game world, and this is a fully 3-D world, not some 2-D web page with text. Think about this, people. This is a 3-D world spread across different servers where mathematical calculations for the graphics, physics, and AI take place, not just someone putting up a blog. This is a FAR cry (not the game) from FB. The problems that these (unsharded) types of games have to solve are far above and beyond the types of problems FB has to solve. Even a sharded game like the 2-D game Maple Story is more complex, and it has an astonishing 90 million-plus users. This alone is far more complex than managing everybody's web page. Of course, I'm simplifying things because an account on FB has more than a web page, but it should be completely, completely obvious that FB is definitely not the "hardest" type of problem in computer science. It's a laugh that someone could even say such a thing.

    18. Re:Oh bullshit. by Anonymous Coward · · Score: 0

      (or perhaps ever since the creator of facebook surpassed cmdrtaco in net worth).

      Which took about 5 minutes to accomplish?

    19. Re:Oh bullshit. by Anonymous Coward · · Score: 0

      And the number of people who care about what you believe beyond yourself is how many exactly? 0?

    20. Re:Oh bullshit. by Lunix+Nutcase · · Score: 1

      And your opinion carries any weight, why? You realize that Facebook can pretty easily mine their massive data to link these duplicate Farmville accounts to the real accounts, right? This is pretty basic data analysis that companies like Facebook, Google, etc. can do. And their 1 billion active users is after taking out all the fake and duplicate accounts.

    21. Re:Oh bullshit. by Anonymous Coward · · Score: 1

      Shrug. It's a hard problem, but it's hardly a unique problem. I work on Google ads backend. Superficially this is a very different system from Facebook, which is all frontend... and yet every problem you describe there is one that we have had to solve at similar scale. (Yes we have an order of magnitude fewer users for example, but our users have an order of magnitude more data, and it is easier to shard many small users than a smaller number of goliath users.)

      Once you get big enough, the problems shared by any two big systems start looking pretty damn similar...

    22. Re:Oh bullshit. by Anonymous Coward · · Score: 0

      "I know they have at least one account that isn't "used", but simply sits there for that one time every 18 months or so a family member posts a picture that I actually have an interest in looking at."

      Ditto here, I create a new one each time to watch the crap.
      There must be millions alone for the guys who sell fake LIKEs or friends on eBay.

    23. Re:Oh bullshit. by Intrepid+imaginaut · · Score: 4, Insightful

      You think so? One person in six on this earth, including infants and the elderly in developing countries without regular internet access, has an active Facebook account, do they? Facebook's numbers have never been properly audited; it's not in their best interests to do so. The more users they can claim, the better for them. I would agree with possibly a couple hundred million, but I have a really hard time believing much more than that.

    24. Re:Oh bullshit. by geekymachoman · · Score: 1

      The story quotes database guru Mike Stonebraker saying, 'I think Facebook has the hardest information technology problem on the planet.'

      Really? You think keeping track of some people's dinner plans is the hardest IT problem on the planet? How about YouTube storing and serving truly ludicrous amounts of video. Web search? Watson?

      Facebook is utterly trivial compared to many problems out there.

      +5 Insightful? Seems like Slashdot has a bug somewhere. Should be +5 Ignorant. Seriously... this is so wrong it's crazy.

    25. Re:Oh bullshit. by hairyfish · · Score: 2

      It's actually 1 in 7 now ;) I used to work for an ISP back when there were lots of them, and we used to offer one month free for new members. Most people quit after the first month, but that didn't stop us advertising how many customers we had in our database. I'd be willing to bet a lot of money FB is doing the same thing. I have 3 FB accounts myself: one I use, one I use for signing up to all those crap services which only let you use an FB account (hello Spotify) and like to spam your status, and another for testing things that has no connection to anything else of mine. None of them have real names.

    26. Re:Oh bullshit. by Anonymous Coward · · Score: 1

      Facebook's problem IS trivial compared to the problems that deserve solving on this planet. Facebook does not solve a single real business problem (except their own). Technically, what they do may be a challenge but it doesn't contribute to anything except let some kids looking for attention share stuff that nobody cares about. If you ask me, this is a waste of resources, technically, intellectually, and energy-wise.

    27. Re:Oh bullshit. by Anonymous Coward · · Score: 1

      Bfd, huh? You should drop that Zuckerberg guy a line and let him know they can just fire 3,500 of some of the finest IT staff and programmers in the world.

      "I think Facebook has the hardest information technology problem on the planet," says Mike Stonebraker, a computer scientist and longtime professor at the University of California at Berkeley. "A company like Google certainly does innovative stuff, but Facebook solves the harder problem."

      Each day, Facebook processes 2.7 billion "Likes," 300 million photo uploads, 2.5 billion status updates and check-ins, and countless other bits of data, and uses that mass of transactions to guesstimate which ads to serve up.

      And let's not forget, it's constantly figuring out which of those items to show each of those 1 billion variously connected people.

      Maybe you could just handle all that high frequency trading for the large exchanges in your off-hours too, for extra cash.
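
      For a sense of scale, the daily volumes quoted above can be turned into average per-second rates. The arithmetic is the only thing shown here; the variable names are made up, and real traffic would peak well above a flat daily average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Daily figures as quoted from the article in the comment above.
DAILY_VOLUMES = {
    "likes": 2_700_000_000,
    "photo_uploads": 300_000_000,
    "status_updates_and_checkins": 2_500_000_000,
}


def per_second(daily_count):
    """Average events per second, assuming a perfectly even day."""
    return daily_count / SECONDS_PER_DAY


RATES = {name: per_second(count) for name, count in DAILY_VOLUMES.items()}

for name, rate in RATES.items():
    print(f"{name}: about {rate:,.0f} per second on average")
```

      That works out to roughly 31,000 Likes, 3,500 photo uploads and 29,000 status updates or check-ins every second, before counting any reads.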

    28. Re:Oh bullshit. by StripedCow · · Score: 1

      I don't think I'm giving away the store when I tell you the bits were '0' and '1'.

      Given the fact that '0' stands for 'not-evil' and '1' stands for 'evil', the important question of course is: did you count those 0's and 1's and what is their frequency?

      --
      If Pandora's box is destined to be opened, *I* want to be the one to open it.
    29. Re:Oh bullshit. by dzfoo · · Score: 1

      LOL!

      Thank you for that. It made my day!

                -dZ.

      --
      Carol vs. Ghost
      ...Can you save Christmas?
    30. Re:Oh bullshit. by dzfoo · · Score: 3, Interesting

      I wrote a red-black tree for fun the other day. What's the problem?

      --
      Carol vs. Ghost
      ...Can you save Christmas?
    31. Re:Oh bullshit. by hceylan · · Score: 1

      As much as I hate Facebook, and as much as I believe the number of true Facebook profiles is fewer than 250M, "To Caesar What Is Caesar's". You may think the value Facebook adds is not rocket science, but Facebook not only uses high-tech software architecture, it also creates software technology and releases some of it as open source. I would recommend you read http://royal.pingdom.com/2010/06/18/the-software-behind-facebook/ And trust me, when your scale goes above 10-digit numbers, nothing is trivial.

      --
      -- Hasan CEYLAN Batoo Software & Consultancy Have you checked out Batoo JPA - http://batoo.jp Batoo JPA is ~15x f
    32. Re:Oh bullshit. by Anonymous Coward · · Score: 0

      I think handling 100 Million banking customers in 2000 was infinitely harder than Facebook has it from a technical standpoint.

      Trivial. Banking works in batches. Which is why they have the best machines outside of research and military.

    33. Re:Oh bullshit. by oztiks · · Score: 1

      I'm not sure what you're saying makes much sense.

      Yes, 1bn users access a DB which polls and manages a large amount of data. However, you just slammed YouTube for doing pretty much the same thing, only exponentially better. Let's not forget massive amounts of comment management, video relevancy tools, algorithms that automatically scour video clips for copyright infringement and convert speech to text, and so on and so forth.

      You bounced from the statement "The complexity and joins of various database tables must be insane" to the statement "each of those 1 billion users has an entirely different set of friends from which recent (basically real-time) data must be polled - over, and over, and over again". That last statement contradicts your comparison to YT.

      Now for my take on it: what Facebook does, it does badly. If their search tool, their ad placement engines, and their "3.74" degrees of separation algorithm (which LinkedIn does better) all achieved, say, a Google-level implementation of similar feature sets, then I would tend to agree with you. They don't, and that's half the reason the company is fumbling: it often can't do these things at a level that makes sense to the end user. My wife had Russian Bride ads show up in her FB photo gallery last night, WTF?

      Also, let's not mention that the most laborious parts of Facebook, i.e. their Maps and translation services, are run primarily through Bing / Microsoft, and the rest of their intensive services such as games are done off their platform via their API.

    34. Re:Oh bullshit. by klubar · · Score: 1

      Unlike many other databases, errors can be tolerated in Facebook. If a post gets lost or a connection or two dropped, it really doesn't cost Facebook anything, and it's unlikely to be noticed. And downtime and retries are tolerated by the users.

      Try running a real-time financial system like credit card authorization & processing, which probably has more than 1 billion users, needs to balance at the end of the day, and has response requirements under 250 ms.

      Facebook is just better at promotions. There are other databases that are bigger, have tighter response requirements and are more complex. It's all about buzz.

    35. Re:Oh bullshit. by Anonymous Coward · · Score: 0

      Not sure about the 100M banking customers, because that problem is easily partitionable. With the exception of money transfers between accounts belonging to different customers, which are likely handled as 2 transactions (debit and credit), activities on any one account impact no other accounts. So if you run out of computing power, break the accounts into N groups, each with their own separate store, db instance, whatever, and voilà! With social graphs, partitioning the graph universe is harder, which is why a solution that works for 100M users doesn't 'just work' for 1B by setting up 10 times the number of instances.
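
      The contrast drawn above can be made concrete with a toy partition map: an operation on one bank account touches one partition, while reading a feed touches the partition of every friend. The `ASSIGNMENT` table, the names in it and both helper functions are hypothetical and exist only for this illustration.

```python
def partitions_touched(keys, assignment):
    """Set of partitions that must be contacted to serve these keys."""
    return {assignment[key] for key in keys}


def cross_partition_fraction(edges, assignment):
    """Fraction of relationships whose two ends live on different partitions."""
    if not edges:
        return 0.0
    crossing = sum(1 for a, b in edges if assignment[a] != assignment[b])
    return crossing / len(edges)


# Hypothetical placement of four users or accounts on two partitions.
ASSIGNMENT = {"ann": 0, "bob": 0, "cat": 1, "dan": 1}

# Hypothetical friendships between those users.
FRIENDSHIPS = [("ann", "bob"), ("ann", "cat"), ("bob", "dan"), ("cat", "dan")]

# A deposit into ann's account only needs ann's partition.
deposit = partitions_touched(["ann"], ASSIGNMENT)

# Building ann's feed needs ann plus her friends bob and cat.
feed_read = partitions_touched(["ann", "bob", "cat"], ASSIGNMENT)

crossing = cross_partition_fraction(FRIENDSHIPS, ASSIGNMENT)
```

      Here half of the friendships cross a partition boundary even with only two partitions; adding more instances tends to raise that share rather than lower it, which is the sense in which ten times the hardware does not simply buy ten times the users.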

    36. Re:Oh bullshit. by rtb61 · · Score: 1

      Let's not forget product accounts, accounts to market products. Then redundant accounts, people who changed names. Tried and forgotten accounts, because it is so hard to erase your data. Of course there is always Facebook's rather crappy and falling share price: 'hmm', 1 billion users, that'll pump up the share price. So Facebook should have been honest with the investing public and declared how many active user accounts it has, not presented information to the public with an intent to deceive non-inside investors (those ass hats selling shares). The SEC should consider an investigation to review the truth or falsity of that statement, "1 billion" users, due to the impact upon investors.

      --
      Chaos - everything, everywhere, everywhen
  2. Fsck Facebook by Anonymous Coward · · Score: 0

    Fsck Facebook.

  3. 1 billion users by Anonymous Coward · · Score: 5, Funny

    I totally believe that Facebook has 1 billion users... because I am 4 of them.

    1. Re:1 billion users by tomhath · · Score: 0

      Okay, so it's actually 999,999,997. Does that make you feel better?

    2. Re:1 billion users by flimflammer · · Score: 2

      Put me down for 6.

    3. Re:1 billion users by PeanutButterBreath · · Score: 2

      Would it make you feel worse if the number was a "mere" 250m? Or 100m?

      I am currently ignoring 2 different accounts, FWIW. Facebook keeps sending notifications of various uninteresting types to both, I assume that they are both considered "active".

      I joined with a bunch of real life friends years ago, and it appears that only about 1 in 10 post anything on a regular basis.

      [Shrugs]

    4. Re:1 billion users by L3370 · · Score: 3, Insightful

      If he can make 4, so can the bozo who wants to create fake accounts for his pets, browsing ex-girlfriends, gaming Farmville perks, and avoiding his boss' prying eyes.

      In short, there aren't a billion people on facebook--nowhere near it. An important fact for businesses that are looking to tap into a network of "real" people.

    5. Re:1 billion users by Narnie · · Score: 1

      Put me down for 0.5 since I check it 1 day per fortnight and I don't use any features besides app blocking, ignoring people, and rejecting photo tags.

      --
      greed@All_Evils:~#
    6. Re:1 billion users by Anonymous Coward · · Score: 1

      Facebook ID for voting. Vote early and vote often!

    7. Re:1 billion users by tomhath · · Score: 1

      I'm waiting for the day when Facebook has more "users" than there are people on Earth.

    8. Re:1 billion users by Anonymous Coward · · Score: 0

      723, anyone need friends?

    9. Re:1 billion users by Anonymous Coward · · Score: 0

      I totally believe that Facebook has 1 billion users... because I am 4 of them.

      But you, presumably, are either real, or an independent bot, with an average Walter Mitty approach to Social Networking (and you thought your "friend's" friends were real?). What about the Facebook pages for every entry in Wikipedia, the hundreds for fictional characters including Santa Claus - and that's just the ones created by Facebook itself to boost share prices with fictional user numbers - not the millions created by outside blackhatters (few of which were affected by the recent and highly publicised "cleaning" operations).

      Face Friend account numbers are like Korean World Martial Arts Champions, or French Resistance heroes - multiplied by a factor of 100. Even without allowing for the average idiot who has four accounts simply because they forgot the password or username for the first three.

  4. PHP by Coolhand2120 · · Score: 3, Insightful

    Oh yes, please tell me all about the computer geniuses that wrote the PHP scripts that power facebook!

    1. Re:PHP by Anonymous Coward · · Score: 3, Insightful

      Yea. Because everyone knows no real website could possibly be written in structured, maintainable PHP. Well, except the biggest site on the Internet.

    2. Re:PHP by Anonymous Coward · · Score: 0

      Everyone knows php does not scale, and is highly insecure. How do these guys do it?

    3. Re:PHP by Firehed · · Score: 1

      Also Yahoo and Wikipedia, both of which are in the top five.

      --
      How are sites slashdotted when nobody reads TFAs?
    4. Re:PHP by Anonymous Coward · · Score: 5, Insightful

      PHP has proven to be the best web development kit. Its only persistent failure is the legacy growth of inconsistent api calls. For the rest, it's Turing complete, does scale well, and most of all is the best tuned hammer for the job. It delivers.

      In effect, PHP is a huge C api with its own C like language constructs, a layer of abstraction which takes away the mundane and gets you building web sites.

      Now C is hailed for its great power, and not made fun of because of its ability to make real crappy, insecure code.
      PHP however is not hailed for its great power, and made fun of because of its ability to make real crappy, insecure code.

      It's all a matter of perspective. The problem is low level programmers who can't live with the fact people make a billion dollars without obsessing over pointers or garbage collection.

    5. Re:PHP by Anonymous Coward · · Score: 0

      Everyone knows php does not scale, and is highly insecure. How do these guys do it?

      Facebook gets its job done by implementing a super-secret PHP compiler that nobody but Facebook themselves uses, hence solving the performance problem.

      And because everyone on Facebook has real name and identity uploaded on Facebook, any security breach is impossible. You can't get away with h4cking Facebook.

    6. Re:PHP by phantomfive · · Score: 4, Informative

      Their PHP compiler isn't secret. It's open source and freely available.

      --
      "First they came for the slanderers and i said nothing."
    7. Re:PHP by Anonymous Coward · · Score: 1

      PHP has proven to be the best web development kit.[citation required]

    8. Re:PHP by Anonymous Coward · · Score: 0

      I think they have a new thingie.... that converts PHP into C... I'm pretty certain they call it PHC and then compiles it down. And code is code. Repeat after me: Turing Complete. I used to give a big crap about which language it was written in, but after studying compilers in university, and building my own compilers (single pass with single lexeme lookahead), I realised that damn near anything can become binary. How efficient that binary is, is the real trick.

    9. Re:PHP by trawg · · Score: 1

      Oh yes, please tell me all about the computer geniuses that wrote the PHP scripts that power facebook!

      Well, I know PHP bashing is all the rage, so how about the computer geniuses at Facebook that wrote HipHop, their PHP-to-binary compiler?

      I think it is a pretty cool technical thing (and according to their stats it dropped their CPU usage by some significant figure) - and even better, they open sourced it. Like they do with a lot of their stuff.

    10. Re:PHP by stridebird · · Score: 1

      Good post AC. I think you are on to something with your last sentence too. Technical research on the web is a nightmare, because you have to parse the motivation behind the opinions and filter on that too. Low level programmers ... obsessing over pointers or garbage collection ... indeed. These people can come across as enormously well-informed but their opinions are often worthless outside their tiny, unknowable silos.

    11. Re:PHP by Anonymous Coward · · Score: 0

      PHP is the best web development kit. And to prove it I cite the parent post you replied to...

      P.S. The thing about "citations" that egg-heads love to use to "prove" their arguments is with a little bit of money you can get another egg-head to publish whatever you want.

    12. Re:PHP by Raenex · · Score: 1

      The problem is low level programmers who can't live with the fact people make a billion dollars without obsessing over pointers or garbage collection.

      Facebook relies on C++ because PHP is too slow to use it for everything.

    13. Re:PHP by smellotron · · Score: 1

      Its only persistent failure is the legacy growth of inconsistent api calls.

      I can think of a few more which were still relevant around the introduction of PHP 5:

      • Prone to insecurity by design: things like "addslashes" by default (or at all—what a horrible idea), auto-registration user input variables mixed with implicit variable declarations, mysql_real_escape_string(), etc.
      • Weird decisions with language design as it evolved: Destructors were added, but they got called in the wrong order for RAII which implies that they didn't actually understand the purpose of destructors (this was fixed, thankfully). Namespaces were added long after everyone was already following the namespace_function convention of C, creating a schism between new code and old (and not actually providing any new functionality).
      • A very poor alternative to CPAN. DBI had existed for years when PHP's inferior community DB abstraction layer came onto the scene. Smarty was all the rage at one point in time, but turns out it's just as powerful as PHP and therefore not actually useful as a template engine.
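
      The first bullet's complaint about escaping-based APIs is easier to see next to the alternative, bound parameters. The sketch below uses Python's standard `sqlite3` module rather than PHP, purely to show the technique; the `users` table and the `find_user` helper are invented for the example.

```python
import sqlite3


def find_user(conn, name):
    """Look up a user with a bound parameter instead of string escaping."""
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


# Hypothetical in-memory table for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)",
    [("alice",), ("o'brien",)],
)

# A name containing a quote needs no manual escaping.
quoted = find_user(conn, "o'brien")

# A classic injection string is treated as plain data and matches nothing.
injected = find_user(conn, "x' OR '1'='1")
```

      Because the value never becomes part of the SQL text, there is nothing to forget to escape, which is the design property the escaping functions listed above lack.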

      Aside from a low barrier to entry, the real strengths that I see in PHP don't actually have much to do with the language itself:

      • Deployment takes advantage of the filesystem and existing filesystem-based web servers, which is intuitive. IMHO, it is much better to use Apache as a "front controller" (for script lookups and rewrites) than trying to shoehorn that abomination into a "web" framework the way I have seen done in Zend, Django, etc.
      • Language-level integration with HTTP GET/POST data, headers, server information, cookies mean nobody has to do boilerplate. OK, I guess this is an advantage of the PHP language, but conceivably someone could customize any language's interpreter to handle the HTTP "magic" before hitting a single script.
    14. Re:PHP by grantspassalan · · Score: 1

      This is obviously not on topic, but what have I done that I should be listed as your foe?

      --
      A sufficiently advanced simulation is indistinguishable from reality.
    15. Re:PHP by Cruxus · · Score: 1

      Yes, and developers agree with that sentiment. PHP has inertia behind it: tons of cheap webhosts and lots of libraries and existing codebases. As a programming language, there are definitely better out there: Python, Ruby, etc.

      --
      On vit, on code et puis on meurt.
    16. Re:PHP by Anonymous Coward · · Score: 0

      Yea. Because everyone knows no real website could possibly be written in structured, maintainable PHP. Well, except the biggest site on the Internet.

      You can also make sculptures out of hardened feces, but would anyone want to buy and maintain that?

    17. Re:PHP by Anonymous Coward · · Score: 0

      A few years back it seemed like every week I would hear about more site break-ins as a result of a security hole in PHP. So I never installed it on my personal web server, and had to skip some other things which depended on it. Have they got it all sorted out now? i.e. is it as safe to install as anything else?

  5. Your first mistake... by MrEricSir · · Score: 5, Funny

    ...is looking for meaningful computer science discussion in a business magazine article.

    --
    There's no -1 for "I don't get it."
    1. Re:Your first mistake... by GuyFox · · Score: 1

      The headline was the highlight of the article and it's a pretty bland headline.

    2. Re:Your first mistake... by Pseudonym · · Score: 3, Insightful

      At the risk of stating the obvious, an information technology problem is not the same as a computer science problem.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    3. Re:Your first mistake... by Jane+Q.+Public · · Score: 1

      Facebook is PHP, with other people's back-ends behind it.

      Facebook and "computer science" have little to do with one another, except to the extent that one has absorbed what others have done, rather like an amoeba.

    4. Re:Your first mistake... by amoeba1911 · · Score: 5, Funny

      What the hell did I ever do to you?

    5. Re:Your first mistake... by darkpixel2k · · Score: 1

      What the hell did I ever do to you?

      commenting to remove an accidental 'redundant' mod. sorry.

      --
      There's no place like ::1 (I've completed my transition to IPv6)
    6. Re:Your first mistake... by Anonymous Coward · · Score: 0

      I believe it's your existence, sir.

    7. Re:Your first mistake... by Anonymous Coward · · Score: 0

      I was expecting detailed discussions about ditching SQL and adopting NoSQL, some of the details about what had to be changed, how things are indexed, and the big-O runtime comparison between them. Instead, it's all fluff and nonsense. My first mistake... was looking for meaningful computer science discussion in a business magazine article.

    8. Re:Your first mistake... by Anonymous Coward · · Score: 0

      Well there are many computer scientists publishing about trivial data-management problems.

    9. Re:Your first mistake... by Pseudonym · · Score: 1

      There's still useful research to be done there. Even binary search trees aren't a solved problem.

      --
      sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
    10. Re:Your first mistake... by tlhIngan · · Score: 1

      Facebook is PHP, with other people's back-ends behind it.

      Facebook's source code is PHP. Which is then compiled into C++ (complete with all assets) and then compiled into a native binary. Linking is a huge problem as it produces a huge (multi-gigabyte) executable that is run directly.

      Deployment is another issue - I believe they use a form of BitTorrent to do it, and naturally, the scripts that update from one executable to another don't work completely across the entire server farm - so the servers where deployment failed run the old binaries until someone gets around to redeploying.

      I believe the PHP-to-C++ compiler is actually open-sourced by Facebook.

  6. Stonebraker sez by Anonymous Coward · · Score: 0

    Probably out of context, as this whole site could be flushed down the toilet and not much would happen - ads wouldn't get fed to the gullible. Oh, dear. Now the MasterCard and Visa credit card networks - that is for real and makes fb look like child's play. Which it is.

  7. No CS by Anonymous Coward · · Score: 0

    Bits of management, but definitely no CS in that story!

  8. Facebook should offer a CS degree by Anonymous Coward · · Score: 0

    Everything you need to know in only 6 weeks!

  9. Centralized Social Networking = Difficult Problem by Anonymous Coward · · Score: 1

    Social networking maps very nicely to decentralized resources.
    (I know who my friends are, and I can scrape their RSS feeds by myself.)

    When you try to cram all that into one data center, and then try to replicate that across many data centers in real time ... yep, you've got a problem.

    The mistake is in the belief that it's an "information technology" problem.

  10. Read the article, not much CS inside... by file_reaper · · Score: 2

    I'm kinda disappointed... I am truly interested in how Facebook scales and was hoping there would be actual Computer Science related material in the article... Any Facebook employees care to comment? What do you guys do to scale stuff? How about ./'ers from other companies that have to deal with scaling? Hell, how do porn sites scale? I've done the traditional Distributed Systems courses in University but I really wanted to know how it's done in the real world by AWS, Facebook etc...

    1. Re:Read the article, not much CS inside... by Elminster+Aumar · · Score: 1

      Shops anymore tend to scale by throwing RAM and bandwidth at everything... It drives developers crazy because management cares little about what kind of mess they force their developers to ignore due to due-dates. And of course, the only casualties are the developers who were never given a fair shake to start with. Wanna know how something scales? Continual tweaking, and yes, more RAM and bandwidth. It's the only way to scale things anymore.

  11. 1 billion users, analyzed by damn_registrars · · Score: 4, Funny
    --
    Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
  12. Terrible by thePsychologist · · Score: 5, Informative

    The print version is available.

    I don't recommend reading it. There is absolutely nothing in this article about the actual engineering problems behind scaling for this number of users and how these problems are solved. In fact, there is nothing technical at all in this article except for some vague descriptions of the "bootcamp".

    --
    "What lies behind us, and what lies before us are tiny matters compared to what lies within us." Ralph Waldo Emerson
    1. Re:Terrible by Anonymous Coward · · Score: 0

      agreed. better, more technical information is available at the facebook engineer blogs (beware: some of this is outdated)

      http://www.facebook.com/note.php?note_id=23844338919

      http://www.facebook.com/notes/facebook-engineering/mysql-and-database-engineering-mark-callaghan/10150599729938920

  13. another blowjob for zuckerberg by Anonymous Coward · · Score: 1

    I'm sure there are some smart people working on how to mine every last drop of money out of our private lives at facebook, but IT?

    Last I heard, fb uses mysql. That's not cutting edge CS.

  14. It's rather clever by Animats · · Score: 5, Informative

    It's actually a rather impressive setup. Some Facebook architects gave a talk in EE380 at Stanford a few years back. Originally, Facebook's architecture assumed that most "friends" would be regionally local, reflecting Facebook's college-campus origin. That's not how it worked out after some growth. So they have to assemble pages across regions and data centers. There's caching, but there's also active cache invalidation, which they can do because they control both sides of the cache. There's extensive inter-process communication, and it's not HTTP. There's a lot of PHP for the user-facing stuff, but it's compiled with their in-house compiler, not interpreted.

    Facebook's purpose is banal, but the technology behind it is non-trivial.
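To make "active cache invalidation" concrete, here is a minimal sketch, assuming a single in-memory cache in front of a single store. The class names `Database` and `InvalidatingCache` are hypothetical, and nothing here is claimed to match what Facebook actually runs. The point is only that when the same code owns both the write path and the cache, a write can evict the stale entry itself instead of waiting for a timeout.

```python
class Database:
    """Stand-in for the authoritative store (hypothetical, in-memory)."""

    def __init__(self):
        self.rows = {}
        self.reads = 0  # counts how often the cache had to fall through

    def read(self, key):
        self.reads += 1
        return self.rows.get(key)

    def write(self, key, value):
        self.rows[key] = value


class InvalidatingCache:
    """Read-through cache whose write path actively evicts stale entries."""

    def __init__(self, db):
        self.db = db
        self.entries = {}

    def get(self, key):
        # Miss: fetch from the store and remember the result.
        if key not in self.entries:
            self.entries[key] = self.db.read(key)
        return self.entries[key]

    def put(self, key, value):
        # Write to the store first, then drop the cached copy so the
        # next read cannot return the old value.
        self.db.write(key, value)
        self.entries.pop(key, None)


db = Database()
cache = InvalidatingCache(db)

cache.put("status:bob", "at lunch")
first = cache.get("status:bob")   # miss, goes to the store
second = cache.get("status:bob")  # hit, served from the cache
cache.put("status:bob", "back at work")
third = cache.get("status:bob")   # miss again, because put() evicted the entry
```

Across regions and data centers the eviction would have to be sent to every cache holding the key, which is where the hard part starts; this sketch leaves that out.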

  15. 1000 Monkeys by Anonymous Coward · · Score: 0

    I always wondered why it seemed like Facebook was written by a bunch of 16 yr old hackers, shipping half-baked buggy code - but now I know the truth: it's written by _thousands_ of engineers shipping half-baked code - every day!

    Still, the Zuck is now worth billions of dollars, so maybe I have something to learn from the whole experiment... ... Nah, I'm still going to test my own code.

  16. not much science here by Anonymous Coward · · Score: 0

    Not much computer science here, I was expecting more technical details from the summary.

  17. I've never even got to buy another server... by Anonymous Coward · · Score: 0

    Nothing I ever made got popular enough to even require more than one server.

  18. Aside from the million petaquads Google deals with by gelfling · · Score: 1

    Is what you meant.

  19. Six weeks? by dohzer · · Score: 1

    With the way Facebook runs, surely it doesn't take much more than a six-hour lecture to learn.

  20. The Computer Science Behind ... by gVibe · · Score: 1

    Facebook's 1 Billion Users is very simple -- There aren't 1 Billion (unique) Living, Breathing, Computer Using Humans on Facebook.

    If you believe there are actually 1 Billion, completely unique users on Facebook, then I need to ask that each of you turn over your Internet Licenses, power down your computers, and find a new hobby. You are just too dangerous to be allowed on the Internet without adult supervision.

    Ever hear of bot nets?

    You know...all those virus infected zombie computers that have been using networks on IRC for years. Well...these bots are now using Facebook, Twitter, and probably any other social network now.

    If you still refuse to believe the facts. Well...I guess you must be OK living your lives being gullible -- unable to reason or use logic to derive a truth that actually makes sense.

    As for the author, Ashlee Vance -- you are a failure as a journalist. You fail because you are unable to do any real research to uncover real information, and only write stories that have potential for getting published, hoping someone will notice you and throw fame and fortune at you. Keep Dreaming!

    The HackDefendr

    --
    Keywords for the NSA overthrow oppressive regime true believers marathon Manhatten the financial district blueprints I
  21. Billion Users? by macbeth66 · · Score: 2

    How do they count these 'users'? I have six accounts myself and most people I know have at least two. Now, there is a poll for Slashdot: How many Facebook accounts do you have?

    1. Re:Billion Users? by Anonymous Coward · · Score: 0

      While your point is valid, I'm genuinely curious as to why you have six Facebook accounts. And why do most people have at least two?

      I don't know these things because I refuse to have a single Facebook account, much to the dismay of many of my friends and relatives. They think it's anti-social to abstain from FB and that it's an asshole thing for me to do because I'm not taking the time to appreciate poorly photographed pictures of their ugly snot nosed kids. I don't think I've met an extrovert under 30 without a FB account, but I've never heard these acquaintances discuss multiple accounts unless they cheat on their spouses. Do you have six accounts to juggle your six girlfriends or something?

    2. Re:Billion Users? by dzfoo · · Score: 1

      The article does not clearly say how they count them. However, it does suggest that it is not a real, accurate quantification of actual live accounts--more like a statistical figure.

      In the article, the figure is compared to the United Nations announcing the population on earth, so I guess it involves a lot of extrapolation based on subscription rates and usage loads.

      If you read the article, it's a bit comforting that they have absolutely no idea how many real people are actively using the system, nor which one would be the specific "billionth" user.

      Well, comforting if I used Facebook, which I don't, so I really don't care.

                -dZ.

      --
      Carol vs. Ghost
      ...Can you save Christmas?
  22. Let's test you... by NotQuiteReal · · Score: 1

    I don't think I'm giving away the store when I tell you the bits were '0' and '1'.

    Bad News: There will be a test.
    Good News: it is true / false.
    Let's see how your scan-tron scores... R.I.P.

    --
    This issue is a bit more complicated than you think.
  23. Jack Meehoff by Anonymous Coward · · Score: 0

    Facebook acts like these are all live humanoids. Well, as anyone who posted a signup sheet for IM sports in college can attest, 1/3 of everyone who signs up for anything, anywhere, ever, is fake. See also, "Google+"

  24. Jack Meehoff by theoriginalturtle · · Score: 1

    Facebook actually thinks these are one billion distinct humanoids? Zuck is stoopider than his investors look. As anyone who ever posted a signup sheet in a college dorm for IM softball can attest, at least a third of people who sign up for anything, anywhere, ever, are fake.

    --
    ---------------------------------------
    Rotate the pod, please, HAL....
  25. Re:Aside from the million petaquads Google deals w by xQx · · Score: 1

    You misinterpreted the heading - Facebook has the hardest information technology problem on the planet.

    That information technology problem has nothing to do with servers and storage.

    The hardest information technology problem on the planet is: How do the Facebook execs stop the company going the way of Silicon Graphics (NYSE: SGI) - oh wait, no, (DELISTED by NYSE because the share price couldn't stay above $1: SGI); since the company creates no real value, and has done nothing but drop its price since IPO.

    *THAT* is the problem that Google isn't facing.

  26. Joins? There's your problem right there by Anonymous Coward · · Score: 1

    "The complexity and joins of various database tables must be insane."

    Nah. You simply put one user's data in one place (well, more than one place for redundancy, but two or three, not lots of places).
    To build a page you ask the machine holding that person's data. Then you ask the machines holding their friends' data for that data, and build the page. Arrange your network so that groups of machines are in subnets, and place each user's data, based on connectivity, onto machines in the subnet. So more connected users are on the same subnet.

    The idea that you'd chuck everything in some massive database and make everything an SQL query, well that's not a good design.

    Instead of connecting 1 billion people to 1 billion other people, it's really just connect Bob to his 10 friends * 1 billion page serves which is just a scaled up version.
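A minimal sketch of that layout, assuming each user's record lives on exactly one shard and a page is built by asking the shards that hold the friends' records. The names `shard_for`, `save_user`, `load_user` and `build_page` are hypothetical, placement here is by hash (an assumption made to keep the example short), and redundancy is left out. The connectivity-based placement onto subnets described above is not modeled.

```python
import zlib

NUM_SHARDS = 4  # hypothetical cluster size

# One dict per machine: user name -> that user's record.
shards = [dict() for _ in range(NUM_SHARDS)]


def shard_for(user):
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(user.encode("utf-8")) % NUM_SHARDS


def save_user(user, friends, posts):
    shards[shard_for(user)][user] = {"friends": friends, "posts": posts}


def load_user(user):
    # One lookup on one machine; no join across tables.
    return shards[shard_for(user)][user]


def build_page(user):
    # Ask the user's shard for the friend list, then ask each friend's
    # shard for that friend's posts.
    page = []
    for friend in load_user(user)["friends"]:
        for post in load_user(friend)["posts"]:
            page.append((friend, post))
    return page


save_user("bob", ["alice", "carol"], ["bob's own post"])
save_user("alice", ["bob"], ["hello from alice"])
save_user("carol", [], ["carol checked in"])

page = build_page("bob")
```

The work per page is one lookup for Bob plus one per friend, regardless of how many users exist in total, which is the "scaled up version" argument in code form.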

  27. the secret is simple by goombah99 · · Score: 3, Funny

    facebook.pl

    it's just one script in perl.

    --
    Some drink at the fountain of knowledge. Others just gargle.
  28. Zuckerburg fluffers by Anonymous Coward · · Score: 0

    In the 1960's we sent a man to the moon using the caveman's equivalent of today's technology tools. And now people think that Facebook is the top of the complexity pyramid? Sure they have a big scaling issue which nobody says is easy to solve, but to claim that it's the most complex technology problem the world faces today is laughably stupid.

  29. If it keeps breaking doesnt mean ur pushing limits by dell623 · · Score: 1

    It just means it doesn't work well enough.

    Facebook is the worst performing and most opaque large scale site with the worst interface that I use regularly.

    Browsing photos, the most basic Facebook activity, is still a pain and buggy as hell on a slowish connection, and they keep changing the damn interface just when you figured out the previous unintuitive change. The mobile website sucks, their Android app sucks, I don't know what the new iOS app is like. The interface has gone from simplicity to being cluttered and horrible with multiple streams throwing information at you.

    If Gmail worked like that I would have quit ages ago. If Amazon worked like that they wouldn't sell shit. Facebook still feels like a damn experiment coded by a few kids in a basement. If YouTube worked like that they would have been replaced long ago as the de facto video hosting site.

  30. The really interesting thing is the number of by dumcob · · Score: 1

    engineers it takes to keep such massive infrastructure up and running. If all it takes today is 2000 people, to manage the data of a billion people, then I really can't see a very __large__ need for software developers in the future.

  31. Not really one billion by kriston · · Score: 2

    It's not really one billion users. As any developer in any online service knows, the real figure is around 30% of the actual reported total. Still, it's no small challenge.

    --

    Kriston

  32. Wrong, long & boring by yusing · · Score: 1

    I would be interested in learning more about the software and hardware side of Facebook. But after 15 seconds of scrolling I hadn't seen any ... just a lot of tedious "gotta do this" journalism ... and gave up. LOOOOOONG BOOOORING

    --

    "You must try to forget all you have learned. You must begin to dream." -- Sherwood Anderson

  33. Dupe? by Anonymous Coward · · Score: 0

    Good grief, editors, anywhere? Can we at least have a non-dupe in between the dupes, or interleave the dupes? Previous story was Why Worms In the Toilet Might Be a Good Idea.

  34. Re:If it keeps breaking doesnt mean ur pushing lim by stridebird · · Score: 1

    Yeah, YouTube really nailed the comment system...

  35. I cannot decide what is worse .. by cheros · · Score: 0

    .. hearing day after day about Facebook or Zuckerberg, seeing Zuckerberg's face in some *cough* "creative" way or hearing him heralded as some business guru (let's just say I disagree).

    I think the face is the worst. I can live with the claims of him being an innovator; I got inured to that after decades' worth of Microsoft marketing.

    Hell, I may switch back to a text only browser for my news - speeds things up as well.

    --
    Insert .sig here. Send no money now. Owner may sue, contents will settle. Batteries not included.
  36. Not computer science by Anonymous Coward · · Score: 0

    The article was interesting but has nothing to do with computer science. It was about their development and release process and some of their management and datacenter ideas.

    If you are interested in the real computer science there are writings out there about their database and memcached setup.