The Computer Science Behind Facebook's 1 Billion Users
pacopico writes "Much has been made about Facebook hitting 1 billion users. But Businessweek has the inside story detailing how the site actually copes with this many people and the software Facebook has invented that pushes the limits of computer science. The story quotes database guru Mike Stonebraker saying, 'I think Facebook has the hardest information technology problem on the planet.' To keep Facebook moving fast, Mark Zuckerberg apparently instituted a program called Boot Camp in which engineers spend six weeks learning every bit of Facebook's code."
The story quotes database guru Mike Stonebraker saying, 'I think Facebook has the hardest information technology problem on the planet.'
Really? You think keeping track of some people's dinner plans is the hardest IT problem on the planet? How about YouTube storing and serving truly ludicrous amounts of video? Web search? Watson?
Facebook is utterly trivial compared to many problems out there.
I totally believe that Facebook has 1 billion users... because I am 4 of them.
...is looking for meaningful computer science discussion in a business magazine article.
There's no -1 for "I don't get it."
The print version is available.
I don't recommend reading it. There is absolutely nothing in this article about the actual engineering problems behind scaling for this number of users and how these problems are solved. In fact, there is nothing technical at all in this article except for some vague descriptions of the "bootcamp".
"What lies behind us, and what lies before us are tiny matters compared to what lies within us." Ralph Waldo Emerson
It's actually a rather impressive setup. Some Facebook architects gave a talk in EE380 at Stanford a few years back. Originally, Facebook's architecture assumed that most "friends" would be regionally local, reflecting Facebook's college-campus origin. That's not how it worked out after some growth. So they have to assemble pages across regions and data centers. There's caching, but there's also active cache invalidation, which they can do because they control both sides of the cache. There's extensive inter-process communication, and it's not HTTP. There's a lot of PHP for the user-facing stuff, but it's compiled with their in-house compiler, not interpreted.
Facebook's purpose is banal, but the technology behind it is non-trivial.
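For anyone wondering what "active cache invalidation" means in practice, here is a minimal sketch in Python. It is not Facebook's code: `BackingStore`, `InvalidatingCache`, and the key names are all hypothetical, and a single in-process dict stands in for what would really be a distributed cache spread across data centers. The only point it shows is that the writer, which controls both sides, evicts the cached entry at write time instead of waiting for it to expire.

```python
class BackingStore:
    """Hypothetical stand-in for a database; counts reads so cache hits are visible."""

    def __init__(self):
        self._rows = {}
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self._rows.get(key)

    def write(self, key, value):
        self._rows[key] = value


class InvalidatingCache:
    """Read-through cache whose write path actively evicts stale entries."""

    def __init__(self, store):
        self._store = store
        self._cache = {}

    def get(self, key):
        # Serve from the cache when possible; otherwise read through and remember.
        if key in self._cache:
            return self._cache[key]
        value = self._store.read(key)
        self._cache[key] = value
        return value

    def put(self, key, value):
        # Write to the store first, then evict the cached copy so the next
        # read cannot return the old value. No expiry timer is involved.
        self._store.write(key, value)
        self._cache.pop(key, None)


store = BackingStore()
cache = InvalidatingCache(store)

cache.put("user:1:status", "at lunch")
first = cache.get("user:1:status")    # cache miss, reads the store
second = cache.get("user:1:status")   # cache hit, store untouched
cache.put("user:1:status", "back at work")
third = cache.get("user:1:status")    # entry was evicted, reads the store again
```

In a real multi-region setup the eviction would have to be sent to every cache holding the key, which is presumably where the hard part lives; the sketch ignores that entirely.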
PHP has proven to be the best web development kit. Its only persistent failure is the legacy growth of inconsistent API calls. For the rest, it's Turing complete, does scale well, and most of all is the best tuned hammer for the job. It delivers.
In effect, PHP is a huge C API with its own C-like language constructs, a layer of abstraction which takes away the mundane and gets you building web sites.
Now C is hailed for its great power, and not made fun of because of its ability to make real crappy, insecure code.
PHP however is not hailed for its great power, and made fun of because of its ability to make real crappy, insecure code.
It's all a matter of perspective. The problem is low-level programmers who can't live with the fact that people make a billion dollars without obsessing over pointers or garbage collection.