Scaling Facebook To 140 Million Users
1sockchuck writes "Facebook now has 140 million users, and in recent weeks has been adding 600,000 new users a day. To keep pace with that growth, the Facebook engineering team has been tweaking its use of memcached, and says it can now handle 200,000 UDP requests per second. Facebook has detailed its refinements to memcached, which it hopes will be included in the official memcached repository. For now, their changes have been released to github."
I was losing sleep worrying that people sending me virtual Christmas tree decorations, garden accessories and such would have to wait 3 seconds after they clicked send.
The only word I understood in this post was "Facebook."
140 million people that need to get a life.
140 million users? Wow... I can barely imagine the hardware to handle this
Religion: The greatest weapon of mass destruction of all time
It's pretty impressive that Facebook has been able to grow so quickly and handle so much traffic. Their downtime has been pretty insignificant relative to the sheer number of requests that blow through their servers every day.
There's probably a thing or two that can be learned from their developers and IT folks. I just wish I knew more about the whole underlying structure so I could appreciate exactly what they've done.
This one's tricky. You have to use imaginary numbers, like eleventeen... --Hobbes
...I thought I should make a Christmas carol about what we see on the net everyday.
Smashing through the door, comes Firefox three browsing sites we go laughing at IE all the way ha ha ha!
Steve Ballmer yells on youtube, making children cry. Oh what fun it is to see that stupid Windows guy. Hey!
Jingle bells Digg smells Slashdot all the way! Oh what fun it is to post on facebook every day, yay!
"The difference between genius and stupidity is that genius has it's limits" - Albert Einstein
What's the hardware behind Facebook / Myspace? I mean, they can't be run on average servers... (disclaimer: I don't really know anything about high-end web hosting)
Facebook is... barely acceptable for me to use, Myspace just plain sucks.
at least for me being a 38yo undergrad.
We had one of their engineers give a talk a couple of weeks ago. The most recent number he had was 120 million members (who've logged on in the last 30 days) and over 65 billion page views per month. And they do it with 200 or so engineers.
I was fully expecting (being interested primarily in verifiable systems and fp) to be annoyed by this talk, but they have some pretty interesting problems to solve over there. The fact that they're doing it with OSS, and giving back to boot, really made my day.
man, I feel like mold.
We discovered that under load on Linux, UDP performance was downright horrible. This is caused by considerable lock contention on the UDP socket lock when transmitting through a single socket from multiple threads. Fixing the kernel by breaking up the lock is not easy. Instead, we used separate UDP sockets for transmitting replies (with one of these reply sockets per thread). With this change, we were able to deploy UDP without compromising performance on the backend.
I bolded the quote to show what their real problem was. They had a shit load of threads trying to use a single socket, and of course there was huge overhead involved due to the mutex lock (a semaphore on the kernel side) on a shared resource (the socket). So they blame Linux instead of themselves for such a half-assed implementation of sending out packets from multiple threads with a single socket. They would have gotten the same exact result if they tried it with a single TCP connection socket and attempted to have multiple threads firing off packets with that. If you want multiple threads sending out packets, use multiple sockets... Wow, what a concept!
Sorry for my ranting, but it just pisses me off when moron programmers blame the operating system for their own stupidity.
Anyway, haven't nearly all MMOs gone with using UDP inside the game cluster network and TCP externally to reduce latency and network overhead? So this is nothing new to me.
This space is not for rent.
Why not just multiplex memcached requests on a single connection at the web host level?
It's pretty impressive that Facebook has been able to grow so quickly and handle so much traffic. Their downtime has been pretty insignificant relative to the sheer number of requests that blow through their servers every day.
There's probably a thing or two that can be learned from their developers and IT folks. I just wish I knew more about the whole underlying structure so I could appreciate exactly what they've done.
Well, call me cynical but the things that interest me about Facebook are what has gone wrong. Like hackers selling account details for pennies. This is the end result:
The scam works by a victim clicking on a spam link that appears to be coming from one of their Facebook friends or someone in their address book which lodges spyware in their machine. This then records all the information, including passwords, when they log in to various sites.
The passwords can then be sent on to money-laundering gangs who use them to infiltrate users' bank accounts.
While this is true of any other networking site, I think this severe security issue needs to be addressed successfully one of these days.
... it's only a matter of time.
All I've seen Facebook do to remedy this is explain how to clean it off your computer.
I fear for the millions of homes where a kid logs onto Facebook, gets mail from Timmy, clicks the link, finds nothing and leaves. Mom and dad log into their online banking/credit card statement later that night and
My work here is dung.
I use a false name, and I don't post anything that can easily identify me. If I want a friend to associate with me, I let them know what to look for.
Now I get the mundane details of everyone's life, such as "Getting a haircut, yea!" on the rare occasions I check it. At least people can't bug me to be on it anymore.
I went to high school with the guy who wrote that post at facebook!
User is sent link, directed to website with malware payload, such as a 0-day IE exploit. User is running unpatched Windows, user is 0wned, PC is 0wned. Hilarities ensue.
It's just a standard trojan with an unusual delivery method of using fake Facebook profiles run by trojan bots. I can't see how this is Facebook's problem any more than it's your email program's fault that you clicked on a dodgy link without checking it.
There are some people from work who I added as friends, before I knew them really well. Now I get all their exciting updates like "So and So just joined the group 'Whereever you go, there is a Jew' or 'Jews are the nicest people'". This person is really nice at work, but I'd really like to sever this Facebook relationship. Not because they are Jewish, mind you, but because they wear their religion on their sleeve, have some strong religious views (they could easily be Hindu, Muslim, Christian and I'd think the same) and I don't like it. The problem is, I can't figure out how to either a) turn off their notifications or b) defriend them without causing an issue in the office.
This post brought to you by your friendly neighborhood MBA.
Yes, you can delete your account... not sure if Facebook purges the data from their servers, but it shouldn't be accessible to anyone else after you delete your profile.
You can also set it so that only certain groups of people (or no one at all) can see your profile, customizable on an item-by-item basis (including various things like phone, address, profile picture, status, birthday, birth year, friends list, bio, wall posts, videos, pictures) and/or comment on your wall, pictures/videos, or send you messages.
You can also tell it not to let search engines like Google find your profile, which I'd also recommend.
Actually, if you really want to play with it, I'd recommend that you register under a fake name and fool around with the security settings. If you're satisfied that it's private enough for your tastes you can put your real name and info up.
Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
I generally like looking through my friends' new pictures and sometimes their notes (if the note shows up in the feed and looks interesting).
Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
Like or hate social networking, Facebook has gone a long way in showing how well PHP can be made to scale. They also contribute quite a bit back to the PHP project and PHP-related projects.
Five years ago, if anyone came along saying they were going to build a website in PHP, /. would be up in arms calling them idiots of all sorts and saying they NEED to go with compiled C or Perl.
"Relationship"? You're talking about "Facebook friends" here. The vast majority of Facebook users probably don't even know half the people on their so-called "Friends list", much less consider them to be friends.
In reality, "facebook friends" doesn't mean much more than "people I probably wouldn't kill if I met them randomly on the street."
Get over it. If you don't want to delete them, then just ignore them... the same as you'd ignore a notification that said "Timmy just ate a booger".
Amazon and Google faced similar problems, and dealt with them in ways that are roughly equivalent - by adding a tuple store to their system.
If the data behind your web site is mostly accessed via one primary key, a tuple store, something that stores name/value pairs, beats a general-purpose relational database. Both Amazon and Google have such a mechanism in their "cloud" systems. Facebook has a somewhat low-rent solution; they're front-ending MySQL with a tuple store cache. This only works if all the queries contain some ID that has to match exactly, like user ID. Effectively, instead of one big database, the problem consists of a large number of tiny databases, all somewhat independent. Problems like that can be scaled up without much trouble.
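To make the "front-ending MySQL with a tuple store cache" idea concrete, here is a minimal sketch of the cache-aside pattern. Everything in it is an assumption for illustration: the `CacheAside` class, the plain dict standing in for memcached, and the `load_from_db` callable standing in for a MySQL lookup by primary key. It is not Facebook's code.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """Exact-match cache keyed by ID, sitting in front of a slower store."""

    def __init__(self, load_from_db: Callable[[int], Optional[dict]]):
        self._cache: Dict[int, dict] = {}   # hypothetical stand-in for memcached
        self._load_from_db = load_from_db   # hypothetical stand-in for a MySQL query
        self.db_hits = 0                    # counts how often we fell through to the store

    def get(self, user_id: int) -> Optional[dict]:
        # Fast path: the key matches exactly, so the cache can answer.
        if user_id in self._cache:
            return self._cache[user_id]
        # Slow path: ask the database, then remember the answer.
        self.db_hits += 1
        row = self._load_from_db(user_id)
        if row is not None:
            self._cache[user_id] = row
        return row

    def invalidate(self, user_id: int) -> None:
        # Call this after a write so the next read reloads fresh data.
        self._cache.pop(user_id, None)
```

Note that the lookup only works because every request carries the exact key; a range query or a join could not be answered from a cache shaped like this.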
Tuple stores distribute nicely - you can spread them over as many machines as you want, just by cutting up the keyspace into conveniently sized shards. There are distributed relational DBMS systems, but they have to be able to do inter-machine joins, which is a hard problem. (That's what you pay the big bucks to Oracle for.)
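Cutting up the keyspace can be sketched in a few lines. This is only an illustration under stated assumptions: `shard_for` and `ShardedStore` are hypothetical names, the shards are in-process dicts rather than separate machines, and hashing the key modulo the shard count is just one simple way to split a keyspace.

```python
import hashlib
from typing import Any, Dict, List, Optional


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Name/value store split across several independent shards."""

    def __init__(self, num_shards: int):
        # Each dict is a hypothetical stand-in for one machine.
        self._shards: List[Dict[str, Any]] = [dict() for _ in range(num_shards)]

    def put(self, key: str, value: Any) -> None:
        self._shards[shard_for(key, len(self._shards))][key] = value

    def get(self, key: str) -> Optional[Any]:
        return self._shards[shard_for(key, len(self._shards))].get(key)

    def shard_sizes(self) -> List[int]:
        return [len(s) for s in self._shards]
```

Because each key lives on exactly one shard, no operation ever touches two machines, which is why this scales so easily. One caveat: with plain modulo hashing, changing the number of shards remaps most keys, so real deployments often use consistent hashing instead.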
How does Facebook make money?
I've noticed a general slow down and unresponsiveness in facebook. It started when they rolled out the new fully ajaxified UI a few months back.
I figured the slow down was caused by the ajax but maybe it was the 600,000 new users getting added per day.
I hope facebook speeds up.
Doesn't take into account the 40 accounts I have. One for each time I get tired of having too many friends and not enough inclination to actually delete them all. Create, fill, overflow, start over.
flinging poop since 1969
The question is: as Netcraft counts MySpace accounts as 'Web sites' in its figures, will it now count these 140 million accounts as 'Web sites' also?
If not, whenever you look at Netcraft's figures, don't forget to add 140 million Apache sites to them (not to mention minusing all those GoDaddy parked domains from IIS).
140 million people need validation from a web page...
Browsing at +1 - no ACs, I ignore their posts. So refreshing!
if by validation you mean:
Being able to find old friends you haven't been able to contact in years.
Having a central pull information spot rather than the push model of spamming every email address you have with pics of the new baby, house, car, toaster.
A central and standardized organization spot for arranging informal gatherings with friends, like parties.
And 150 million of those users are bots.
Either that or facebook has tonnes of supermodels that have only two or three friends. ...not that I've been searching ;)
Wow, I'm surprised people still use Facebook. I cancelled my account... too many stupid application invites. Eventually it will be myspace, filled with the same people who do the Rogers Wireless commercials... The same people I target in parking lots when I decide I want to hit something.
when their little messages or updates show up, there's a little "Options" link that pops up when you mouseover the item. Click on that, and there are "More about John" "Less about John" links. Click the Less link. Do it anytime you see something about them. It should only take one or two clicks, and you'll never see anything about them ever again, even though they're still in your Friend list.
Our chance to slashdot facebook is diminishing as we speak!
From the article by Paul Saab:
"We discovered that under load on Linux, UDP performance was downright horrible. This is caused by considerable lock contention on the UDP socket lock when transmitting through a single socket from multiple threads. Fixing the kernel by breaking up the lock is not easy. Instead, we used separate UDP sockets for transmitting replies (with one of these reply sockets per thread). With this change, we were able to deploy UDP without compromising performance on the backend..."
He mentions at least 3 other problems which (to anyone wanting to get the job done well) read as "Linux is not the best OS for this job!", but they're still struggling with Linux and trying to hack up some kind of ad hoc solution. Why not just use FreeBSD instead?
No, this is not flamebait, I'm being serious.
I don't know the details of the 3 other problems, but using separate UDP sockets for replies to break up low-level contention is straight out of UNP - off the top of my head I've used it on at least 2 projects on Solaris. Doesn't sound like a Linux problem to my way of thinking.
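For anyone who hasn't seen the one-reply-socket-per-thread pattern, here is a rough sketch of its shape. It is an assumption-laden illustration, not memcached's implementation (which is in C): `worker` and `run_demo` are hypothetical names, everything runs on loopback, and because of Python's global interpreter lock it demonstrates the structure rather than any performance gain.

```python
import queue
import socket
import threading


def worker(jobs: queue.Queue, reply_ports: list) -> None:
    # Each worker owns its own UDP socket, so no two threads ever
    # contend for the same socket when sending replies.
    reply_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    reply_sock.bind(("127.0.0.1", 0))
    reply_ports.append(reply_sock.getsockname()[1])
    while True:
        job = jobs.get()
        if job is None:          # sentinel: shut down
            break
        data, addr = job
        reply_sock.sendto(data.upper(), addr)
    reply_sock.close()


def run_demo(num_workers: int = 4, num_requests: int = 20):
    # One shared socket receives requests; replies leave via per-thread sockets.
    listen_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    listen_sock.bind(("127.0.0.1", 0))
    listen_sock.settimeout(2.0)
    server_addr = listen_sock.getsockname()

    jobs: queue.Queue = queue.Queue()
    reply_ports: list = []
    threads = [threading.Thread(target=worker, args=(jobs, reply_ports))
               for _ in range(num_workers)]
    for t in threads:
        t.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.bind(("127.0.0.1", 0))
    client.settimeout(2.0)

    for i in range(num_requests):
        client.sendto(f"req{i}".encode(), server_addr)
        jobs.put(listen_sock.recvfrom(2048))   # (data, client address)

    replies = [client.recvfrom(2048)[0] for _ in range(num_requests)]

    for _ in threads:
        jobs.put(None)
    for t in threads:
        t.join()
    client.close()
    listen_sock.close()
    return sorted(replies), reply_ports
```

Since UDP is connectionless, the client accepts replies from whatever source port they arrive on, which is what lets each thread answer through its own socket.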
Count me out. That social networking crap is a load of BS. People do NOT need to be in touch that often and that deeply. Learn some privacy folks.
The traffic levels aren't even close.
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
Ah, I noticed that option and have tried it once or twice before. I will click it a few more times. Thanks for the advice!
This post brought to you by your friendly neighborhood MBA.
Simple solution is don't tell facebook ...and then Facebook will not know
If your phone number, address, work history, educational history is on facebook then you are foolish, your friends and family already know this information (or don't care), and facebook does not need to know
The security on facebook should be assumed to be flawed, since it is unlikely to be perfect and so you should not put any more information on than you would be willing to let everyone on facebook see ...
Puteulanus fenestra mortis
But did they test any other OS's UDP performance under load? Not being a fanboy...
Could very well be that that would be horrible as well, unless it was designed from the start for a high load, which would probably bring you to Solaris or similar.