Slashdot Mirror


Scaling Facebook To 140 Million Users

1sockchuck writes "Facebook now has 140 million users, and in recent weeks has been adding 600,000 new users a day. To keep pace with that growth, the Facebook engineering team has been tweaking its use of memcached, and says it can now handle 200,000 UDP requests per second. Facebook has detailed its refinements to memcached, which it hopes will be included in the official memcached repository. For now, their changes have been released to github."

35 of 178 comments

  1. Thank goodness by KeithJM · · Score: 5, Funny

    I was losing sleep worrying that people sending me virtual Christmas tree decorations, garden accessories and such would have to wait 3 seconds after they clicked send.

    1. Re:Thank goodness by zappepcs · · Score: 3, Funny

      I know what you mean, but I don't have that trouble much. Using FF with plugins I don't see much advertising at all. Sometimes, when I'm feeling nostalgic, I'll surf using the SeaMonkey browser because I left it default bare. That way I can see all those ads from doubleclick et al if I want to.

      Sad but true, I don't get nostalgic much :-)

    2. Re:Thank goodness by Coopa · · Score: 3, Informative

      I recently had trouble with my copy of Firefox on my home desktop. Even though adblock and filterset updater were installed, I wasn't blocking any ads (I've since fixed it).

      I was amazed at how many sites I regularly frequent that are now plastered in ads and horrible to use.

  2. [Unintelligible] Facebook [Unintelligible] by Jonah+Bomber · · Score: 3, Funny

    The only word I understood in this post was "Facebook."

    1. Re:[Unintelligible] Facebook [Unintelligible] by Arthur+B. · · Score: 4, Funny

      You're wrong, that's five words.

      --
      ☭ = 卐
    2. Re:[Unintelligible] Facebook [Unintelligible] by CannonballHead · · Score: 4, Funny

      If you want to be *quite* technical (and I think it's quite hilarious we're being modded "informative" and "insightful"), the string "140 millions" would be broken into only four words in correct English: One hundred forty millions.

      I presume the "five words" comes from the usual way to say it, one hundred and forty millions, which is technically incorrect as the "and" should refer to the decimal point, as in thirty-two and five one-hundredths.

      I am unsure about the hyphen between one and hundred, though...

  3. Impressive by txoof · · Score: 4, Interesting

    It's pretty impressive that Facebook has been able to grow so quickly and handle so much traffic. Their downtime has been pretty insignificant relative to the sheer number of requests that blow through their servers every day.

    There's probably a thing or two that can be learned from their developers and IT folks. I just wish I knew more about the whole underlying structure so I could appreciate exactly what they've done.

    --
    This one's tricky. You have to use imaginary numbers, like eleventeen... --Hobbes
    1. Re:Impressive by madhurms · · Score: 3, Interesting

      Here is a presentation which discusses how Facebook handles billions of photos. That should give an idea about how they handle massive load in other areas: http://www.flowgram.com/f/p.html#2qi3k8eicrfgkv

  4. Pretty impressive operation by pintpusher · · Score: 4, Interesting

    at least for me being a 38yo undergrad.

    We had one of their engineers give a talk a couple of weeks ago. The most recent number he had was 120 million members (who've logged on in the last 30 days) and over 65 billion page views per month. And they do it with 200 or so engineers.

    I was fully expecting (being interested primarily in verifiable systems and fp) to be annoyed by this talk, but they have some pretty interesting problems to solve over there. The fact that they're doing it with OSS, and giving back to boot, really made my day.

    --
    man, I feel like mold.
    1. Re:Pretty impressive operation by SatanicPuppy · · Score: 4, Funny

      Yea, but if they could do it with Windows, now that would be a challenge!

      --
      ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
  5. Blaming Linux... by TypoNAM · · Score: 5, Insightful

    Is it just me or does the entire first part of the article scream "Linux is to blame!" when they were discussing UDP network overhead issues in their software? For example:

    We discovered that under load on Linux, UDP performance was downright horrible. This is caused by considerable lock contention on the UDP socket lock when transmitting through a single socket from multiple threads. Fixing the kernel by breaking up the lock is not easy. Instead, we used separate UDP sockets for transmitting replies (with one of these reply sockets per thread). With this change, we were able to deploy UDP without compromising performance on the backend.

    I bolded the quote to show what their real problem was. They had a shitload of threads trying to use a single socket, and of course there was huge overhead due to the mutex lock (a semaphore on the kernel side) on a shared resource (the socket). So they blame Linux instead of themselves for such a half-assed implementation of sending out packets from multiple threads with a single socket. They would have gotten the exact same result if they had tried it with a single TCP connection socket and had multiple threads firing off packets with that. If you want multiple threads sending out packets, use multiple sockets... Wow, what a concept!

    Sorry for my ranting, but it just pisses me off when moron programmers blame the operating system for their own stupidity.

    Anyway, haven't nearly all MMOs gone with UDP inside the game cluster network and TCP externally to reduce latency and network overhead? So this is nothing new to me.
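
    The one-reply-socket-per-thread fix from the quoted passage can be sketched roughly as follows. This is a minimal Python illustration, not memcached's code: it assumes a made-up protocol in which the reply is `VALUE ` plus the request bytes, and the names `serve` and `demo` are hypothetical. One socket receives requests; each worker thread transmits through its own socket, so no two threads share a socket when sending.

```python
import queue
import socket
import threading


def serve(recv_sock, n_workers, n_requests):
    """Receive n_requests datagrams; reply through one socket per worker."""
    work = queue.Queue()

    def worker():
        # Each worker owns its reply socket, so sending threads never
        # contend for the lock on a single shared socket.
        reply_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            while True:
                item = work.get()
                if item is None:  # sentinel: no more work
                    return
                payload, addr = item
                # Made-up reply format, assumed for illustration only.
                reply_sock.sendto(b"VALUE " + payload, addr)
        finally:
            reply_sock.close()

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    try:
        for _ in range(n_requests):
            # recvfrom returns a (payload, sender address) pair.
            work.put(recv_sock.recvfrom(2048))
    finally:
        for _ in threads:
            work.put(None)
        for t in threads:
            t.join()


def demo(n_requests=8, n_workers=4):
    """Run server and client on loopback; return the sorted replies."""
    recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    recv_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    recv_sock.settimeout(5.0)
    server = threading.Thread(
        target=serve, args=(recv_sock, n_workers, n_requests))
    server.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.bind(("127.0.0.1", 0))
    client.settimeout(5.0)
    try:
        for i in range(n_requests):
            client.sendto(b"key%d" % i, recv_sock.getsockname())
        replies = sorted(client.recvfrom(2048)[0] for _ in range(n_requests))
    finally:
        server.join()
        client.close()
        recv_sock.close()
    return replies


if __name__ == "__main__":
    print(demo())
```

    One consequence of this design: replies leave from a different source port than the one the request was sent to, so a client has to match replies to requests by their content (memcached's UDP frame header carries a request ID for this) rather than by source address.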

    --
    This space is not for rent.
    1. Re:Blaming Linux... by imboboage0 · · Score: 5, Insightful

      No... I don't think they were really blaming Linux. If anything, I'd say they were praising it for having the functionality to be modified to fit their needs. They admitted that the previous configuration they had wasn't ideal, and they fixed it. I think the important part here is that they used Linux to fix it, they continue to use Linux, they documented the fix, and now they are giving back to the OSS community with information on how they did it.

      --
      Honesty may be the best policy, but by process of elimination, dishonesty is the second best policy.
    2. Re:Blaming Linux... by blitzkrieg3 · · Score: 4, Insightful

      They said that "on Linux, UDP performance was downright horrible."

      This statement is just downright disingenuous and wrong. UDP performance in general on Linux is comparable to or better than that of other operating systems. What he found out is that accessing a single UDP socket on Linux requires a lock, and that when multiple threads contend for that lock you have a performance issue. Welcome to intro-level operating systems.

      This has nothing to do with UDP performance, which I define as either throughput or in some cases packets per second. He then goes on to imply that he worked around some issues in Linux, when in actuality he attacked the problem from the wrong angle and through trial and error found the obvious solution. Why would you even think to use the same socket in a connectionless protocol like UDP in the first place?

      I do agree that in general the article was written in more or less praise of Linux, but reading that sentence makes my blood boil.

    3. Re:Blaming Linux... by blitzkrieg3 · · Score: 3, Informative

      2. If you'd read the next sentence right after your bold line, you'd notice they were talking about a kernel lock. Not a lock in memcached. That's a totally valid reason to blame Linux.

      How do you hope to architect a fix for this? Though I don't know the specifics, they said that they were using the same UDP socket to transmit from multiple threads. That means you have one kernel space data structure across the entire UDP/IP stack being shared by multiple threads. Therefore you need a lock around updates to that data structure.

      Until we see some atomic sendto() operations this is not going to change.

    4. Re:Blaming Linux... by hesaigo999ca · · Score: 4, Insightful

      Too often the people that are left to explain the problem in detail to the press are not the engineers that worked on the solution for that problem. If we had a discussion with one of them, we would hear a totally different story!

    5. Re:Blaming Linux... by ranulf · · Score: 4, Insightful

      [...] So they blame Linux instead of themselves for such a half-assed implementation of sending out packets from multiple threads with a single socket.[...]

      Sorry for my ranting, but it just pisses me off when moron programmers blame the operating system for their own stupidity.

      The point is that it wasn't their own stupidity. They took someone's open source project and improved it so it could better handle high loads. I don't see them blaming Linux, I see them recognising the limitations of the system they are using and coming up with a solution and then sharing it. Normally, this is cause to say "Yay! Open source!" rather than calling them "moron programmers".

    6. Re:Blaming Linux... by Chirs · · Score: 3, Informative

      Mutexes aren't always slow. In the uncontended case they don't require a system call (although they do require an atomic operation which involves some inter-processor signalling).

      Lockless algorithms are generally harder to get right, from what I've seen. It's not just locking the cpus for a cycle, but you also need to worry about using memory barriers (generally written in assembly) to enforce correct visibility across all cpus in the system.

      There are guys on comp.programming.threads that spend a *lot* of time trying to perfect them, and there are often subtle errors that pop up later on. Given the number of problems that regular lock-based algorithms cause, I'd only use lockless if it's absolutely necessary.
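
      For comparison, the plain lock-based approach looks something like this minimal Python sketch (the names `LockedCounter` and `count_with_threads` are hypothetical). It says nothing about memory barriers or lock-free techniques, which Python hides from the programmer; it only shows how little there is to get wrong in the ordinary mutex version.

```python
import threading


class LockedCounter:
    """Shared counter protected by an ordinary mutex."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        # The read-modify-write happens entirely while holding the lock,
        # so concurrent increments cannot be lost.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def count_with_threads(n_threads=4, n_increments=10000):
    """Increment one shared counter from several threads; return the total."""
    counter = LockedCounter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```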

  6. Re:I have been wondering for a while... by larry+bagina · · Score: 5, Informative

    MySpace used to run on ColdFusion but switched to .NET. Facebook runs on LAMP, though they have a customized MySQL and a customized Linux kernel with support for the hierarchical page pinning algorithm.

    --
    Do you even lift?

    These aren't the 'roids you're looking for.

  7. That is *not* a Facebook problem by RMH101 · · Score: 4, Insightful

    User is sent link, directed to website with malware payload, such as a 0-day IE exploit. User is running unpatched Windows, user is 0wned, PC is 0wned. Hilarity ensues.
    It's just a standard trojan with an unusual delivery method of using fake Facebook profiles run by trojan bots. I can't see how this is Facebook's problem any more than it's your email program's fault that you clicked on a dodgy link without checking it.

  8. Re:... And Yet Very Lacking From a Security Angle by bigstrat2003 · · Score: 4, Insightful

    It can't be addressed... because it's not a security issue with the site. It's an issue that the user needs to be trained on how to spot, and good luck getting that to happen.

    I mean, come on, banks have the "problem" you described, and most banks aren't what we'd call insecure.

    --
    "16MB (fuck off, MiB fascists)" - The Mighty Buzzard
  9. "PHP Doesn't Scale" by 0100010001010011 · · Score: 4, Interesting

    Like or hate social networking, Facebook has gone a long way in showing how well PHP can be made to scale. They also contribute quite a bit back to the PHP project and PHP-related projects.

    5 years ago, if anyone came along saying they were going to build a website in PHP, /. would be up in arms calling them idiots of all sorts and saying they NEED to go with compiled C or Perl.

    1. Re:"PHP Doesn't Scale" by guruevi · · Score: 3, Interesting

      PHP is good for all types of projects. It's the use of PHP that makes the difference. If you write clear, intelligent and documented code it runs fine. It's even better if you use good function design and definitions. It's plenty fast too and can be pre-compiled or cached. It's also good at scaling because the programmer has only minimal interaction with threading, locking and similar issues; PHP leaves most of that to the surrounding software (Apache, IIS, MySQL).

      Programming in PHP is a lot like programming in Java: you have a bad developer and your code will run as slow as hell and will be difficult to maintain. Coding is simple and the optimization is minimal because it's a quite high level language. There are of course a lot of inherited problems in PHP (magic quotes and safe mode to start off with) but with PHP5 and PHP6 they are slowly being phased out. But if you do it well, you can write very secure and fast applications in PHP.

      --
      Custom electronics and digital signage for your business: www.evcircuits.com
  10. Re:... And Yet Very Lacking From a Security Angle by gnick · · Score: 5, Interesting

    Facebook would do well to proactively encourage users to prevent such attacks by securing their systems. For example, by installing this simple application, you can ensure that your computer will never fall victim to malware:
    http://not-malware.i-promise.org/magic-bullet.htm
    Just enable scripts and click OK whenever it tells you to. It's that easy.

    Now, if /. allowed me to post the (fake) link above, how are they any more at fault than facebook is for allowing potentially dodgy links to be shared via their service? They even went the extra step of helping users remove the malware from their PCs. I'd imagine that most conduits for malicious links (IM, social networking, e-mail, online forums, etc) wouldn't have even gone that far. Their users were being targeted and exploited, so they helped them avoid being taken advantage of - Good on 'em.

    Were I malicious, I could grab the e-mail address you share in your title line, look through your /. 'friends' list for other accounts with posted addresses, and e-mail you a malicious link "From" one of them. How would that be different?

    --
    He's getting rather old, but he's a good mouse.
  11. They built a tuple store. by Animats · · Score: 5, Interesting

    Amazon and Google faced similar problems, and dealt with them in ways that are roughly equivalent - by adding a tuple store to their system.

    If the data behind your web site is mostly accessed via one primary key, a tuple store, something that stores name/value pairs, beats a general-purpose relational database. Both Amazon and Google have such a mechanism in their "cloud" systems. Facebook has a somewhat low-rent solution; they're front-ending MySQL with a tuple store cache. This only works if all the queries contain some ID that has to match exactly, like user ID. Effectively, instead of one big database, the problem consists of a large number of tiny databases, all somewhat independent. Problems like that can be scaled up without much trouble.
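
    A rough sketch of what "front-ending a database with a tuple store cache" means, in Python. Everything here is an assumption for illustration: `ReadThroughCache` and `demo` are hypothetical names, and a plain dict stands in for the MySQL table. Lookups only work by exact key, which is the restriction described above.

```python
class ReadThroughCache:
    """Exact-match key/value cache in front of a slower backing store."""

    def __init__(self, load_from_db):
        self._load = load_from_db  # hypothetical callable: key -> value
        self._cache = {}
        self.misses = 0

    def get(self, key):
        # Serve from the cache when possible; otherwise load the value
        # from the backing store once and remember it.
        if key in self._cache:
            return self._cache[key]
        self.misses += 1
        value = self._load(key)
        self._cache[key] = value
        return value

    def invalidate(self, key):
        # Call after the backing store changes so stale data is dropped.
        self._cache.pop(key, None)


def demo():
    db = {42: {"name": "alice"}}  # stand-in for a database table
    cache = ReadThroughCache(db.__getitem__)
    first = cache.get(42)    # miss: loaded from db
    second = cache.get(42)   # hit: served from the cache
    db[42] = {"name": "alice2"}
    cache.invalidate(42)
    third = cache.get(42)    # miss again: reloaded after invalidation
    return first, second, third, cache.misses
```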

    Tuple stores distribute nicely - you can spread them over as many machines as you want, just by cutting up the keyspace into conveniently sized shards. There are distributed relational DBMS systems, but they have to be able to do inter-machine joins, which is a hard problem. (That's what you pay the big bucks to Oracle for.)
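
    Cutting up the keyspace can be sketched like this, again in Python with hypothetical names (`shard_for`, `ShardedStore`), and with in-process dicts standing in for separate machines. It uses simple hash-modulo placement, which is an assumption and not necessarily what any of the companies mentioned use.

```python
import hashlib


def shard_for(key, n_shards):
    """Map a string key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_shards


class ShardedStore:
    """Name/value store split across n_shards independent dicts."""

    def __init__(self, n_shards):
        # Each dict stands in for one machine holding part of the keyspace.
        self._shards = [{} for _ in range(n_shards)]

    def _shard(self, key):
        return self._shards[shard_for(key, len(self._shards))]

    def put(self, key, value):
        self._shard(key)[key] = value

    def get(self, key, default=None):
        return self._shard(key).get(key, default)

    def shard_sizes(self):
        return [len(s) for s in self._shards]
```

    Because every operation touches exactly one shard, no inter-machine joins are needed. The catch with plain modulo placement is that changing the number of shards moves most keys; consistent hashing is the usual way to limit that.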

  12. Re:Wow by madhurms · · Score: 5, Informative

    From a hardware perspective, Facebook uses 10,000 web servers and 1800 database servers to handle the massive traffic.

  13. Re:I have been wondering for a while... by duguk · · Score: 4, Informative

    Nope. Facebook has more unique visitors per month. MySpace had approximately 106 million users as of 8th September 2008, and FTFS, Facebook has 140 million (Wikipedia says 120 million).

  14. Re:Wow by aliquis · · Score: 3, Funny

    "Your business sounds more important with VMware!"

  15. Re:Wow by madhurms · · Score: 5, Informative

    And they also use about 200 memcached servers to speed things up.

    Source: http://frro.net/blog/2008/04/26/just-how-big-is-facebooks-infrastructure/

  16. Re:... And Yet Very Lacking From a Security Angle by jcarkeys · · Score: 3, Interesting

    Actually, they recently created a "go-between" page for all external links, I believe. It repeats what URL is being requested and then has a button that says "go there anyway". The ones that are known viruses are completely blocked.
    That sounds pretty proactive to me.

  17. Re:... And Yet Very Lacking From a Security Angle by dubbreak · · Score: 5, Funny

    That link is dead. Could you repost a working link?

    I really need that application. I get so many viruses.

    --
    "If you are going through hell, keep going." - Winston Churchill
  18. Yes, by internerdj · · Score: 3, Insightful

    if by validation you mean:
    Being able to find old friends you haven't been able to contact in years.
    Having a central pull information spot rather than the push model of spamming every email address you have with pics of the new baby, house, car, toaster.
    A central and standardized organization spot for arranging informal gatherings with friends, like parties.

    1. Re:Yes, by TheTyrannyOfForcedRe · · Score: 3, Funny

      What are these "parties" you speak of?

      --
      "Liechtenstein is the world's largest producer of sausage casings, potassium storage units, and false teeth."
  19. Re:Wow by Anonymous Coward · · Score: 3, Funny

    From a hardware perspective, Facebook uses 10,000 web servers and 1800 database servers to handle the massive traffic.

    That's funny because the Russian Business Network uses a 250,000 strong zombie botnet to create the Facebook accounts and massive traffic...

  20. Upgrading.... by supernova_hq · · Score: 3, Funny

    Our chance to slashdot facebook is diminishing as we speak!

  21. Re:Wow by Baton+Rogue · · Score: 3, Insightful

    What they know about you can fill a warehouse.

    What they know about you is only what you tell them.