Slashdot Mirror


Facebook's Corona: When Hadoop MapReduce Wasn't Enough

Nerval's Lobster writes "Facebook's engineers face a considerable challenge when it comes to managing the tidal wave of data flowing through the company's infrastructure. Its data warehouse, which handles over half a petabyte of information each day, has expanded some 2500x in the past four years, and that growth isn't going to end anytime soon. Until early 2011, those engineers relied on a MapReduce implementation from Apache Hadoop as the foundation of Facebook's data infrastructure. Still, despite Hadoop MapReduce's ability to handle large datasets, Facebook's scheduling framework (in which a large number of task trackers handle duties assigned by a job tracker) began to reach its limits. So Facebook's engineers went to the whiteboard and designed a new scheduling framework named Corona." Facebook is continuing development on Corona, but they've also open-sourced the version they currently use.

2 of 42 comments

  1. Re:Junk. by Revotron · · Score: 4, Funny
    Yes, Facebook sure would be a lot more successful if 99.9% of people's posts got deleted and replaced with an on-screen notification that reads,

    This post has been removed because it is of no interest to Anonymous Coward. Please try posting things more in line with the following categories:

    1. Linux
    2. Open-source software
    3. Richard M Stallman
    4. OMG!!! PONIES!!!

  2. Re:Misleading headline by ArcadeMan · · Score: 4, Funny
    And why the fuck should I care about Windows 8 tablets? You are not making any sense!