Facebook's Prism, Soon To Be Open Sourced, Gives Hadoop Delay Tolerance

Posted by timothy on Saturday November 3, 2012 @10:22AM from the and-eventually-smoke-signals dept.

snydeq writes "Facebook has said that it will soon open source Prism, an internal project that supports geographically distributed Hadoop data stores, thereby removing the limits on Hadoop's capacity to crunch data. 'The problem is that Hadoop must confine data to one physical data center location. Although Hadoop is a batch processing system, it's tightly coupled, and it will not tolerate more than a few milliseconds delay among servers in a Hadoop cluster. With Prism, a logical abstraction layer is added so that a Hadoop cluster can run across multiple data centers, effectively removing limits on capacity.'"

17 comments

This changes everything by Anonymous Coward · 2012-11-03 10:30 · Score: 1

...but when?
1. Re:This changes everything by interval1066 · 2012-11-03 10:36 · Score: 4, Insightful
  
  Actually, it's pretty cool. It's a solution to a problem that needed a solution, for once. Quite frankly, even though I'm not an army of PhD C-Sci scientists, I'm sorry I couldn't have come up with it. It's weird little problems like this, with their solutions, that win the "cool" race. Or the "king of geeks" race, or whatever you want to call the brainiac metric.
  
  --
  Python: 'And then suddenly you have a language which says "we're all stuck with whatever the whiniest coder wants".'
2. Re:This changes everything by nurb432 · 2012-11-03 10:42 · Score: 1
  
  soon
  
  --
  ---- Booth was a patriot ----
3. Re:This changes everything by Anonymous Coward · 2012-11-03 10:57 · Score: 4, Funny
  
  Sounds like it shouldn't be hard to...Hadooplicate?
4. Re:This changes everything by Anonymous Coward · 2012-11-03 11:20 · Score: 0
  
  So cheesy, but so funny.
I think it's great by koan · 2012-11-03 14:34 · Score: 1

Love to see useful stuff open sourced, but part of me is annoyed it is Facebook doing it.

--
"If any question why we died, Tell them because our fathers lied."
1. Re:I think it's great by Anonymous Coward · 2012-11-03 15:04 · Score: 0
  
  Oh don't worry, knowing Facebook, using their software will probably Hadooplicate your data and make it all public.
  And then monetize it.
  And then corrupt it.
  And then lose it.
2. Re:I think it's great by Anonymous Coward · 2012-11-03 17:06 · Score: 0
  
  Glad your understanding of how open source works is so lacking. You and the grandparent poster shouldn't stop your nerd rage though. Keep up the virginity.
3. Re:I think it's great by Anonymous Coward · 2012-11-04 05:53 · Score: 0
  
  As much as I hate Facebook, they really do have some great engineers and scientists working behind the scenes.
4. Re:I think it's great by Anonymous Coward · 2012-11-04 06:38 · Score: 0
  
  Glad your understanding of how satire works is so lacking.
What? by fa2k · 2012-11-04 01:36 · Score: 1

You typically have O(ms) seek latency for hard drives, does this mean that Facebook had all data in RAM before they made Prism?
1. Re:What? by garaged · 2012-11-04 14:25 · Score: 1
  
  Google does that; I wouldn't be surprised if Facebook does it too.
  
  --
  I'm positive, don't believe me look at my karma
2. Re:What? by haruchai · 2012-11-04 16:06 · Score: 1
  
  AltaVista was doing this way back. When the typical Windows desktop had 16-32 MB of RAM, they had a RAM cache of up to 64 GB.
  
  --
  Pain is merely failure leaving the body
Facebook Prison by SirAdelaide · 2012-11-04 15:25 · Score: 1

I misread the title, was disappointed until I saw the word Hadoop. It's such a silly name.

--
I'm a fruit pirate. I bought a watermelon once, and spat the seeds in the back yard. They grew into another watermelon,
(Relatively) lay explanation of bottleneck? by DriedClexler · 2012-11-04 16:29 · Score: 1

What is the sub-problem when running a Hadoop job that has this bottleneck and requires such low latency? Is it something that could have been avoided from the start?
And how does (or if, predictably, the media reports don't explain it, *would*) a logical abstraction layer solve this problem such that Hadoop's programmers couldn't have more easily done it within the application's own code?

--
Information theory is life. The rest is just the KL divergence.