Where Facebook Stores 900 Million New Photos Per Day
1sockchuck writes: Facebook faces unique storage challenges. Its users upload 900 million new images daily, most of which are only viewed for a couple of days. The social network has built specialized cold storage facilities to manage these rarely-accessed photos. Data Center Frontier goes inside this facility, providing a closer look at Facebook's newest strategy: Using thousands of Blu-Ray disks to store images, complete with a robotic retrieval system (see video demo). Others are interested as well. Sony recently acquired a Blu-Ray storage startup founded by Open Compute chairman Frank Frankovsky, which hopes to drive enterprise adoption of optical data storage.
They could just delete most of the photos once they age a bit, after analyzing them with some of their AI whiz-bang software.
If anyone ever asks to see the image again, they can just show one that is "close enough" and nobody would ever know the difference.
Personally, I have never posted a photo to Facebook, so I'd be OK with that.
This issue is a bit more complicated than you think.
Datageddon.
After 3 months of no views, just replace them with a goatse image.
That way, you only need to store one image which replaces 99.999% of all pics uploaded. No need for complex storage solutions!
Another advantage would be that you can serve it really, really fast. No wait time!
...gis sdrawkcab (usually not responding to ACs; don't bother posting as AC)
What happens when a user wants to delete an image permanently? If it's stored on an optical disc, are they going to destroy the whole disc and burn it again?
- Dan
Should have read:
You won't believe this one weird trick Facebook uses to store data!
Other than that, fascinating look at how all that data is being stored and retrieved.
So Sony are really paying for a piece of Facebook's influence.
Are we talking Mr. Freeze (Batman nemesis) cold, Hell-freezes-over cold, inter-stellar-space-cold, atoms-trapped-into-near-motionlessness-by-laser-beams-cold, or CowboyNeal's-super-chilled-CPU-cold?
Just asking.
is that their monthly AWS fees must be ENORMOUS!
is it cold in here or what?
I've noticed large latency for rarely used pictures in FB for over eight months now, and by large latency I mean visit the page, then come back the next day to see the next batch of > 5 year old pictures and wait another day for the final batch of ~10 years ago pictures.
Let's all start browsing old photos at once to drive their Blu-ray system crazy! :D
I went to my bank once about 15 years ago to ask a question about my account. The teller turned around and pulled what looked like a shoebox off of a shelf and leafed through it to find the information.
If shoeboxes are good enough for banks, they should be good enough for Facebook.
FTA:
Why wouldn't they need to spin them up? Is there some technology that can read a Blu-Ray without spinning it?
People upload the same memes all the time. Just hash and store the common images and you'll reduce the unique photos to one or two unique images per day. :)
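The hash-and-store idea above can be sketched as a content-addressed store. This is only a minimal illustration, not a description of how Facebook actually stores photos: the `DedupStore` class and its method names are hypothetical, and hashing the raw bytes with SHA-256 only catches byte-identical uploads (a resized or re-encoded copy of the same meme would get a different hash).

```python
import hashlib


class DedupStore:
    """Hypothetical in-memory store that keeps one copy per unique byte string."""

    def __init__(self):
        self.blobs = {}    # SHA-256 hex digest -> image bytes
        self.uploads = 0   # total uploads seen, duplicates included

    def put(self, data: bytes) -> str:
        """Store the bytes if unseen and return their digest as the lookup key."""
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # no-op when the digest already exists
        self.uploads += 1
        return digest

    def get(self, digest: str) -> bytes:
        """Fetch the single stored copy for a digest."""
        return self.blobs[digest]


store = DedupStore()
# Two identical uploads and one different upload (placeholder byte strings).
keys = [store.put(b) for b in (b"meme-bytes", b"meme-bytes", b"cat-bytes")]
print(store.uploads, "uploads stored as", len(store.blobs), "unique blobs")
```

Each user's album would then hold only the digest, so deleting one user's copy never touches the shared blob until no album refers to it.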
I wonder if Facebook does something similar to what Google has done with its picture service. I know Google has said it slightly reduces the quality to save space. Some photographers have protested Google making this reduction the default rather than letting the user decide on quality. You can obviously opt in to not having your pictures compressed. To me, Facebook is no more trustworthy with personal property than Google. Portable storage is so cheap that you can keep plenty of pictures on small drives that fit in a pocket. I get that Facebook is a social site and that easy access is probably the point of using it. I just wonder, as users get drawn further and further into Facebook's services, when Facebook will begin to add fees to them.
An interesting question is at what point does it become viable for FB to follow Amazon's model to scale its own system as a business unit...
As in, when will FB conclude that it again needs to widen its revenue stream portfolio, and it therefore makes sense to offer its own version of AWS?
Any predictions on FBWS?
And there's the FB hardware development division, a business unit that so far has also remained in-house but has its own revenue potential. I think people tend to underestimate MZ's ambitions to leverage the FB core to create a broad spectrum business. (following Google's leverage of search revenue to devour the advertising business, etc., and Amazon's leveraging of book sales to devour retailing and then logistics, etc.)
Pretending this is my office full of bitter coworkers..
I can pretty much guarantee they already do that.
The NSA won't be happy about this... they need instantaneous access to every cat photo ever posted, so they can mine it for "intelligence" to save us from the terrists. A search algorithm that has to pull data from DVDs moved around by robots is really going to hinder this!
Wow, they discovered HSM only 40 years after it was introduced. Amazing.
Is Facebook still a thing? People still use it after all the security problems and personal information screw-ups?
Just cruising through this digital world at 33 1/3 rpm...
And one well-crafted discovery motion ... bankrupts Facebook.
Sounds good!
How is using Blu-ray cheaper than hard drives? Not only is it slower, but you also have to pay for the media, the hardware to burn them, and the robotic retrieval system.
Seems like there could be an easier solution to this: hard drives in racks. No robots, no optical drives, and no Blu-ray discs.
One 500 GB hard drive already has 10x the storage of a dual-layer Blu-ray. In fact, a 10-pack of dual-layer Blu-ray discs on Amazon costs twice as much as a 500 GB 3.5" drive. Am I missing something?
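The comparison above reduces to a cost-per-gigabyte ratio. Here is a rough sketch using only the comment's own figures plus the standard 50 GB capacity of a dual-layer Blu-ray disc. The function name is made up for illustration, prices are expressed in units of "one 500 GB drive" because the comment gives no dollar amounts, and the calculation covers retail media cost only (no power, media lifetime, burners, or robotics).

```python
DUAL_LAYER_BD_GB = 50        # standard dual-layer Blu-ray capacity
HDD_GB = 500                 # drive size used in the comment
PACK_SIZE = 10               # discs per pack
PACK_PRICE_IN_DRIVES = 2.0   # the comment's claim: a 10-pack costs twice one drive


def disc_to_hdd_cost_per_gb(pack_price_in_drives: float,
                            pack_size: int,
                            disc_gb: float,
                            hdd_gb: float) -> float:
    """Return (disc cost per GB) / (drive cost per GB).

    With the drive price normalized to 1, the drive costs 1/hdd_gb per GB and
    the pack costs pack_price_in_drives / (pack_size * disc_gb) per GB.
    """
    pack_gb = pack_size * disc_gb
    return pack_price_in_drives * hdd_gb / pack_gb


ratio = disc_to_hdd_cost_per_gb(PACK_PRICE_IN_DRIVES, PACK_SIZE,
                                DUAL_LAYER_BD_GB, HDD_GB)
print(f"Discs cost {ratio:.1f}x as much per GB as the drive")
```

Because the 10-pack holds the same 500 GB as the drive, the price ratio and the per-GB ratio come out the same under these assumptions.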
Store them in /dev/null and do the world a favour.
In my ass.
First thing I did was open facebook and look to see what my oldest picture was. I don't have that many and it came up pretty quickly but I'm sure lots of other people had the same impulse.
Replace images of people's food with a stock image, and they could dispense with this whole system.
Didn't we see a story about this last year?
It might be that using Blu-Ray autochangers may be a very useful thing to have, especially for something that can fill the gap between HDDs and LTO tapes for backups [1].
The pathetic thing is that this technology isn't new. We used to have 100-, 200-, even 400-disc CD and DVD carousels. Replace the CD reader with a burner and use 128 GB BDXL media, and you get tens of terabytes of tamper-resistant (important with all the ransomware out there) WORM storage.
The trick is getting BD media into the terabytes and getting it to a price point where it is decently affordable. For example, a 100 GB BDXL disc is $65, but it should be about 10% of that price in order to be a viable backup medium.
[1]: The cloud isn't an option in a number of cases (WAN bandwidth isn't cheap), and it is only a matter of time before a major provider gets hacked.
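The carousel figures in the comment above can be checked with a few lines of arithmetic. This sketch uses only the numbers quoted there (400 slots, 128 GB BDXL discs, $65 for a 100 GB disc, a target of about 10% of that price). The helper names are hypothetical, and 1 TB is taken as 1000 GB.

```python
def carousel_capacity_tb(slots: int, disc_gb: float) -> float:
    """Total capacity of a fully loaded carousel, in decimal terabytes."""
    return slots * disc_gb / 1000


def price_per_gb(disc_price: float, disc_gb: float) -> float:
    """Media cost per gigabyte for a single disc."""
    return disc_price / disc_gb


# Largest carousel mentioned in the comment, loaded with 128 GB BDXL media.
capacity_tb = carousel_capacity_tb(400, 128)

# Quoted price for a 100 GB disc, and the comment's target of roughly 10% of it.
current_per_gb = price_per_gb(65, 100)
target_disc_price = 65 * 0.10
target_per_gb = price_per_gb(target_disc_price, 100)

print(f"Carousel capacity: {capacity_tb:.1f} TB")
print(f"Current media cost: ${current_per_gb:.3f}/GB")
print(f"Target media cost:  ${target_per_gb:.3f}/GB (${target_disc_price:.2f} per disc)")
```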
In the cloud, obvs.
systemd is Roko's Basilisk.
it would have answered your questions, and you wouldn't have looked like a tool, and i wouldn't have mocked you. the world would have been a better place! if only.
The NSA built that huge data center in Utah for nothing?
Now if the NSA would just open an API to retrieve it....
You've only proved that you both can be sarcastic mother fuckers (to use your words from below).
So, being somewhat archived, are they secure from Web crawling? Can some hacker guess a web address and browse the collection?
The dyes in DVDs and BluRays that can be written to by consumer grade equipment degrade over time to the point where in a decade or two, it might be impossible to read that DVD or BluRay. The M-Disc might be the solution to that but the story doesn't talk about that...
"I just reply to you when I see you spamming Slashdot with your nonsense"- by dave420 (699308) on Friday June 19, 2015 @10:31AM (#49945047)
Why'd you agree w/ my points on hosts then? Quoting you:
"I'm not denying all those things" - by dave420 (699308) on Wednesday September 17, 2014 @11:39AM (#47927435) FROM -> http://yro.slashdot.org/commen...
Of course not: It's impossible to dispute HOSTS FILES superiority to other methods!
Since my points in favor of hosts SINGLE FILE native kernelmode faster part show hosts doing more w/ less vs. so-called 'competitors' many part messagepassing + cpu/ram use overheads laden slower usermode FAR MORE COMPLEX 'solutions' doing less than hosts do for more security, speed, reliability, + anonymity!
I make creating a superior more efficient solution EASIER!
(That's more than a mere trolling stalking harassing "ne'er-do-well" like yourself could *EVER* manage).
---
"I'm simply pointing out that it takes an AdBlocker to block your spamming"- by dave420 (699308) on Friday June 19, 2015 @10:31AM (#49945047)
I bother you? Then WHY DON'T YOU DO IT & use 'em? Answer that!
(You stalk/harass me instead!)
OBVIOUSLY you don't & you're a "ne'er-do-well" troll & you have "other motivations" (next):
---
* QUESTION:
DO YOU WORK FOR AN ADVERTISING FIRM, or ARE YOU A WEBMASTER/WEBCODER http://slashdot.org/comments.p... , or a MALWARE MAKER, or ARE YOU AFFILIATED WITH 1 OF MY COMPETITORS?
Answer it!
As per your usual you'll avoid every question, or lie & You've been EXPOSED in your "motives" in the last link just above, lol!
APK
P.S.=> See Dave420 the "pot puffing clown" SQUIRM - evasions galore will ensue (as well as effete downmods via sockpuppets to *try* vainly "hide it" -> http://slashdot.org/comments.p... )... apk