Slashdot Mirror


Where Facebook Stores 900 Million New Photos Per Day

1sockchuck writes: Facebook faces unique storage challenges. Its users upload 900 million new images daily, most of which are only viewed for a couple of days. The social network has built specialized cold storage facilities to manage these rarely-accessed photos. Data Center Frontier goes inside this facility, providing a closer look at Facebook's newest strategy: Using thousands of Blu-Ray disks to store images, complete with a robotic retrieval system (see video demo). Others are interested as well. Sony recently acquired a Blu-Ray storage startup founded by Open Compute chairman Frank Frankovsky, which hopes to drive enterprise adoption of optical data storage.

63 of 121 comments

  1. They could save space by NotQuiteReal · · Score: 3, Funny

    They could just delete most of the photos after they age a bit, analyzing it with some of their AI whiz-bang software.

    If anyone ever asks to see the image again, they can just show one that is "close enough" and nobody would ever know the difference.

    I personally, have never posted a photo to Facebook, so I'd be OK with that.

    --
    This issue is a bit more complicated than you think.
    1. Re:They could save space by Anonymous Coward · · Score: 1

      It's a bit like memory that way. You have some short term memory, and those become long term memory, which you can never really recall exactly. In some ways, this might be the solution to the "problem" of the internet never forgetting.

    2. Re:They could save space by QuietLagoon · · Score: 5, Interesting

      They could just delete most of the photos after they age a bit, analyzing it with some of their AI whiz-bang software....

      More than a few of my [real world] friends use facebook as their archive for photos, eschewing local or cloud-based storage for their historical family photos. They would be unhappy if facebook were to randomly start deleting photos just because they've been on facebook for a period of time.

      .
      Of course, I've told those friends that facebook may not have the same photo-preservation goals as they do, but they seem to be unconcerned.

    3. Re:They could save space by Daniel+Hoffmann · · Score: 1

      Yeah computers have a system like that, it is called memory.

    4. Re:They could save space by Anonymous Coward · · Score: 1

      More than a few of my [real world] friends use facebook as their archive for photos, eschewing local or cloud-based storage for their historical family photos.

      . If they are storing their photos on facebook, they ARE storing them in the cloud.

      Of course, I've told those friends that facebook may not have the same photo-preservation goals as they do, but they seem to be unconcerned.

      Why would they be concerned? Ignorance is bliss.

    5. Re:They could save space by QuietLagoon · · Score: 2

      .... If they are storing their photos on facebook, they ARE storing them in the cloud....

      In a general sense, correct.

      .
      However, when I said "cloud-based storage" I meant the cloud service was a storage service, not a social media service. If I had meant facebook, I would have said cloud-based social media service.

    6. Re:They could save space by lgw · · Score: 1

      Facebook seems to have your friends in mind, at least for now. They have a system where old photos are stored quite cheaply, because they simply fail to display the first time you try to view them. By giving up on storing them in a way that can serve a web page hit, Facebook can be quite cheap (though I hear they use powered-down HDDs, not optical - and Western Digital has a new line of HDDs just for this purpose).

      --
      Socialism: a lie told by totalitarians and believed by fools.
    7. Re:They could save space by bondsbw · · Score: 1

      If they are storing their photos on facebook, they ARE storing them in the cloud.

      It depends on what you mean by "store". Dictionary.com provides this as a definition: "to accumulate or put away, for future use". (emphasis mine)

      I don't think Facebook guarantees future retrieval, so it is probably not proper to classify it as storage.

      --
      All my liberal friends think I'm a conservative, all my conservative friends think I'm a liberal.
    8. Re:They could save space by ncc74656 · · Score: 1

      If they are storing their photos on facebook, they are doing it wrong.

      FTFY. I can kinda understand posting stuff to Farcebook so others can view it, but using it as your primary storage medium? That's at least a dozen different kinds of wrong.

      --
      20 January 2017: the End of an Error.
    9. Re:They could save space by drinkypoo · · Score: 1

      More than a few of my [real world] friends use facebook as their archive for photos

      hahahahahahahahahahaha

      Of course, I've told those friends that facebook may not have the same photo-preservation goals as they do, but they seem to be unconcerned.

      So what makes you think they would be unhappy if facebook started deleting their photos? Apparently they don't care :p

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    10. Re:They could save space by davester666 · · Score: 2

      Sure: "Show me the picture of me and my wife on the beach 10 years ago"

      Wife: "Who the hell is that in the picture with you? And when did that happen?"

      --
      Sleep your way to a whiter smile...date a dentist!
    11. Re:They could save space by KGIII · · Score: 1

      Long time passing...

      --
      "So long and thanks for all the fish."
    12. Re:They could save space by cthulhu11 · · Score: 1

      Facebook practices severe compression on uploaded photos, so it's not a great option for archival. I upload family photos there for my wife's friends/family to see, but use CrashPlan for actual archival. These days most people view photos as ephemera, given that so many people can take thousands of good-enough-for-casual-use photos with their phones. Back in the day when taking / printing a photo was an event, people valued them much more.

  2. Replace them by war4peace · · Score: 1

    After 3 months of no views, just replace them with a goatse image.
    That way, you only need to store one image which replaces 99.999% of all pics uploaded. No need for complex storage solutions!
    Another advantage would be that you can serve it really, really fast. No wait time!

    --
    ...gis sdrawkcab (usually not responding to ACs; don't bother posting as AC)
    1. Re:Replace them by Irate+Engineer · · Score: 4, Funny

      After 3 months of no views, just replace them with a goatse image.

      Dear God, there is more than one!?!

      --

      Left MS Windows for Linux Mint and never looked back!

      Vote for Bernie in 2016!

    2. Re:Replace them by KiloByte · · Score: 1

      1. Besides hello.jpg, there's giver.jpg.
      2. The goatse image (hello.jpg) comes from a set of 40 photos.

      You can find more in the proper encyclopedia.

      --
      The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
  3. Delete? by DanJ_UK · · Score: 2

    What happens when a user wants to delete an image permanently? If it's stored on an optical disc are they going to destroy the whole disc and burn it again?

    --
    - Dan
    1. Re:Delete? by bill_mcgonigle · · Score: 5, Insightful

      What happens when a user wants to delete an image permanently?

      What gave you the idea that's a service Facebook offers?

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    2. Re:Delete? by NatasRevol · · Score: 3, Informative

      I see you haven't read Facebook's terms of service.

      There is no delete.

      --
      There are two types of people in the world: Those who crave closure
    3. Re:Delete? by blueshift_1 · · Score: 1

      The ultimate truth of the internet... there is no delete. Just some things are a bit harder to find.

    4. Re:Delete? by DanJ_UK · · Score: 1

      Facebook's policy is to delete all user data including photos after 3 months of an account being deleted.

      There was an uproar about not being able to trace a user account just two days ago regarding a revenge porn case in Holland.

      Now, how are they going to physically remove data from a cold storage solution? I highly doubt they'll be using R/W discs, as removing the data would require wiping the disc and rewriting 50 GB of data again.

      --
      - Dan
    5. Re:Delete? by MightyYar · · Score: 1

      So long as there are no links to the image, it is effectively "deleted". Same as magnetic storage. You just null the index, you don't actually go back and wipe the data back to zeros. Technically the offending bits still reside on the disk, but it's close enough if there is no way to access the data short of using forensic tools.

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
    6. Re: Delete? by devilspgd · · Score: 2

      Another implementation would be to encrypt each item with a unique key and destroy the keys, rather than the underlying item, in a delete event, such that not even forensic tools would have a reasonable chance at recovery once the key-storage media has been re-written.
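
The key-destruction idea above can be sketched in a few lines. This is an illustration only: `ShreddableStore`, `cold_media`, and `key_table` are hypothetical names, and the SHAKE-256 XOR keystream is a toy stand-in for a real authenticated cipher such as AES-GCM, chosen so the sketch runs on the Python standard library alone.

```python
import hashlib
import secrets
from typing import Dict, Optional


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with a SHAKE-256 keystream derived from key.

    Stand-in for a real authenticated cipher; not for protecting real data.
    """
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


class ShreddableStore:
    """Ciphertext sits on write-once media; keys sit in a small rewritable table."""

    def __init__(self) -> None:
        self.cold_media: Dict[str, bytes] = {}  # hypothetical write-once disc contents
        self.key_table: Dict[str, bytes] = {}   # small, mutable key store

    def put(self, item_id: str, plaintext: bytes) -> None:
        key = secrets.token_bytes(32)           # one unique key per item
        self.key_table[item_id] = key
        self.cold_media[item_id] = _keystream_xor(key, plaintext)

    def get(self, item_id: str) -> Optional[bytes]:
        key = self.key_table.get(item_id)
        if key is None:
            return None                         # key destroyed, ciphertext is useless
        return _keystream_xor(key, self.cold_media[item_id])

    def delete(self, item_id: str) -> None:
        # Only the key is destroyed; the ciphertext stays on the disc.
        self.key_table.pop(item_id, None)


store = ShreddableStore()
store.put("photo-1", b"beach photo bytes")
store.delete("photo-1")
# The disc still holds ciphertext for "photo-1", but get() now returns None.
```

The sketch only models the logical step. As the comment notes, the guarantee depends on the key-storage media actually being overwritten, and on no other copy of the key surviving elsewhere.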

      --
      Give a man a fish, he'll eat for a day, but teach a man to phish...
    7. Re: Delete? by MightyYar · · Score: 1

      That is certainly a very secure way to do it - but of course they probably would have a backup of their index, and thus the keys. At some stage, you have to declare "good enough!" - and for 40+ years the removal of the index entry has been a "delete". We go to extra lengths for a "secure" delete, and they would have to take some extra steps here as well... but it is hard to speak intelligently without knowing the details :)

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
    8. Re: Delete? by PvtVoid · · Score: 1

      Another implementation would be to encrypt each item with a unique key and destroy the keys, rather than the underlying item, in a delete event, such that not even forensic tools would have a reasonable chance at recovery once the key-storage media has been re-written.

      Then you'll need cold storage for all those keys you never use. Which, of course, can't be deleted unless they're encrypted with yet more keys, which will themselves need cold storage, so you have to....

    9. Re: Delete? by devilspgd · · Score: 2

      No, keys are small enough to store without needing cold storage.

      --
      Give a man a fish, he'll eat for a day, but teach a man to phish...
    10. Re:Delete? by Rasperin · · Score: 2

      What the heck are you talking about?! Privacy advocates for years have been screaming about Facebook and their stock remains just as strong.

      The average person doesn't care about long term punishments when the short term gains are attractive. This is why I use Facebook. But I treat Facebook like a loud speaker, it's a great place to share my idiotic ideals but I try to avoid saying anything damaging/damning. (btw, this is what we call acceptance, I long ago welcomed our new Facebook overlords).

      --
      WTF Slashdot, why do I have to login 50 times to post?
    11. Re:Delete? by ksheff · · Score: 1

      and once all the images that are on a disc are not linked to anything anymore, it can be shredded. Facebook may also have a plan to periodically copy data from older Blu-Ray discs to newer ones. When that is done, they can just copy the images that have links to the new disc.

      --
      the good ground has been paved over by suicidal maniacs
    12. Re:Delete? by dave420 · · Score: 1

      They remove the image from the catalogue. The image still exists on disc, but won't be copied to new media or be available for retrieval. This is a solved problem, and has been for decades.
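
The catalogue-based delete described in this subthread can be sketched roughly as follows. Everything here is hypothetical (`Disc`, `Catalogue`, the `BD-0001` labels); it is a minimal model of "null the index, skip unlinked images on migration, shred a disc once nothing on it is referenced", not a description of Facebook's actual system.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Disc:
    label: str
    images: Dict[str, bytes] = field(default_factory=dict)  # write-once contents


@dataclass
class Catalogue:
    # image_id -> disc label; the only mutable part of this model
    index: Dict[str, str] = field(default_factory=dict)

    def delete(self, image_id: str) -> None:
        """Logical delete: drop the index entry, leave the disc untouched."""
        self.index.pop(image_id, None)

    def is_shreddable(self, disc: Disc) -> bool:
        """A disc can be destroyed once nothing on it is still catalogued."""
        return not any(self.index.get(i) == disc.label for i in disc.images)

    def migrate(self, old: Disc, new_label: str) -> Disc:
        """Copy only still-catalogued images to fresh media and repoint the index."""
        new = Disc(new_label)
        for image_id, data in old.images.items():
            if self.index.get(image_id) == old.label:
                new.images[image_id] = data
                self.index[image_id] = new_label
        return new


old_disc = Disc("BD-0001", {"a": b"kept", "b": b"deleted later"})
catalogue = Catalogue({"a": "BD-0001", "b": "BD-0001"})
catalogue.delete("b")                      # "b" is still physically on BD-0001
new_disc = catalogue.migrate(old_disc, "BD-0002")
```

After the migration only image `a` exists on the new disc, and the old disc has no catalogued images left, so it could be shredded.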

    13. Re:Delete? by ConceptJunkie · · Score: 1

      I'm glad to see someone besides me on /. isn't terrified of Facebook.

      I use it and I think it's relatively harmless as long as you understand, as Rasperin says, it's a loud speaker. I expect everything I post on FB will be available to everyone, everywhere, forever. I long ago, many years before Facebook was a thing, figured out that if I never posted anything online I wouldn't want my sainted mother to see, I'd never have anything to worry about*. I speak my mind freely, but I would have no problem if my mother, my wife, my boss, my kids or my pastor were to see anything I've posted.

      * Now, of course, that doesn't mean some day in the near future agents of the Ministry of Love won't show up at my door to conduct me to a re-education camp for my political views, but at least I know my mother won't be ashamed of me.

      --
      You are in a maze of twisty little passages, all alike.
  4. Title is wrong by neilo_1701D · · Score: 2

    Should have read:

    You won't believe this one weird trick Facebook uses to store data!

    Other than that, fascinating look at how all that data is being stored and retrieved.

    1. Re: Title is wrong by Daniel+Hoffmann · · Score: 1

      https://xkcd.com/908/

      There is a lot of caching

  5. All I can say... by vortex2.71 · · Score: 1

    is that their monthly AWS fees must be ENORMOUS!

  6. computer output to laser disc by maxwells+daemon · · Score: 1

    is it cold in here or what?

  7. Going on for a while by Alomex · · Score: 3, Informative

    I've noticed large latency for rarely used pictures in FB for over eight months now, and by large latency I mean visit the page, then come back the next day to see the next batch of > 5 year old pictures and wait another day for the final batch of ~10 years ago pictures.

  8. There are not 900 million unique pictures per day. by Anonymous Coward · · Score: 1

    People upload the same memes all the time. Just hash and store the common images and you'll reduce the unique photos to one or two unique images per day. :)
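
A minimal sketch of the "hash and store" idea, assuming a content-addressed layout where each distinct byte string is kept once and every upload just points at its digest. `DedupStore` and its fields are hypothetical names for illustration.

```python
import hashlib
from typing import Dict


class DedupStore:
    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}    # SHA-256 hex digest -> image bytes, stored once
        self.uploads: Dict[str, str] = {}    # upload id -> digest

    def upload(self, upload_id: str, data: bytes) -> bool:
        """Record an upload; return True if these exact bytes were not stored before."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        self.uploads[upload_id] = digest
        return is_new

    def fetch(self, upload_id: str) -> bytes:
        return self.blobs[self.uploads[upload_id]]


store = DedupStore()
store.upload("post-1", b"same meme bytes")
store.upload("post-2", b"same meme bytes")   # second copy costs only an index entry
```

A cryptographic hash only matches byte-identical files. The same meme re-saved, resized, or recompressed produces a different digest, so exact hashing would catch far fewer duplicates than the joke suggests.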

  9. Re:All I can say...(question) by willworkforbeer · · Score: 1

    An interesting question is at what point does it become viable for FB to follow Amazon's model to scale its own system as a business unit...

    As in, when will FB conclude that it again needs to widen its revenue stream portfolio, and it therefore makes sense to offer its own version of AWS?

    Any predictions on FBWS?

    And there's the FB hardware development division, a business unit that so far has also remained in-house but has its own revenue potential. I think people tend to underestimate MZ's ambitions to leverage the FB core to create a broad spectrum business. (following Google's leverage of search revenue to devour the advertising business, etc., and Amazon's leveraging of book sales to devour retailing and then logistics, etc.)

    --
    Pretending this is my office full of bitter coworkers..
  10. Re:How do they do this? by pla · · Score: 2

    I think he just means that optical disks don't have quite the same amount of inertia or need to do internal self-checks as HDDs, before you can actually access them. They still "spin them up", but it happens in a few ms rather than on the order of 5-15s.

  11. Amazing by bws111 · · Score: 3, Insightful

    Wow, they discovered HSM only 40 years after it was introduced. Amazing.

    1. Re:Amazing by Ravaldy · · Score: 2, Insightful

      Pointless arrogant comment.

      Nobody claimed it was new or that they had reinvented anything. They just applied modern technology to a well-known strategy to solve a known problem. In the modern age of storage and data centers I have yet to see this (not to say that nobody has done it).

      When someone shows you an electric car do you tell them cars have had 4 wheels since before 1903? I assume you do.

  12. Re:Does Facebook compress pictures? by aitikin · · Score: 1

    Facebook has always compressed pictures. Nothing is stored at full size.

    --
    "Don't meddle in the affairs of a patent dragon, for thou art tasty and good with ketchup." ~ohcrapitssteve
  13. Re:Shoeboxes by Viol8 · · Score: 1

    Joking aside - it's always good practice to have electronic AND hard copies (optical disc, microfiche, paper) of all critical data, including copies off site. That way even if some hackers from somewherestan manage to totally trash the company's electronic systems the data can still be recovered.

  14. Facewhat? by JustAnotherOldGuy · · Score: 1

    Is facebook still a thing? People still use it after all the security problems and personal information screw-ups?

    --
    Just cruising through this digital world at 33 1/3 rpm...
  15. blu ray? by stoned_ritual · · Score: 1

    How is using blu ray cheaper than hard drives? Not only is it slower, but the medium + the hardware to burn them + the robotic retrieval system..
    Seems like there could be an easier solution to this: hard drives in racks. No robots, no optical drives, and no blu ray discs.
    One 500gb hard drive already has 10x the amount of storage as a dual layer bluray. In fact, a 10 pack of dual layer blu ray discs on amazon costs twice as much as a 500gb 3.5" drive. Am I missing something?

    1. Re:blu ray? by jp10558 · · Score: 1

      Electricity use - did you even watch the video? Of course not. Also, the data survives a drive failure.

      What I wonder is why they think this is better than LTO6, which already has robots etc. as a COTS solution. It's possible, maybe, that it takes less space. It is resilient to stray magnets in a way tapes maybe wouldn't be - but is that a common issue with LTO?

      --
      Opera, Proxomitron-Grypen,GPG 0x0A1C6EE3
    2. Re:blu ray? by stoned_ritual · · Score: 1

      No I didn't watch the video. I'm at work. But I did skim the article, and apparently I missed some details, because the comment police are crawling all over the place.

    3. Re:blu ray? by ncc74656 · · Score: 1

      How is using blu ray cheaper than hard drives?

      3 TB will fit on 120 25-GB BD-Rs. At 40 cents each, that's $48 in media costs. If you do like I do and reserve 20% for dvdisaster error-recovery data, you're still only looking at $60.

      A 3 TB WD Green will set you back $95. (Want to spring for the NAS-rated Red drives instead? That'll be $119. Their absolute cheapest 3 TB hard drives are a couple of models from Seagate and Toshiba at $90 each.)
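
The arithmetic above can be checked with a short script. It assumes decimal units (3 TB = 3000 GB) and reads "reserve 20%" as 20% of each disc's capacity going to recovery data, which is the reading that reproduces the $60 figure. Prices are kept in integer cents to avoid floating-point rounding; the function names are illustrative.

```python
def discs_needed(total_gb: int, disc_gb: int, reserve_percent: int = 0) -> int:
    """Discs required when reserve_percent of every disc holds recovery data."""
    usable_hundredths = disc_gb * (100 - reserve_percent)   # usable GB per disc, times 100
    return -(-(total_gb * 100) // usable_hundredths)        # ceiling division


def media_cost_cents(total_gb: int, disc_gb: int, cents_per_disc: int,
                     reserve_percent: int = 0) -> int:
    return discs_needed(total_gb, disc_gb, reserve_percent) * cents_per_disc


plain_cents = media_cost_cents(3000, 25, 40)        # 120 discs
padded_cents = media_cost_cents(3000, 25, 40, 20)   # 150 discs
hdd_cents = 9500                                    # the $95 drive quoted above

print(f"no reserve:  ${plain_cents / 100:.2f}")
print(f"20% reserve: ${padded_cents / 100:.2f}")
print(f"3 TB HDD:    ${hdd_cents / 100:.2f}")
```

This counts media only. Burners, the robotic library, and labour are not part of the comparison.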

      --
      20 January 2017: the End of an Error.
    4. Re:blu ray? by Whiteox · · Score: 1

      You are right. There's no reason for why you can't 'spin down' a rack of cheap server grade HDs to save power.
      What happened to Bernoulli disks anyway?

      --
      Don't be apathetic. Procrastinate!
    5. Re:blu ray? by dave420 · · Score: 1

      Plus these discs are rated for a much longer shelf life than HDDs.

  16. Re:Shoeboxes by Ravaldy · · Score: 1

    What critical data? Personal? Business?

    At what point is it critical enough to go out of your way to store terabytes of data on CD/DVDs? Isn't an offline HD good enough?

    I have done the following for a long time and I believe this is more than enough for most businesses
    1. Backup to NAS (or equivalent)
    2. Backup to offline disk (done monthly but could be done more often depending on business requirements)
    3. Offsite Backup on the west coast (We are on the east coast)

    At what point are you spending too much money securing data?
    At what point are you being paranoid?

    Those are all questions that will have different answers depending on the company and its IT/ownership.

  17. This article gave them free load testing by Barlo_Mung_42 · · Score: 1

    First thing I did was open facebook and look to see what my oldest picture was. I don't have that many and it came up pretty quickly but I'm sure lots of other people had the same impulse.

  18. Re:There are not 900 million unique pictures per d by bigfinger76 · · Score: 1

    Replace images of people's food with a stock image, and they could dispense with this whole system.

  19. Wait a minute... by bigfinger76 · · Score: 1

    Didn't we see a story about this last year?

  20. FB hardware may be lucrative... by mlts · · Score: 1

    It might be that using Blu-Ray autochangers may be a very useful thing to have, especially for something that can fill the gap between HDDs and LTO tapes for backups [1].

    The pathetic thing is that this technology isn't new. We used to have 100, 200, even 400 disk CD and DVD carousels. By replacing the CD reader with a burner, and using 128 GB BDXL media, that means tens of terabytes of tamper-resistant (important with all the ransomware out there) WORM storage.

    The trick is getting BD media into the terabytes and getting it at a price point where it is decently affordable. For example, a 100 GB BDXL disk is $65, but it should be about 10% of that price in order to be a viable backup medium.

    [1]: The cloud isn't an option in a number of cases (WAN bandwidth isn't cheap), and it is only a matter of time before a major provider gets hacked.

    1. Re:FB hardware may be lucrative... by willworkforbeer · · Score: 1

      I was actually thinking of the potential of FB's networking hardware initiative, though you make a good point about this storage angle.

      --
      Pretending this is my office full of bitter coworkers..
    2. Re:FB hardware may be lucrative... by ncc74656 · · Score: 1

      The trick is getting BD media into the terabytes and getting it at a price point where it is decently affordable. For example, a 100 GB BDXL disk is $65, but it should be about 10% of that price in order to be a viable backup medium.

      My last spindle of 25 GB BD-Rs cost me maybe $0.60 each or so. I could drive down to Fry's right now and pick up a spindle for about $0.80 each. A 4x increase in storage density isn't worth a two-order-of-magnitude increase in price. I would be surprised if Farcebook didn't arrive at the same conclusion.

      Going by the numbers from the video in TFA, they're getting over 10k BD-Rs in a rack. While the basic concept isn't new, they appear to have developed it to a considerably higher density.
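
A quick per-gigabyte comparison using only the prices quoted in this subthread ($65 for a 100 GB BDXL disc, $0.60 for a 25 GB BD-R). The prices are the commenters' figures, not verified ones, and the helper name is illustrative. Exact fractions are used so no rounding creeps in.

```python
from fractions import Fraction


def cents_per_gb(price_cents: int, capacity_gb: int) -> Fraction:
    return Fraction(price_cents, capacity_gb)


bd_r = cents_per_gb(60, 25)        # 25 GB BD-R at $0.60
bdxl = cents_per_gb(6500, 100)     # 100 GB BDXL at $65

capacity_ratio = Fraction(100, 25)       # how much more a BDXL disc holds
disc_price_ratio = Fraction(6500, 60)    # how much more a BDXL disc costs
per_gb_ratio = bdxl / bd_r               # how much more each BDXL gigabyte costs

print(f"BD-R: {float(bd_r):.2f} cents/GB, BDXL: {float(bdxl):.2f} cents/GB")
print(f"capacity x{float(capacity_ratio):.0f}, disc price x{float(disc_price_ratio):.0f}, "
      f"per-GB price x{float(per_gb_ratio):.0f}")
```

At these prices the disc itself costs roughly two orders of magnitude more for four times the capacity, which works out to a per-gigabyte premium of roughly 27x.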

      --
      20 January 2017: the End of an Error.
  21. Duh by wonkey_monkey · · Score: 2

    In the cloud, obvs.

    --
    systemd is Roko's Basilisk.
  22. shoulda read the article, bro by Ionized · · Score: 1

    it would have answered your questions, and you wouldn't have looked like a tool, and i wouldn't have mocked you. the world would have been a better place! if only.

    1. Re:shoulda read the article, bro by stoned_ritual · · Score: 1

      Yeah, you sure showed me.

  23. Re:Shoeboxes by mlts · · Score: 1

    What I don't get is why FB doesn't just use tape. Tape drives are expensive, but the media itself is cheap -- LTO-4 cartridges are $15 apiece, and tape is a true archival grade media.

    Plus, with tape, you copy it to that, yank the tapes out of the autochanger, and toss them in an unused corner of a room. Tapes take 0 watts in storage (other than what it takes for HVAC), so other than physical access concerns, they are easily stashed and will remain usable for quite a long time.

    If any industry needs a kick in the pants with regards to capacity improvements, it is the tape media industry. A tape has far more area to put data on than a HDD platter, so there is a lot of room to add capacity, as well as reduce price with cartridges and drives, especially if mass produced so economies of scale kick in. Back in the 1990s, almost any business had some form of tape drive, which worked fairly decently for backups (although 4mm/8mm drives are nowhere near as reliable as a LTO drive.)

    No, tape isn't trendy... but it functions well, and with WORM media or hardware write protection, it is resistant to malware. With hardware encryption in newer revs (LTO-4 and newer), it is trivial to just set a password and call it done when it comes to that security... that way, if a tape falls off the Iron Maiden truck, it is just a hardware loss... no worry about compromised data.

  24. You don't think by WillRobinson · · Score: 1

    The nsa built that huge data center in Utah for nothing?

    Now if the nsa would just open an api to retrieve it....

  25. Re:Does Facebook compress pictures? by Whiteox · · Score: 1

    They resize them first, then compress. A 3-5 MB pic is stored at around 10% of the uploaded size.

    --
    Don't be apathetic. Procrastinate!
  26. Re:Shoeboxes by Ravaldy · · Score: 1

    What I don't get is why FB doesn't just use tape

    Because of the seek time. They still want the content available and the Blu-Ray method yields a 10 second delay from what I read (I may have read that wrong).

    Plus, with tape, you copy it to that, yank the tapes out of the autochanger, and toss them in an unused corner of a room. Tapes take 0 watts in storage (other than what it takes for HVAC)

    They can't just toss it. That's the whole point of the article. They still need access on demand.

    I think that the Blu-Ray solution is cheap too. The article was making reference to how much colder that area was (because of the lower HVAC requirements I assume).