Facebook Experimenting With Blu-ray As a Storage Medium
s122604 links to CNN's explanation of what may be the future of cold (or at least lukewarm) storage at Facebook, which is experimenting with massive arrays of Blu-ray discs for seldom-accessed user files. Says the report: The discs are held in groups of 12 in locked cartridges and are extracted by a robotic arm whenever they're needed.
One rack contains 10,000 discs, and is capable of storing a petabyte of data, or one million gigabytes.
Blu-ray discs offer a number of advantages versus hard drives. For one thing, the discs are more resilient: they're water- and dust-resistant, and better able to withstand temperature swings. Their data can be restored more quickly, and they're easier to transport.
Most important, though, is cost. Because the Blu-ray system doesn't need to be powered when the discs aren't in use, it uses 80% less power than the hard-drive arrangement, cutting overall costs in half.
... those drives offline or come up with a system to power up the drives via custom SAN hardware when you want to access them? With Facebook's cash it should be doable.
I know that enterprise-grade hard drives are made to spin for years without fail, but there are hard drives that are made to be spun down and essentially powered off when idling. They are laptop drives. Again, not made for enterprise storage, but neither is Blu-ray, so I find it curious that this would be the USP of this solution.
- Henrik
- when the Shadows descend -
BD also costs more per GB than an HDD.
I hope it gets those cartridges faster than RedBox.
Can I ask Facebook to delete my stuff from one of those (assuming I had a Facebook account in the first place)?
Ideally you'd have two different types of media so they would degrade at different rates (which doesn't completely eliminate the possibility of synchronous degradation of any particular bit of information, of course, but it helps).
That's not what the summary is saying.
Let's add up those bytes:
12 x 50GB (calculating with DL discs) gives 600GB per BR cartridge, or about the storage of a physically smaller LTO3 tape with some compression. (LTO4 gives 800GB uncompressed.) This gives 0.47PB of storage per rack.
LTO can be rewritten if needed. You can pack 1320 of those tapes into one frame (IBM TS3500-S54 storage frame) for 3.2PB of uncompressed data using LTO-6 tapes.
The BR discs can be a bit faster when retrieving many small files, yet I still wonder about the logic here...
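For anyone who wants to check the byte-counting above, here is a rough sketch in Python. The disc, cartridge and tape figures come from the summary and the posts; the use of binary prefixes (1PB = 1024 x 1024 GB) is my assumption, picked because it lands on roughly the 0.47PB and 3.2PB figures quoted. The constant and function names are made up for illustration.

```python
# Figures from the thread; binary prefixes are an assumption (see lead-in).
DISCS_PER_RACK = 10_000      # from the CNN summary
DISCS_PER_CARTRIDGE = 12     # from the CNN summary
DL_DISC_GB = 50              # dual-layer Blu-ray disc
TAPES_PER_FRAME = 1320       # tape frame figure quoted in the thread
LTO6_NATIVE_TB = 2.5         # LTO-6 native (uncompressed) capacity


def gb_to_pb(gigabytes: float) -> float:
    """Convert GB to PB using binary steps (GB -> TB -> PB)."""
    return gigabytes / 1024 / 1024


cartridge_gb = DISCS_PER_CARTRIDGE * DL_DISC_GB           # GB per cartridge
rack_pb = gb_to_pb(DISCS_PER_RACK * DL_DISC_GB)           # PB per Blu-ray rack
frame_pb = TAPES_PER_FRAME * LTO6_NATIVE_TB / 1024        # PB per tape frame

print(f"cartridge: {cartridge_gb} GB")
print(f"Blu-ray rack: {rack_pb:.3f} PB")
print(f"LTO-6 frame: {frame_pb:.3f} PB")
print(f"tape frame holds {frame_pb / rack_pb:.1f}x the Blu-ray rack")
```

With decimal prefixes the rack comes out at exactly 0.5PB instead, so the "one petabyte per rack" claim in the summary only works with 100GB discs or some other assumption the article does not spell out.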
Enterprises have been doing this with tape for 30 years.
In fact, modern tape technology probably has a higher "volumetric" density than BD.
"I don't know, therefore Aliens" Wafflebox1
>"Their data can be restored more quickly"
Than a hard drive? I think not.
> "the Blu-ray system doesn't need to be powered when the discs aren't in use, it uses 80% less power than the hard-drive arrangement, cutting overall costs in half."
Say what? When my backup hard drives are not being used, they also use zero power because they are not plugged in. And when they ARE plugged in, they "power down" after a few min of no usage, which I think is like 1% of normal power.
The density of storage for Blu-ray is also not better than hard drives, and the writing is much slower. I also don't see how transport is so much better than laptop hard drives. Blu-ray MIGHT be cheaper, depending on how you value your criteria... and the discs are more rugged (if that even matters).
When you first access this data, you have to sit through 42 previews before you get to it.
It was a joke! When you give me that look it was a joke.
I read TFA. They're not using them as "storage" in the sense of active, accessible storage. It's a backup system.
What they're trying is, instead of storing redundant copies of everything on multiple drives (for resilience and geolocality), they're keeping one copy live and keeping backups on blu-ray.
So there's never a latency of minutes while it loads data from Blu-Ray, you just might be routed to Siberia or something to get the one active copy. If that copy's bad, error (restore from backup during next nightly batch or something).
BD also costs more per GB than an HDD.
Fifty "25GB 4x BD-R Hard Coating" discs for $35: about 1250GB, or about 36GB per $.
A 3TB drive (I would say the sweet spot) would be $100: about 30GB per $.
So Blu-ray is slightly cheaper per GB for me. I suspect in bulk the differences are bigger.
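The per-dollar comparison is easy to redo with your own prices. A minimal sketch, using only the two retail prices quoted in this post; `gb_per_dollar` is a hypothetical helper name, and it ignores the drive, robot and power costs that a real comparison would need.

```python
def gb_per_dollar(capacity_gb: float, price_usd: float) -> float:
    """Raw media capacity bought per dollar (media only, no hardware)."""
    return capacity_gb / price_usd


# Prices as quoted in the post above.
bd_spindle = gb_per_dollar(capacity_gb=50 * 25, price_usd=35)   # fifty 25GB BD-Rs
hdd_3tb = gb_per_dollar(capacity_gb=3000, price_usd=100)        # one 3TB drive

print(f"BD-R spindle: {bd_spindle:.1f} GB/$")
print(f"3TB HDD:      {hdd_3tb:.1f} GB/$")
print(f"BD-R advantage: {bd_spindle / hdd_3tb - 1:.0%}")
```

At these prices the discs win on raw media cost alone.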
they had cold-storage CD jukeboxes at (well-known HVAC) back that far for old catalog crap. heck, they had rooms full of videotape carts in TV stations back that far... take your pick, VHS pro or Beta Pro. robotic storage is way old, just the medium changes, depending on what you are used to in your industry.
if this is supposed to be a new economy, how come they still want my old fashioned money?
How is this different from the last time the topic was on the front-page of /.?
http://hardware.slashdot.org/s...
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
"Those data demands will only increase with time, particularly as personal cameras and smartphones become capable of capturing higher-quality images."
From Facebook: "We automatically take care of resizing and formatting your photos for you when you add them to Facebook."
It little behooves the best of us to comment on the rest of us.
Since they are disk packs I bet they will be RAIDed which will help protect from bitrot.
Or is the stuff already not really important enough to keep more than one copy around?
It's Facebook. I doubt it's of any importance even to the op. At any rate, the NSA has a copy for backup.
--- Keep the choice with the user..
They ought to try bees. It's good enough for HEX.
If they only keep one copy, how do they detect and recover from bitrot?
Or is the stuff already not really important enough to keep more than one copy around?
Data replication is an honest question. What if a copy was kept on spinning disks and the Blu-ray media was backing store for spinning media?
A RAID design for the future need not have equal access times for ECC, voting and redundancy. It only needs to be reliable and the net sum of the parts inexpensive. Data rates on and off a single Blu-ray are consistent with very long distance optical fibre data rates.
If I allow myself to think of this as heterogeneous RAID hardware design it makes sense.
If I allow myself to think of this as an isolated magic solution it seems fragile.
Truth is stranger than fiction, but it is because Fiction is obliged to stick to possibilities; Truth isn't. Mark Twain.
I'm sure there's other tape libraries with similar densities, but the IBM TS4500 (http://www-01.ibm.com/common/ssi/printableversion.wss?docURL=/common/ssi/rep_ca/2/897/ENUS114-072/index.html) high capacity frame (storage only) can hold 1320 LTO 6 tapes each with a 2.5 TB native capacity.
What does that TS4500 cost? I'm curious how it compares to a stack of dumb 16-bay SAS enclosures at $300 each.
http://www.ebay.com/itm/like/1...
A general-purpose FreeBSD or Linux system with four RAID cards can control 1024 drives mounted in such enclosures, so about $2 per drive for the intelligent bit.
I dunno. I've never been pleased with the performance of optical media. I'd think being in a data center, heating up and cooling down from usage and storage is going to have very bad effects on recordable optical discs (CDs, DVDs, Blu-rays). Not to mention, it's a pretty well-known fact that consumer-recorded media (the ones with dyes and stuff) aren't terribly reliable in the long term. My personal experience with recordable optical media is poor at best; I have very, very few discs that have remained readable and error-free after just five years of relatively decent care and storage. And this is not even using them every day, heating them up and cooling them down, just stored in a dark, cool place.
Seems... overhyped. I simply can't come to believe this is an actual viable storage medium for any kind of large scale operation. But enh, if it works for them, good deal. Seems like you'd get more bang for your buck using high capacity tapes which hold up much better to heating up and cooling down.
The power-saving claim also seems silly. This could easily be done with standard hard drives in the kind of cartridge system they say they're using, powering down unused drives and putting them into a storage position (though for me, I think it'd be much smarter to make the connector the moving part and just plug into the right bank of HDs, instead of moving HDs around in a cartridge).
The more I think about this operation, the less intelligent and efficient it seems to be.
Not that I had any trust in them anyway.
Blu-ray, and indeed any modern optical storage, is very short-lived precisely because it's designed to be cheap. The laser discs used to store the Domesday Project in Britain were still readable after 20 years. Modern optical storage typically decays within 5. Less, as the density goes up. And failures take out far larger percentages of the storage.
Magnetic tape is still the only trusted long-term backup medium. I wouldn't suggest it for something like Facebook purely because of seek times, but it's hard to think of any viable alternative.
With Blu-Ray, to guarantee to avoid complete disk loss, you'd have to be re-archiving the entire archive annually. That adds an enormous invisible cost to the project. They're not going to do that. Which means there's guaranteed loss of backups. How much depends on the exact storage conditions but it won't be pretty.
As for better ability to withstand conditions, it again comes down to the nature of the storage. Optical disks are highly vulnerable to a lot of things that hard drives are not. Overall, optical storage usually performs very badly in comparison, as the things hard drives are vulnerable to are cheaply avoided but the things optical storage can be attacked by are usually a lot harder to deal with.
I'm sure you're aware that none of the above formats (tape included) are considered "archival quality" - they just don't have the sort of durability required by that categorization. No known digital format does and there's nothing you can do to stabilize them. It's a big research area. For now, tape is considered the only method that is economic and durable, with the lowest loss of data per failure.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Waiting for j-43289.ar-298.bluray.facebook.com...
First, optical media doesn't suffer from bitrot to the same degree magnetic drives do. (There can still be damage/decay to the optical storage layer, but it's much slower than on magnetic disks.) Second, RAID doesn't protect against bitrot; that's the problem with it. Unaware filesystems (Ext3, HFS, NTFS, FAT32, exFAT, etc.) have no idea the file has degraded. The RAID controller will then either A) happily copy the rotting data to the parity drive or, B) if it happens later, the array won't know which copy was the one affected by the bitrot. (No process touched the file, so mod dates are worthless for comparison.) The filesystem has to explicitly have file-level checksumming built in (Btrfs, ZFS, etc.). That can then work across a RAID array, but it's the FS, *not* the array, providing the protection.
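To make the checksumming point concrete, here is a toy sketch of the idea in Python: record a hash per file at write time, then re-hash later and flag anything that changed without being rewritten. This is only an illustration of file-level checksumming in general, not how Btrfs or ZFS actually implement it (they checksum blocks inside the filesystem), and `build_manifest` and `find_rotted` are names I made up.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Record a checksum for every file at archive time."""
    return {str(path): sha256_of(Path(path)) for path in paths}


def find_rotted(manifest):
    """Return the files whose current contents no longer match the manifest."""
    return [
        name
        for name, expected in manifest.items()
        if sha256_of(Path(name)) != expected
    ]
```

With a manifest like this, a mirrored setup can tell which of two differing copies is the good one, which is exactly what a plain RAID controller cannot do.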
Okay, so we need disc 101 from tray 1010101 and the robot arm is busy, three other fetches already in the queue. After 30,000ms client JavaScript times out and substitutes a "retrieving data, retry in a few minutes" placeholder, sets a longer camp-on timeout and releases the request.
The reason the robotic arm is busy is that despite random assignment to storage pools with some localized album grouping, web crawler activity for public albums, and bulk pre-fetch requests for semi-private albums by browser plugins run by logged-in users (which became more popular as access time increased) ... the lukewarm storage facilities are running hot and queues are full most of the time.
Despite the polished and smoothly functioning presentation that encourages the users to "just wait a bit" ... a dark rumor grows deep in the hearts of many that the data is not merely delayed, they must brush off dust and cobwebs, or root for it because it had been haphazardly tossed into a pile of rubbish somewhere, relegated to the digital Basement. Facebook does not think your photograph is of sufficient merit. Grandmother has long passed and you had not wished to look at her last week, so... why should you be interested now?
The effects are complex, but the cause is clear: the Internet is perverse. It re-routes around any attempt to take immediate access data off-line by degrees, accomplishing this through a series of countermeasures such as unwelcome crawlers depleting your cache, hitting your 'public' cold data systematically and regularly, then finally bankrupting your company as users migrate to another service whose superior performance does not arise from superior engineering -- merely the fact that fewer users are using it.
So the moral of the story is, if you are Facebook and wish to remain so, you will either strive to find a way to keep the random access time for everything down below 2000ms -- or die.
And also, Facebook would be wise to heed the following:
once / forgotten by tourists / a bicycle joined a herd of mountain goats /// with its splendidly turned horns / it became / their leader /// with its bell / it warned them / of danger /// with them / it partook / in romps / on the snow covered / glade /// the bicycle / gazed from above / on people walking; / with the goats /// it fought / over a goat, / with a bearded buck /// it reared up at eagles / enraged / on its back wheel /// it was happy / though it never / nibbled at grass /// or drank from a stream /// until once / a poacher / shot it /// tempted / by the silver trophy / of its horns /// and then / above the Tatras was seen / against the sparkling / January sky /// the angel of death erect / slowly / riding to heaven / holding the bicycle's / dead horns //////~Jerzy Harasymowicz
<blink>down the rabbit hole</blink>
How big a stack do you need to match a 1320 tape library? Even using 4TB disks you're talking 825 disks, which means 52 enclosures. And then four racks to hold those enclosures. And enough floor space to hold those racks. And enough circuits to power those racks.
At that level of scale, tape is simply a better option for archival storage.
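The sizing above can be redone in a few lines. A rough sketch, using the 2.5TB native LTO-6 capacity, 4TB disks, and the 16-bay $300 enclosures mentioned earlier in the thread; it covers enclosures only, and the rack, power and drive costs are deliberately left out because nobody here quoted figures for them.

```python
import math

TAPES = 1320            # tapes in the library frame
TAPE_TB = 2.5           # LTO-6 native capacity
DISK_TB = 4             # disk size assumed in the post
BAYS_PER_ENCLOSURE = 16
ENCLOSURE_USD = 300     # price quoted earlier in the thread

library_tb = TAPES * TAPE_TB
disks = math.ceil(library_tb / DISK_TB)
# 825 / 16 is a little over 51, so a partly filled extra enclosure is needed.
enclosures = math.ceil(disks / BAYS_PER_ENCLOSURE)
enclosure_cost = enclosures * ENCLOSURE_USD

print(f"{library_tb:.0f} TB -> {disks} disks -> {enclosures} enclosures "
      f"(${enclosure_cost} in enclosures alone)")
```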