Backblaze Dishes On Drive Reliability In their 50k+ Disk Data Center

Posted by timothy on Wednesday February 17, 2016 @05:50AM from the learning-from-experience dept.

Online backup provider Backblaze runs hard drives from several manufacturers in its data center (56,224, they say, by the end of 2015), and as you'd expect, the company keeps its eye on how well they work. Yesterday they published a stats-heavy look at the performance, and especially the reliability, of all those drives, which makes fun reading, even if you're only running a drive or ten at home. One upshot: they buy a lot of Seagate drives. Why? "A relevant observation from our Operations team on the Seagate drives is that they generally signal their impending failure via their SMART stats. Since we monitor several SMART stats, we are often warned of trouble before a pending failure and can take appropriate action. Drive failures from the other manufacturers appear to be less predictable via SMART stats."

Re:Doesn't make any mention of.. by Anonymous Coward · 2016-02-17 06:11 · Score: 4, Informative

https://www.backblaze.com/blog/vault-cloud-storage-architecture/
They mention their architecture here

Re:RAID, let them fail by Dareth · 2016-02-17 06:18 · Score: 5, Insightful

The purpose of RAID is to keep data available for a purpose. You have some level of redundancy, measured in terms of the number of disks that can fail before you have data loss for the array. Once a disk has an impending-failure SMART alert, you no longer have full confidence in that disk. If you leave it to fail, what if another disk in the array happens to fail? You now have an array with a failed disk, possibly in a degraded mode. You also have a disk with a better than normal chance of failure. It just makes sense to be proactive and fix the issue before it escalates into a failure.

--

I only look human.
My mother is a halfling and my dad is an ogre, so that makes me an Ogreling

Sorry WD fans by Solandri · 2016-02-17 06:21 · Score: 5, Interesting

Can't help but feel for all the people who read Backblaze's previous report and decided Seagate was junk and bought WD instead. I tried to warn them that the model of the drive mattered more than the manufacturer, because each manufacturer tries new technologies and new cost-cutting strategies with each different model. Sometimes it works and the model is reliable. Sometimes it doesn't and the model is unreliable. But everyone was eager to get on the bash Seagate, praise WD bandwagon and ignored me.

Well, WD was least reliable this time around. The Seagate stats in the previous report were probably being skewed by just one or two bad models. It's skewed this time by one bad model, which due to the passage of time means it makes up a tiny portion of their Seagate sample, so doesn't spike Seagate's score like before. (You can pretty much ignore WD in the 4TB graph, as a sample size of just 46 drives means the confidence interval is a 0.3% - 8.8% failure rate.)

At least Backblaze addressed my criticism from before - they've broken down the stats to individual drive models. And you can see that like I said, there's huge variability in reliability between models within a manufacturer's lineup. Now they just need to add confidence intervals to the graphs.
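To see why a sample of 46 drives gives such a wide interval, here is a minimal Python sketch using the Wilson score interval for a binomial proportion. This is not necessarily the method behind the 0.3% - 8.8% figure above, and it ignores that Backblaze's published rates are annualized. The failure counts are hypothetical, picked only to compare a small and a large sample at the same observed rate.

```python
import math

def wilson_interval(failures, drives, z=1.96):
    """Wilson score interval for a failure proportion (z=1.96 is roughly 95%)."""
    p = failures / drives
    denom = 1 + z * z / drives
    center = (p + z * z / (2 * drives)) / denom
    half = z * math.sqrt(p * (1 - p) / drives + z * z / (4 * drives ** 2)) / denom
    return center - half, center + half

# Hypothetical counts: same observed rate (about 2.2%), very different sample sizes.
small = wilson_interval(failures=1, drives=46)
large = wilson_interval(failures=100, drives=4600)
print("46 drives:   %.1f%% to %.1f%%" % (small[0] * 100, small[1] * 100))
print("4600 drives: %.1f%% to %.1f%%" % (large[0] * 100, large[1] * 100))
```

The interval for the 46-drive sample comes out far wider than the one for 4,600 drives, which is the reason a per-manufacturer bar on a graph can mislead without error bars.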

Re:Seagate SHOULD be good at that by mattventura · 2016-02-17 06:29 · Score: 5, Funny

Seagates are great at reporting impending failures.
Does it say Seagate on it? It's about to fail.

Re:Not very useful. by Anonymous Coward · 2016-02-17 06:35 · Score: 5, Funny

Exactly, so even though these are the best large-scale numbers we have, they are garbage. We shouldn't use them even though they are the largest sample size. They're useless, like the people that carefully compiled these numbers. Instead, we should trust drive manufacturers' marketing numbers, as you suggest.

Re:Seagate SHOULD be good at that by slaker · 2016-02-17 06:41 · Score: 4, Informative

HGST drives are manufactured by a different division, using different processes and different engineering teams. I was told by a WD engineer that HGST stuff is still entirely separate on a manufacturing level.
Of course, I'm just some guy on the internet, but based on my own experiences with a few hundred 3 and 4TB drives in service, the Hitachi/HGSTs are worth going out of my way to obtain and Seagate 4TB drives don't seem to have the problems the 3TB units did.

--
-- I wanna decide who lives and who dies - Crow T. Robot, MST3K

Re:RAID, let them fail by sexconker · 2016-02-17 06:55 · Score: 4, Informative

Because you don't know how it will fail, you don't know what other drive may fail next, and you don't know when a 2nd, 3rd, nth, drive will fail.
Further, drives that manage to actually report that they're dying are typically fucked to the point of impacting your performance significantly. If you're still writing to a drive that's hobbling along, it will slow down the whole array.
Reads are usually okay (depending on your controller and setup) but writes need to be completed at some point, regardless of your cache scheme or cache size.
Sustained writes to an array with a crippled drive will eventually either result in the drive being taken offline or the array's write performance turning to shit. If you're lucky, the drive is taken offline gracefully, doesn't catch fire, and you do the hot spare / cold spare dance, the rebuild boogaloo, etc.

Bad sectors? by nbritton · 2016-02-17 07:00 · Score: 5, Interesting

What is Backblaze doing to check the drives for bad sectors? I manage a 10,000-disk OpenStack Swift installation, and I've noticed that the auto sector remapping doesn't work correctly: a portion of the drives (maybe 3%) have a few bad sectors that need to be manually remapped using ddrescue. I ended up having to write a custom monthly cron job script that ran badblocks to first identify these drives, and then ddrescue to force a sector remap.

Re:Not very useful. by brianwski · 2016-02-17 08:15 · Score: 5, Informative

Disclaimer: I work at Backblaze.

> very unlike the type of use case you will likely see

Being extremely specific - we (Backblaze) keep the drives powered up and spinning 24 hours a day, 7 days a week. So if you leave your drives powered off most of the time and boot them only sometimes, the failure rates we see may or may not be something like yours?

I'm curious if anybody has any other suggested differences with "what you will see". Most of our drive activity is lightweight - we archive data for goodness sake, we write the data once then maybe read it once per month to make sure the data has not been corrupted. We stopped using RAID a while ago, so you can't say you need drives that are designed for RAID, because we don't use RAID (we do a one-time Reed-Solomon encoding and send it to different machines in different parts of our datacenter and write it to disk with a SHA1 on this "shard", where that shard lives its life independently without RAID).
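The "write once with a SHA1, read it back now and then to check for corruption" idea can be sketched in a few lines of Python. The layout here (a 20-byte digest stored in front of the payload) and the function names are assumptions made for illustration, not a description of the actual on-disk format.

```python
import hashlib

SHA1_LEN = 20  # a SHA-1 digest is 20 bytes

def write_shard(payload: bytes) -> bytes:
    """Return the stored form of a shard: digest followed by payload (assumed layout)."""
    return hashlib.sha1(payload).digest() + payload

def verify_shard(stored: bytes) -> bool:
    """Recompute the digest over the payload and compare it with the stored one."""
    digest, payload = stored[:SHA1_LEN], stored[SHA1_LEN:]
    return hashlib.sha1(payload).digest() == digest

stored = write_shard(b"archived customer data")
corrupted = stored[:-1] + bytes([stored[-1] ^ 0x01])  # flip one bit in the payload
print(verify_shard(stored), verify_shard(corrupted))
```

A periodic scrub would call something like `verify_shard` on every shard and schedule a rebuild from the other shards when a check fails.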

ANOTHER POINT MANY PEOPLE MISS -> you can't just pick the lowest failure rate drive and then skip backups!! *EVERY* drive fails, every single solitary last drive. So you must have a backup if you care about the data, you really really do. And if you have a backup, then you are free to choose a drive that fails at a higher rate if there are other considerations such as it is a much cheaper drive. Hint: Backblaze doesn't always choose the most reliable drive, we look at the total cost of ownership including the amount of power the drive will consume and the drive's failure rate and let a spreadsheet kick out the correct drive for us to purchase this month. It is rarely the most reliable drive.
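As a rough illustration of how a total-cost comparison can favor a less reliable drive, here is a minimal Python sketch. The formula, the electricity price, the service life, and both candidate drives are hypothetical; the actual spreadsheet is not described in any more detail than above.

```python
def cost_per_tb_year(price_usd, capacity_tb, watts, annual_failure_rate,
                     usd_per_kwh=0.10, service_years=5.0):
    """Purchase + power + expected replacements, per TB per year (assumed model)."""
    power_cost = watts / 1000 * 24 * 365 * service_years * usd_per_kwh
    replacement_cost = price_usd * annual_failure_rate * service_years
    return (price_usd + power_cost + replacement_cost) / (capacity_tb * service_years)

# Hypothetical candidates: one pricier and more reliable, one cheaper and less reliable.
candidates = {
    "reliable": cost_per_tb_year(price_usd=180, capacity_tb=4, watts=7.0,
                                 annual_failure_rate=0.01),
    "cheap": cost_per_tb_year(price_usd=130, capacity_tb=4, watts=7.5,
                              annual_failure_rate=0.03),
}
best = min(candidates, key=candidates.get)
print(best, {name: round(cost, 2) for name, cost in candidates.items()})
```

With these made-up numbers the cheaper drive wins even at three times the failure rate, because the purchase price dominates the replacement cost.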

Re:Doesn't make any mention of.. by brianwski · 2016-02-17 08:21 · Score: 5, Interesting

Brian from Backblaze here.

The individual drives in our datacenter run ext4 (the OS is Debian). We do an extremely simple Reed-Solomon encoding that is 17+3 (17 data drives and 3 parity) but the 20 drives are spread across 20 different computers in 20 different locations in our datacenter. This means we can lose any 3 drives and not lose data at all.

We released the Reed-Solomon source code free (open source but even better) for anybody else to use also. You can read about it in this blog post: https://www.backblaze.com/blog...
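For readers who want to see why any 17 of the 20 shards are enough, here is a toy Python sketch of the idea behind a 17+3 code. It is not the released source code: it works over the integers mod 257 using polynomial interpolation, whereas byte-oriented implementations typically use the field GF(2^8), and all names below are made up for illustration.

```python
P = 257  # prime modulus, an assumption of this sketch (needs Python 3.8+ for pow(x, -1, P))

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial through points [(xi, yi), ...], mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, n_parity):
    """Keep the data symbols as shards 0..k-1 and append n_parity extra evaluations."""
    k = len(data)
    points = list(enumerate(data))
    return list(data) + [lagrange_eval(points, x) for x in range(k, k + n_parity)]

def recover(surviving, k):
    """Rebuild the k data symbols from any k surviving shards ({index: value})."""
    points = list(surviving.items())[:k]
    return [lagrange_eval(points, x) for x in range(k)]

data = list(range(100, 117))   # 17 data symbols, one per "drive"
shards = encode(data, 3)       # 20 shards in total
lost = {0, 5, 19}              # lose any three of them
surviving = {i: v for i, v in enumerate(shards) if i not in lost}
print(recover(surviving, 17) == data)
```

The 17 data symbols define a polynomial of degree at most 16, and any 17 points on it determine it completely, so which three shards go missing does not matter.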