Backblaze's 6 TB Hard Drive Face-Off
Esra Erimez writes: Backblaze is transitioning from using 4 TB hard drives to 6 TB hard drives in the Storage Pods they will be deploying over the coming months. With over 10,000 hard drives, the choice of which 6 TB hard drive to use is critical. They deployed and tested 45 Western Digital (WD60EFRX) and 45 Seagate (STBD6000100) hard drives in two pods that were identical in design and configuration except for the hard drives used.
I don't know... I find it odd that the WD drives, at the 5400rpm speed, were able to write data faster than the 7200rpm Seagate drives. That seems counter-intuitive.
It's also nice to see all of the drives go through that sort of "punishment" without a single failure - out of the box. NewEgg reviews aren't terribly helpful, since most only leave reviews when they have issues, and only a few customers ever bother to leave good reviews unless they are overwhelmed by the quality of a product.
- Initial reliability (how many drives failed) – No failures.
- Running reliability (3 months) – No failures
- SMART Stats (3 months) – No error conditions recorded for the 5 stats that we utilize.
- Hard Drive Cost – about the same.
- Energy Use – The Seagate drives were 7200 rpm and used slightly more electricity than the Western Digital drives which were 5400 rpm. This small difference adds up when you place 45 drives in a Storage Pod and then stack 10 Storage Pods in a cabinet.
- Loading speed – Edge to Western Digital, by a little over 1 TB per day on average.
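To see how a small per-drive difference scales up, here is a rough sketch. Only the 45 drives per pod and 10 pods per cabinet come from the summary; the 1 W per-drive difference is a made-up placeholder, since no wattage figures are given here.

```python
DRIVES_PER_POD = 45      # from the summary
PODS_PER_CABINET = 10    # from the summary


def cabinet_extra_watts(extra_watts_per_drive: float) -> float:
    """Extra continuous power draw for one full cabinet."""
    return extra_watts_per_drive * DRIVES_PER_POD * PODS_PER_CABINET


def yearly_kwh(watts: float) -> float:
    """Convert a continuous draw in watts to kWh over a 365-day year."""
    return watts * 24 * 365 / 1000


# Hypothetical 1 W difference per drive; NOT a measured figure.
extra = cabinet_extra_watts(1.0)
print(f"{extra:.0f} W extra per cabinet, {yearly_kwh(extra):.0f} kWh per year")
```

Swap in a measured per-drive difference to get a real number; cooling overhead is ignored here.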
Slashdot, fix the reply notifications... You won't get away with it...
That was about the most useless set of HDD statistics I've ever seen. You don't need more than one drive each to compare power consumption and performance.
So you think there's 0 variance?
NOTHING was said about reliability and who cares how much data was stored on them vs how long it was in service. Those two numbers are completely arbitrary.
45 drives each, no initial failures, no failures in the first 3 months. Right there that tells me the WD Red 6TB drives are hugely better than the 4TB drives I used.
I remember punching the side of 360K floppies to get another 360K on the other side.
Now you can buy a couple of gigs of USB drive next to the gum in the express lane at Wal Mart.
This stuff is awesome and all, but sometimes it's hard to really wrap my head around the fact that pretty much everything about computers (except for physical size) is a billion times bigger than when I started using computers.
It really is hard to explain to people that at one point your entire digital life was about 20 floppy disks in a plastic case, and that what was once a completely hypothetical amount of storage is commonplace.
Lost at C:>. Found at C.
I think you missed the point. Several points, in fact...
Backblaze doesn't care about one drive. Power consumption is a complicated matter, and they have a very simple plan, so it's best for them to build a full pod for testing, and compare the power and performance at the pod level. They can extrapolate that out to their planned expansion considering pods as the units of measure, rather than having to consider drives, controllers, fans, and power supplies as extra variables. That simplification is partly why they're using a pod architecture in the first place.
Reliability doesn't matter much to Backblaze, either. They store redundant copies of data, so their risk of loss is mitigated, just as it should be for any enterprise use of such drives.
When you ask "who cares how much data was stored on them vs how long it was in service", clearly the answer is Backblaze, because they cared enough to study that particular metric.
Now, all of this is really only obviously useful to Backblaze. They're running tests in their environment, with their design, for their criteria. Realistically, the vast majority of Slashdotters won't ever handle anything like Backblaze's system, so they have different priorities. Backblaze still released their test results, just in case anyone cares. That's why they've gathered such a following among nerds. They've repeatedly published their research openly, contributing to the public knowledge base for system engineers. Maybe somebody finds it useful, and maybe not, but it's still a noble principle they practice.
You do not have a moral or legal right to do absolutely anything you want.
As much as I'm sure you're right, I think this is a great way to perform advertising. No flash animations, no autoplay video or sound clips, no clickbait... Just pure data-driven performance benchmarking. It's like they're saying "Let's attract tech-savvy customers by publishing something that will actually be informative and/or interesting to them, and then maybe some of them will be interested in what we sell" I can totally get behind this form of marketing!
Disclaimer: I work at Backblaze.
> They've repeatedly published their research openly... just in case anyone cares.
"Research" sounds too official, more like "observations in our environment", but THANK YOU for the kind words. What baffles me is why nobody else publishes these sorts of drive statistics. Why is Amazon silent? Why doesn't Google name drive names and failure rates? And if the answer is: "Google gets a great price on drives in exchange for their silence" then why hasn't Backblaze been offered a deal to keep quiet yet?! I'm serious, how big do you have to get before you get the better prices on drives? We essentially pay "retail".
Disclaimer: I'm an engineer at Backblaze.
We do these drive statistics and observations originally for our own selfish internal reasons - this is information that is important for running our business. When we then release this kind of information, the info release is largely because it helps people hear about our company (and also maybe a little of "good for humanity" motivation thrown in there, we're Slashdot kind of people, we work in technology in Silicon Valley). But let me be clear: the information is as accurate as we can possibly make it, and we aren't pulling any punches and we aren't "in bed with" any drive manufacturers. I see this as a WIN-WIN. You get accurate and free information, and a few people hear our company name and look into what we do and maybe we gain a few customers. These posts are often written by the engineers working on the system, and we try to be as straightforward and non-marketing as we can be.
Well, if your sample N is 40,000 drives as theirs has been in the past, and you're operating with reasonably rigorous methodology to track problems, then you've got a good case. Write up your experience, and note N. (For 6TB drives, their N is pretty small, and even moving forward they're only adding 230 WD drives).
I don't think you've got a good case to argue that a sample of 40,000 drives is "noise", but you could well be right about the much smaller samples for 6TB drives. Assuming you've got tens of thousands of Seagates being heavily used, if your results differ from their past ones, that would be very interesting. Publish.
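To put a number on why N matters, here is a minimal sketch of what "zero failures" can actually rule out. It assumes drives fail independently with the same probability over the test window, which is a simplification; the function name is just for illustration.

```python
def zero_failure_upper_bound(n: int, confidence: float = 0.95) -> float:
    """Largest per-drive failure probability still consistent, at the given
    confidence, with seeing 0 failures among n drives.

    Solves (1 - p) ** n = 1 - confidence for p.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


# 45 is the per-pod sample from the story; 40,000 is only here to show how
# the bound would tighten if a sample that large had shown zero failures.
for n in (45, 40_000):
    print(f"N={n}: p < {zero_failure_upper_bound(n):.4%}")
```

With 45 drives and no failures over three months, the data alone cannot rule out a failure probability of several percent for that window, so the small sample says much less than the headline suggests.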
About the only takeaway there is that WD loads faster (about a TB/day, an unexpected result) and uses slightly less electricity.
> Their backup scheme require them to have access to your private key (the one you encrypted your backup with).
Disclaimer: I'm a Backblaze engineer who wrote a lot of that code.
Your statement is a bit misleading: there are two levels of security in Backblaze. Data is always encrypted, and the "private key" is a totally standard OpenSSL PEM file that yes, we store for you. By default, this PEM file is secured by a passphrase that Backblaze knows, so your data is essentially only secured by your email address and password and you can recover your password by email. This is pretty light security (if somebody has access to your email they can recover your password), so it's best for backups of stuff you wouldn't mind too much if somebody got ahold of it, like say pictures of your cat. Don't laugh, I back up my public website on Backblaze servers. There is valuable data in the world that does not need encryption: info you don't want to lose but that is ALSO publicly readable.
So if you are concerned at all about security, you can set your own personal "passphrase" on that PEM file that Backblaze absolutely never writes to disk - we don't store it. But if you do this you MUST remember that passphrase or your data is GONE. Without that passphrase, nobody will ever retrieve your data, not you, not the US government, not the NSA, NOBODY. You cannot "recover" that passphrase, and we don't know it. This is a good mode of security if you would be arrested on the spot for the contents of your files if the NSA got ahold of your data, because we really don't think it is breakable.
Seagate isn't using SMR on the 6TB drives, at least not yet as far as I know. That's rolling out with the 8TB models.
> retail at the 10,000 drive order level
You might be surprised how little discount we get. Our last purchase of 4 TByte Hitachi drives (960 drives in one purchase) we paid $135 each before tax and shipping. "B&H Photo" sometimes wins the bid (I don't know how or why), but you can basically get that same price within a couple bucks in units of 1 or 2 from their website. Note: we have no affiliation with B&H other than satisfied customers, and B&H do not win the bid every time.
With that said, if anybody knows how to get more than $2 off "retail" please PLEASE let us know!!
> I'm surprised Backblaze has published so much without getting into lawsuit trouble already.
:-) Plus I think the drive companies are aware of the "Streisand effect" https://en.wikipedia.org/wiki/... and don't want to call even more attention to the fact that every hard drive is fully expected to fail eventually.
Hopefully "the truth" is a valid defense?
I'd love to be able to publish these statistics for our organization, (I'd estimate we have close to a quarter million drives in the field) but there is a big hurdle in the way: legal liability. If I was to say something negative about Western-Sea-Tachi drives, their lawyers might call our lawyers, and we could easily spend a million in court fees.
The thing I think would be interesting is that we have a completely arbitrary mix of drives, based on drive availability over the last 6 years or so. We also have a mix of different service companies who replace the drives in our workstations. Our contract is such that we don't control the brands, or even the sizes, as long as they meet or exceed our specs. As a service organization, they're responsible for picking the cheapest option for themselves. If our spec says "40 GB minimum", and they can't get anything smaller than 500GB, they'll buy those. If 1TB drives are cheaper than 500GB drives, they'll buy those. And if we're paying them $X/machine/year for service, they can do the reliability decisions on their own, so if they think some premium drives will last two years longer than stock drives, they might be able to avoid an extra service call on each machine if they spend $Y extra per drive. I expect these service organizations all have their preferred drives, but that's not data they're likely to share with their competitors on the service-contract circuit.
John
I work at Backblaze.
Then you boys should make an app that every computer enthusiast can use that tracks smart stats/drive failures and collects them at your servers. It'd be great to monitor drives across the internet with an application that you could just have minimized to the taskbar, maybe you could kickstart the funds for one? Many of us would gladly pitch in to get reliable drive data on a massive scale. Many of us are on the net anyway it would be great to report drive usage/characteristics in realtime across the internet.
I've been using stuff like below to "wing" whether a drive needs to be replaced or not, but usually drives start clicking before they go.
http://panterasoft.com/hdd-hea...
> Hopefully "the truth" is a valid defense?
Libel and slander against an individual is generally invalidated if you're making a truthful and factual statement. There are exceptions, like when there is intention of malice. And the minute you layer any opinion onto what are straight facts, you're in fuzzy territory.
And statements published by a company about another company are not necessarily protected by the sort of free speech guidelines that guide individual interaction. I don't claim to know those rules. No larger company would publish this sort of information without passing it through legal counsel first to figure it out. And that overhead influences why those companies just don't bother.
The most common reasonable criticism of Backblaze's reports I've seen is that the drives are not being used in their intended environment. I would not want to be part of a legal defense where I had to legally prove the data originating from that use case is strictly factual commentary about the product.
"Research" sounds too official, more like "observations in our environment"
Step #1 of real science.
You do not have a moral or legal right to do absolutely anything you want.
Well if what I read on one of the forums (sorry I can't remember which, may have been Tom's) by a person claiming to be a former Seagate employee is true? Seagate suckage makes sense and moreover we now know WHY it happened all of a sudden.
Here is the skinny, according to the insider when Seagate bought Maxtor instead of Seagate making Maxtor better? It brought Seagate down to Maxtor levels. You see Maxtor had these ARM controllers that were dirt cheap to crank out, catch was you had to keep 'em in 5400 RPM drives (and even then they better be well ventilated) because if they got hot 1+1 could equal anything from 1-5 and so the controller would lose its little mind and forget where the end of the drive was and the drive geometry. Of course all the Seagate execs saw was $$$ on how much they'd save on the cost of manufacture so they started using them on ALL Seagate drives and...you know the rest. The reason why you can see a dozen shitty and one good in every batch is that when they run low on the shitty Maxtor chips they will occasionally use some of the more expensive Seagate chips, hence the pearl among the poop.
But after reading that it all made sense, the sudden plummet in quality, why drives below 750GB are good (those chips are based on the old Seagate chips and code) and why we see a good one among the crap, you get lucky on the ARM lotto. It all sadly makes sense, just more short sighted corporate douchebaggery.
ACs don't waste your time replying, your posts are never seen by me.
> I find it odd that the WD drives, at the 5400rpm speed, were able to write data faster than the 7200rpm Seagate drives. That seems counter-intuitive.
If there are fewer platters in the WD then the density will mean a speed boost even at a lower spin speed.
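A back-of-the-envelope sketch of that argument: sequential throughput is roughly revolutions per second times the data that passes under the head per revolution. The track capacities below are invented for illustration, not specs for either drive.

```python
def seq_throughput_mb_s(rpm: float, mb_per_track: float) -> float:
    """Rough sequential throughput: revolutions per second times MB per track."""
    return rpm / 60.0 * mb_per_track


# Hypothetical track capacities: a denser platter holds more data per track.
dense_slow = seq_throughput_mb_s(5400, 2.0)   # fewer, denser platters
sparse_fast = seq_throughput_mb_s(7200, 1.4)  # more, less dense platters

print(f"5400 rpm, dense track:  {dense_slow:.0f} MB/s")
print(f"7200 rpm, sparse track: {sparse_fast:.0f} MB/s")
```

Under these made-up numbers the slower spindle wins, which is all the argument needs; whether it explains Backblaze's result depends on the real platter counts and densities.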
I would personally like to see Western Digital sue Backblaze claiming that the WD Red drives specifically designed for NAS are not being used in their intended environment.
As for the criticism, I don't think there's such a thing as an intended environment for a HDD other than ruggedly mobile or stationary. I'm typing this from a laptop right now. Who is a harddrive vendor to say the level of vibration, temperature or movement my laptop experiences? At the same time I want those vendors to come out and tell me how their harddrives are not sitting in their "intended environment" when they are in a fixed rack serving up data, being kept in stable environmental conditions.
Personally I think the criticism is bullshit.
The intended environment for WD's drives includes a description of how many drives should be in the array. They are numbers like "NAS with 1 to 5 disks". They state that the lower tier models will not work well inside of massive arrays, where things like vibration need to be better controlled. Their more expensive models have specific technology (at an extra cost) aimed at keeping vibration related issues under better control.
Backblaze ignores those guidelines, putting drives that were not designed for the vibration of a dense drive array into one. When Backblaze drives fail, it's completely appropriate to ask "would they have failed there if they were used only as specified"?, which means putting them into smaller arrays. There's a very real possibility that the failure rate heavily reflects that unusual setup, and that it is not representative of reliability for the disks in other environments.
"NAS with 1 to 5 disks" is not an environmental spec.
The number of discs does not relate to the vibration or heat or any other factors. Those can only be measured directly. Now if WD specified that drives should not be placed in an environment where they will be subjected to x um vibration measured to some ISO standard then I would be right there with you.
How do 1-5 disks compare to a computer with 5 poorly balanced fans?
How do 1-5 disks compare to a single metal enclosure direct mounted, vs disks mounted via rubber grommets?
finally:
How do 1-5 disks placed horizontally next to each other or double stacked compared to drives mounted vertically and held in place with an anti-vibration sleeve such as the one used by Backblaze which they posted gave them a measurable performance improvement?
Even some braindead lawyer could point out the difference between a direct measurable specification and the completely subjective "NAS with 1-5 disks"
And as a side note Backblaze see no reliability differences between their consumer and enterprise grade drives, of which they have several thousand.
> The number of discs does not relate to the vibration or heat or any other factors.
They are correlated. More discs guarantees more vibration and heat, all other things being equal. Yes, there are other sources too, and all the other things are not equal. So what?
That you are calling "NAS with 1-5 disks" a subjective specification means you're not actually using words in a way I can respond to there. Whether Backblaze's custom modifications net better or worse levels of vibration is a complicated discussion that could use some direct measurements; agreed. But what's extremely clear is that they are not using the consumer drives in anything like a consumer environment. That means using their results as a commentary on what people will see in the broader consumer system market is extrapolation, with the obvious risks that come along with it.
For example, "Backblaze sees no reliability differences between their consumer and enterprise grade drives" is a fact. Saying "there is no reliability differences between consumer and enterprise grade drives" is an invalid extrapolation of that data.
Using your example, what if one of the consumer drive models has a serious vibration issue, and Backblaze's anti-vibration sleeve makes it wildly more reliable than it would otherwise be? That would make their statistics pretty worthless for consumers who don't have one of those sleeves. Home users might actually see better reliability with one of the enterprise drives that include anti-vibration technology in that case. That's all I was saying here--that you can't just assume their numbers will translate into other environments.
> Then you boys should make an app that every computer enthusiast can use that tracks smart stats/drive failures and collects them at your servers.
I think this would be really awesome. Here's where it gets neat-> we already have an app running in hundreds of thousands of desktop and laptop computers! (Our "online backup application" involves a tiny service that runs to send your files to the datacenter through HTTPS.) So if we just updated the client with a small amount of statistics tracking (and maybe a nice checkbox to opt in or out) then we could immediately start collecting info.
Sort of related: A few years ago I played an online 3D video game (can't remember which one, might have been Quake?) where you could report both your graphics card and RAM configuration to the server, and the server would list the aggregate statistics. So there is some precedent for this kind of data collection and publication.
> They are correlated. More discs guarantees more vibration and heat, all other things being equal.
No they aren't. It's an indirect association combined with a lot of subjective assumptions. Are vibrations in phase or out of phase? Are they co-coherent? Having 2 vibrating sources does not guarantee an increase in environment vibration. You can have anything from a doubling to a complete elimination to changing frequencies with no change in magnitude. Regardless of what you say measuring vibration in "number of disks in a NAS" is not at all any kind of engineering specification. It's like measuring load capacity in an elevator in persons without defining what a person actually weighs, which is also why it's not legal to write just the number of persons on the load capacity.
The reality is their pods will have wildly different vibration characteristics and I say this as a person who has spent the best part of the last 4 years working in industrial vibration monitoring. No two machines vibrate alike regardless of what environment you put them in. That's why predictive maintenance is not done based on absolutes but relative measurements. It is very possible that one of Backblaze's pods will vibrate itself to all crap, while another will experience even less vibration than a single drive PC.
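The in-phase versus out-of-phase point can be sketched with two idealized sources at the same frequency. Real drives are not pure sinusoids at one shared frequency, so treat this as the textbook case only.

```python
import math


def combined_amplitude(a1: float, a2: float, phase_rad: float) -> float:
    """Amplitude of a1*sin(wt) + a2*sin(wt + phase) for equal frequencies."""
    squared = a1 * a1 + a2 * a2 + 2.0 * a1 * a2 * math.cos(phase_rad)
    return math.sqrt(max(0.0, squared))  # clamp tiny negative rounding error


# Two sources of equal amplitude 1.0 at different phase offsets.
for deg in (0, 90, 120, 180):
    amp = combined_amplitude(1.0, 1.0, math.radians(deg))
    print(f"phase {deg:3d} deg -> combined amplitude {amp:.3f}")
```

At 0 degrees the amplitudes add to double, at 120 degrees the sum is no larger than a single source, and at 180 degrees they cancel.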
> But what's extremely clear is that they are not using the consumer drives in anything like a consumer environment.
Really? Because I keep my harddrives in a metal box with fans and sources of heat. Don't you?
> For example, "Backblaze sees no reliability differences between their consumer and enterprise grade drives" is a fact. Saying "there is no reliability differences between consumer and enterprise grade drives" is an invalid extrapolation of that data.
You're right. If you extrapolated the data properly you're saying that consumer drives are far more reliable than enterprise drives given that Backblaze sees no difference while running the enterprise drives within spec and the consumer drives outside of your mythical spec. I never thought about it that way.
> Using your example, what if one of the consumer drive models has a serious vibration issue, and Backblaze's anti-vibration sleeve makes it wildly more reliable than it would otherwise be? That would make their statistics pretty worthless for consumers who don't have one of those sleeves.
This is true but in any case you're stretching here and moving the goalposts of the argument. The original argument was that Backblaze are using drives outside of "spec" and what this mythical "spec" actually means.
I'd be happy to agree with you on random distribution of environmental variables in other circumstances, but let's define those variables first. I propose we define the environment in number of vehicles with a GVM of 18T driving within 80m of the computer. It's about as useful as number of the drives in the array.
Anyway I'm more than happy to follow Backblaze's numbers and translate them into other environments. I'd be even happier if you gave me some other data to work with. Because if we ignore Backblaze's numbers all we are back to is a 110 years MTBF if used with 1-5 drives in a NAS.
Actually maybe we should correlate it to solar activity. It would probably be more accurate than the manufacturer's useless numbers.
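For reference, an MTBF figure like the "110 years" above can be turned into an annualized failure rate. This is a minimal sketch assuming a constant failure rate (exponential lifetime model), which is the usual textbook reading of MTBF and not necessarily how any manufacturer derives its number.

```python
import math


def annualized_failure_rate(mtbf_years: float) -> float:
    """Probability that a drive fails within one year, assuming a constant
    failure rate of 1 / mtbf_years (exponential lifetime model)."""
    if mtbf_years <= 0:
        raise ValueError("mtbf_years must be positive")
    return 1.0 - math.exp(-1.0 / mtbf_years)


afr = annualized_failure_rate(110.0)  # the poster's figure, taken at face value
print(f"110-year MTBF -> AFR of about {afr:.2%}")
```

That works out to a bit under 1% per year, which is the number to compare against observed fleet failure rates.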