How To Add 5.5 Petabytes and Get Banned From Costco
concealment writes with this extract from GigaOm: "'We buy lots and lots of hard drives . . . . [They] are the single biggest cost in the entire company.' Those are the words of Backblaze Founder and CEO Gleb Budman, whose company offers unlimited cloud backup for just $5 a month, and fills 50TB worth of new storage a day in its custom-built, open source pod architecture. So one might imagine the cloud storage startup was pretty upset when flooding in Thailand caused a global shortage on internal hard drives last year. Backblaze details much of the process in a Tuesday-morning blog post, including the hijinks that followed as the company got creative trying to figure out ways around the new hard drive limits. Maps were drawn, employees were cut off from purchasing hard drives at Costco — both in-person throughout Silicon Valley and online (despite some great efforts to avoid detection, such as paying for hard drives online using gift cards) — and friends and family across the country were conscripted into a hard-drive-buying army."
Unlimited storage for $5/mo? I have to get on this shit.
.. buy direct or maybe some wholesale? Is such deliberate effort worth the actual cost?
How does that company stay in the black? Whatever, just goes to show how creative some people can be to get around an obstacle.
Seriously, what a bunch of assholes.
So instead of doing the capitalistic thing and gouging with insanely high prices, the shops instead started rationing drives for a sane price so everyone could get a little bit of the very limited supply.
That was actually a really good thing to do. Instead of profiteering, they tried to make the best of a bad situation for everyone.
Then a bunch of dicks like this figure that they're more important than everyone else and that they should be able to get more than anyone else.
Selfish bastards. Nothing but scum.
After reading this I will not be giving them my money.
SJW n. One who posts facts.
Hear the story direct from Backblaze (bonus: goes into more detail).
The real litigious bastards...
I'm confused. Was Costco selling these drives at a loss or something, just to get people in the door?
I can't think of many good reasons that they would look at customers coming in and buying assloads of their merchandise and say "NO! Get out of here and don't buy stuff from us ever again!"
Pourquoi?
... is pretty cheap (5$ is for a family account). But as BB itself says, you can only upload 2 to 4 GB per day.
They should be making a mint on that service! They use home-brew storage pods and are very open about it, too!
http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/
Anyway, be careful to read all the gotchas:
http://www.backblaze.com/remote-backup-everything.html (hint: 'everything' for a certain definition of everything. No virtual machines, ISO's and NAS storage by default.)
http://www.backblaze.com/internet-backup.html (hint: not all OSes are treated equally.)
(Full disclosure: I work for a storage manufacturer that sells de-duping storage so I think I understand their cost model a bit better than most.)
Karma? What's that again?
A backup in your basement does nothing for you if your house burns down/gets flooded/has a catastrophic power surge/whatever.
Where else can you back up offsite?
--PM
Several months ago I met someone from the Internet Archive (archive.org) who told a similar story. They weren't expanding their storage at the same pace as Backblaze, but they were also resorting to shucking external drives to build their rack mounted servers.
Religion is poison to rationality, and we lose sight of that at our own peril. -- Lurker2288
Yeah. I'll bet they're not even using oxygen-free SATA cables either.
Who cares what they store it on? What's important is that it's adequately checked for consistency, and what the backups are like. Everything else is detail.
Guess what. Google bought off-the-shelf computer gear for years and some datacentres run things without "datacentre grade" cooling. They don't suffer because a) they do it properly (i.e. not RELY on those drives to never fail) and b) nobody notices because the service is still more than good enough.
"Enterprise"-grade drives are just warrantied for longer. It doesn't mean they won't die just as quickly. Like "RAID"-grade drives - same drive as every other one on the production lines.
It's like saying you can't use Intel Mobile chips in a datacenter. It might not be your first choice, but provided they fulfil all their service obligations (which includes response times, failover, etc.) then who notices and who cares?
Every single server I've ever installed used "consumer grade" drives. Every single desktop I've ever installed used "consumer grade" drives. Failures are actually FAR BELOW any stated MTBF and, who cares, because it takes seconds to replace and DOES NOT AFFECT THE OVERALL SERVICE for the user. And no-one I've worked for has ever lost data because of a drive failure. Ever. Even when servers have all but caught fire.
The whole concept of online file storage makes no sense. Especially for consumers and especially in the U.S. where speeds are slow and costs are high. Getting your data into the "cloud" is extremely slow due to the fact that all ISPs severely restrict upload speeds. Then, once you finally get it all uploaded, getting it back will be difficult, even if you are fortunate enough to live in an area with decent speed, because you are probably one of the many millions of people whose only choice for broadband internet is the local cable monopoly, which means you probably have a monthly bandwidth cap, so good luck downloading all that data that will use up 2 or 3 months of your allowed quota.
Or you could just buy a couple of 2 or 3 TB drives and be done with it.
Any even marginally architected system can deal with disk failures, and indeed *must*. The difference between using masses of consumer disks and a few enterprise disks is that while a failed consumer disk in a massive pool might cause a slight slowdown spread across all of your users, a failed enterprise disk in a business-critical system can ruin your whole day even if you *don't* lose any data. Remember how Google just lets hardware die and replaces it on repeated passes through the datacenter? Same thing.
GStreamer - The only way to stream!
Come try it out!
If you're interested, they open sourced their hardware a few years ago.
http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/
/me sips his coffee and ponders a new sig...
If you're using hardware RAID it gets important, mostly the time limited error recovery bit, which keeps RAID cards from failing out the drive while it's trying to recover a block. Backblaze doesn't use hardware RAID, so it's a non-issue. Theirs is a scale wide, not deep strategy. When one person's restore can be spread around a dozen servers, your limiting factor quickly becomes their internet speed.
No sir I dont like it.
You should read on how they build their systems. One of the ways they keep costs so low is using consumer grade hardware with the idea that it will fail. In general, consumer grade hard drives have about the same failure rate as "enterprise grade", they just usually have lower transfer rates. When your clients are syncing over 768k DSL uploads or even 3-5 Mbps cable upload speeds, hard drive speed is not going to be your bottleneck.
They actually have a guy whose job it is to just go around a day or two a week through their data center and replace the dead drives. Due to the redundancy they built into their systems, a drive failure isn't a big deal or really unexpected.
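To put rough numbers on the "upload speed is the bottleneck" point, here is a back-of-the-envelope sketch. The link speeds are the ones quoted above; the function names, the 1 TB example size, and the assumption that the link runs flat out 24 hours a day are mine, so treat the results as an upper bound rather than what anyone actually sees.

```python
# Rough ceiling on how much a home connection can upload per day.
# Link speeds are taken from the comment above; full, continuous
# utilization is an assumption and real throughput will be lower.

def gb_per_day(link_bits_per_second: float, utilization: float = 1.0) -> float:
    """Decimal gigabytes uploaded in 24 hours at the given link speed."""
    bytes_per_second = link_bits_per_second * utilization / 8
    return bytes_per_second * 86_400 / 1e9


def days_to_upload(total_gb: float, link_bits_per_second: float) -> float:
    """Days of continuous uploading needed to push total_gb."""
    return total_gb / gb_per_day(link_bits_per_second)


if __name__ == "__main__":
    links = [
        ("768k DSL", 768_000),
        ("3 Mbps cable", 3_000_000),
        ("5 Mbps cable", 5_000_000),
    ]
    for label, bps in links:
        # 1000 GB (1 TB) is a hypothetical backup size for illustration.
        print(f"{label}: {gb_per_day(bps):.1f} GB/day, "
              f"{days_to_upload(1000, bps):.0f} days for 1 TB")
```

Even the slowest of those links tops out at single-digit GB per day, which is tiny next to what a spinning disk can sustain, so consumer-drive transfer rates never come into play.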
photography eh?
Nudge, Nudge, wink wink
Say no more
More of a feature than a bug.
They feared that it could be used to suppress protest or support unpopular rule.
I think I had a time limited error recovery bit trying to read that post.
Tic-Tac-Toe, Global Thermonuclear War, and relationships all have the same winning move.
Apparently you believe:
a) they store only one copy of your data
b) that "enterprise" hard drives are somehow better quality
Here's a hint: http://lmgtfy.com/?q=google+hard+drive+report
Costco has a corporate policy that limits revenue from sales to 10% above their cost. This 10% covers their overhead costs (buildings, employees, distribution, etc). 100% of Costco's profit comes from their membership fees. Depending on the amount of fuel sold per quarter they may turn a very small profit on this 10% or they might not.
Costco has NO profit incentive to sell one customer more of a product if that means pissing off other customers. Their profits come almost exclusively from membership fees, hence the drive to get everyone signed up for executive memberships.
The intent is not to back up all your porn and movies remotely. It's to back up critical data as well as provide remote access to some files. I strongly believe in a combination of local backups and remote backups. E.g. I back up my project data, programming files and emails to remote storage. I also have a local backup which is everything. My remote backup is just over 35 GB and is synced and keeps a file version history so I can even go back in time.
I hear stories like this all the time, though they rarely pan out. Granted, it is slightly more likely at a warehouse club where you need a member ID to make a purchase, but it still doesn't seem that likely. I'm no particular fan of Costco, but I would love to hear their side of the story.
Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
The difference between an internet search engine and a cloud backup solution is that "good enough" for a search engine is that the results are satisfying enough that I keep returning to that particular search engine. "Good enough" for a backup solution means I can read my data back. Very different situations. Write-only storage isn't terribly useful.
Anyone who thinks enterprise grade drives and consumer grade drives are the same either hasn't ever seen an enterprise drive or they haven't actually compared. Here's a little experiment for you. Get an actual enterprise grade drive and an actual consumer grade drive and place them on a scale. The fact that they don't weigh the same should tell you something about how different they really are. Consumer grade drives use higher density platters, they use different head positioning mechanisms, and they have different motors.
FWIW exactly zero of the servers in the data center I work in have 7200 RPM drives. Exactly zero of the servers in the data center I work in have 1+ TB drives. Exactly zero of the servers in the data center I work in have SATA drives. In our SAN attached arrays we have over three thousand five hundred drives, zero of them are SATA. Zero of them are 7200 RPM. If we were a small business things might be different, but to us our data matters and it being available matters.
It's safe to assume that your snarky lmgtfy link goes with your point b? Have you actually read that report, in particular the first paragraph of section 2.2 where they state:
How exactly is a study of consumer-grade disk drives supposed to tell us anything about enterprise drives?
I'm surprised no one mentioned recently started Amazon Glacier service.
They do the same thing - probably more reliably.
The pricing is $0.01 per GB / month.
But there is a 'gotcha': the service is ideal for archival storage and long term backup. It is not just for random cloud storage. A retrieval request takes 3-5 hours to fulfill and if you start downloading/retrieving too much, too often, you pay substantially more.
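For comparison with the $5/mo flat rate from the summary, a quick sketch of where per-GB pricing crosses over. Only the two prices come from this thread; the function names and the example sizes are hypothetical, and retrieval and request fees (the 'gotcha') are deliberately left out because no figures for them are given here.

```python
# Storage-only cost comparison. Both prices are quoted in the thread;
# retrieval and request charges are NOT modelled (no figures given here).

GLACIER_USD_PER_GB_MONTH = 0.01  # quoted above
FLAT_PLAN_USD_PER_MONTH = 5.00   # Backblaze price from the summary


def glacier_storage_cost(stored_gb: float) -> float:
    """Monthly storage charge in USD at the quoted per-GB rate."""
    return stored_gb * GLACIER_USD_PER_GB_MONTH


def break_even_gb() -> float:
    """Stored size at which per-GB pricing equals the flat plan."""
    return FLAT_PLAN_USD_PER_MONTH / GLACIER_USD_PER_GB_MONTH


if __name__ == "__main__":
    for size_gb in (100, 500, 2000):  # hypothetical backup sizes
        print(f"{size_gb} GB: ${glacier_storage_cost(size_gb):.2f}/mo per-GB "
              f"vs ${FLAT_PLAN_USD_PER_MONTH:.2f}/mo flat")
    print(f"break-even at about {break_even_gb():.0f} GB")
```

Below the break-even size the per-GB service is cheaper to park data in; above it the flat plan wins on storage alone.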
witold.org
Costco generally limits markup to 15%, not 10%. Also, certain state laws require that Costco apply minimum markups to the selling prices for specific goods, such as tobacco products, alcoholic beverages, and gasoline. Of course, some products are marked down for quick sale. However, the resultant average gross margin target is around 10%.
They do, however, attempt to control their SG&A (overhead) to match their gross margin target of 10%. The net corporate profit is from membership fees which is why they try so hard to get you to sign up for executive memberships...
Google did a study on consumer grade drives a while ago...
http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/disk_failures.pdf
And here's how NetApp (one of those "enterprise" guys) responded...
http://storagemojo.com/2007/02/26/netapp-weighs-in-on-disks/
This tidbit known mostly to industry insiders is largely true, especially when comparing comparable drive sizes. But how storage arrays handle the respective drive type failures is what continues to perpetuate the customer perception that more expensive drives should be more reliable. One of the storage industry’s dirty secrets is that most enterprise and consumer drives are made up of largely the same components. However, their external interfaces (FC, SCSI, SAS or SATA) and most importantly their respective firmware design priorities / resulting goals play a huge role in determining enterprise vs. consumer drive behavior in the real world.
Let's assume for the sake of argument that your SAS drives have double the MTBF of a 7200 RPM SATA drive. For simplicity of the arithmetic, let's call the SATA MTBF 1 year, and your SAS enterprise drives' MTBF 2 years. Presumably you don't rely only on this, you also put them in some form of RAID so that you can survive perhaps two drives in any 8 failing, or however you configure your RAID. So you're breaking out the tapes when three drives in a block fail.
The person on SATA pays around a quarter what you do per GB. That means they can effectively RAID 1 your setup and gain an extra drive failure. Their 7200 are slower than your drives but they also have twice as many spindles now so they have an effective speed of over 14,000 RPM.
So, which is more likely, three of eight SAS drives failing before your replacements arrive from HP, or four of eight SATA drives failing before replacements arrive from COSTCO?
Reliability can either be bought through expensive components or through redundancy, and redundancy will beat quality components on pretty much everything except space and power.
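The "which is more likely" question can be sketched numerically. The MTBF figures (1 year SATA, 2 years SAS) and the 3-of-8 versus 4-of-8 thresholds are the ones used above; the 7-day replacement window, the exponential lifetime model, and the independence of failures are my assumptions, and the function names are made up for illustration.

```python
from math import comb, exp

# Assumptions (not from the thread): drive lifetimes are exponential,
# failures are independent, and replacements take WINDOW_DAYS to arrive.


def p_fail_within(window_days: float, mtbf_days: float) -> float:
    """Chance a single drive dies inside the window."""
    return 1.0 - exp(-window_days / mtbf_days)


def p_at_least(k: int, n: int, p: float) -> float:
    """Chance that k or more of n independent drives fail, each with chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def p_array_loss(k: int, n: int, mtbf_days: float, window_days: float) -> float:
    """Chance the array loses k or more of its n drives before replacements land."""
    return p_at_least(k, n, p_fail_within(window_days, mtbf_days))


if __name__ == "__main__":
    WINDOW_DAYS = 7  # hypothetical replacement lead time
    sas = p_array_loss(3, 8, 2 * 365, WINDOW_DAYS)   # three of eight SAS drives
    sata = p_array_loss(4, 8, 1 * 365, WINDOW_DAYS)  # four of eight SATA drives
    print(f"SAS,  3 of 8 within {WINDOW_DAYS} days: {sas:.2e}")
    print(f"SATA, 4 of 8 within {WINDOW_DAYS} days: {sata:.2e}")
```

Under these assumptions the extra tolerated failure outweighs the halved MTBF for any short replacement window. Real failures are often correlated (a bad batch, a shared power event), which this model ignores.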
I do apologize. There were two other links as well:
http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/
http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/
A quick search on "consumer" and "enterprise" on those two pages will help. You'd probably like the bit where they discovered that enterprise grade western digital drives fail at a higher rate.
By the way, my original point was that consumer drives perform just as well. Those are the drives in my nice X4540 box too.
Yeah, it's a cute story but I wouldn't trust them with my backup data.
These people didn't have a risk plan for their business that included hard drive shortages?
Their product isn't worth $10/mo ? (raising price)
Can you imagine an airline that doesn't factor in changes in fuel prices? Or a farm that doesn't have a plan for drought?
Bone for tuna!
I'm a 2000 man.
Yeah, the clue is pretty much in the (now apparently forgotten) original meaning of RAID.
"Inexpensive" disks.
Most online businesses eventually find that 'unlimited' = 'bankrupt once the VC cash runs out'
I use Amazon S3 via Jungledisk (still waiting for Google Drive on Linux). I use it in preference to all cheaper options as I understand it's profitable and therefore will probably still exist if I ever need to do a full reinstall due to some disaster wiping out my onsite copies.
I have been a user for about 10 years. This ends Feb 2014. The site's been ruined. I'm off. Dice, FU
Yeah, and I remember well when Berkeley Uni on their shiny expensive-disks SAN had their whole email system go up in flames for over a week due to a single failed drive and the extra IO hit to recover. When the same thing happens to me with the cheap SATA crap we use, I just move all the masters from that machine to other machines and let it rebuild. No loss of service, minimal impact spread over a large pile of users.
Apart from a really bad batch of 300GB 10kRPM drives a couple of years ago, it's been very easy. Roughly one failure per month. Systems designed for rapid failover. No worries.
Even in the horrible case where I lost a whole machine and had to rebuild from scratch, only about 5% of users were affected by noticeable slowdowns because they were on the source drives for the re-replication, and had to compete for IO. I could have reduced the impact on them by slowing the replication, but that's longer without full redundancy.
(this is all RAID1 as well)
There's more than one way to do it. I care about our users' data plenty, which is why it's on 6 separate live spindles PLUS backup.
http://whatwouldazen.wordpress.com/2012/09/20/the-three-types-of-data/
I laughed. I hated myself for it, but I laughed.
Personally, I use a Subversion repo provided by my web host. I've cut my restart time after a full wipe and reinstall from 2-3 hours down to less than one. My documents and personal file structure are completely contained in the repo, so all I have to do is start the download, and then go install all my programs while my documents recreate themselves locally.
Bits of code, random ramblings: jakimfett.com
I'm not sure who you are disagreeing with, but it's not me. My point was that there is a difference between consumer and enterprise grade drives. Not that enterprise grade is more reliable than consumer grade. The main differences I listed have to do with performance and the ability to get one's data off the drive.
Here are some basic facts about storage for you. A 5400 RPM drive has an average rotational latency of 5.5 milliseconds. A 7200 RPM drive averages 4 milliseconds. A 10K RPM drive, 3 milliseconds. And 15K RPM averages 2 milliseconds. This is just basic math -- any particular block on the drive will sometimes be right before the head and sometimes right behind the head. On average the platter has to spin halfway around to bring that block to the head.
But that's only a portion of the performance of a drive. The other part is how long it takes the head to move from track to track. This is much more design dependent. But in general enterprise drives are expected to have a seek time in the 3-5 millisecond range and consumer drives run 5 and up, typically 5-10 milliseconds.
Add these up and a typical 15K RPM drive will have about 6 ms latency and a typical 7200 RPM drive will have 11 ms latency. Which means that a 15K RPM drive can do approximately 165 random IO operations per second at its typical latency (normally measured in terms of 4k, 8k or 16k IOs.) A 7200 RPM drive can do approximately 90 random IOPS. This is a big deal when dealing with multi-user server applications.
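The arithmetic above is easy to reproduce. In this sketch the half-revolution rule is the one described in the post; the specific seek times (4 ms for the 15K drive, 7 ms for the 7200 RPM drive) are assumed values picked from the quoted ranges so the totals land near the 6 ms and 11 ms figures, and the function names are mine.

```python
# Average rotational latency is the time for half a revolution;
# random IOPS is roughly 1000 ms divided by (rotational latency + seek).
# Seek times below are assumed values from within the ranges quoted above.


def avg_rotational_latency_ms(rpm: float) -> float:
    """Half a revolution, in milliseconds."""
    return 0.5 * 60_000 / rpm


def random_iops(rpm: float, avg_seek_ms: float) -> float:
    """Approximate random IOs per second for one spindle."""
    return 1000 / (avg_rotational_latency_ms(rpm) + avg_seek_ms)


if __name__ == "__main__":
    for rpm in (5400, 7200, 10_000, 15_000):
        print(f"{rpm:>6} RPM: {avg_rotational_latency_ms(rpm):.2f} ms rotational latency")
    print(f"15K RPM, 4 ms seek:  {random_iops(15_000, 4):.0f} IOPS")
    print(f"7200 RPM, 7 ms seek: {random_iops(7200, 7):.0f} IOPS")
```

The small differences from the rounded figures in the post come from rounding the rotational latencies (7200 RPM is closer to 4.17 ms than 4 ms).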
Additionally, all SAS and FC drives are dual ported and SAS and FC fabrics are multi-initiator. Which allows them to be deployed in fully redundant and fully active configurations (two paths between server and array, two controllers in the array, two mirrored caches and two paths from each controller to each disk.) A SATA drive has one port. There are port multiplexers that can be inserted between the drive and the chassis, but because the drive itself is natively single ported, only one of the multiplexed ports can be active at a time and thus are limited to having fail-over between the controllers rather than active-active controllers.
As far as RAID performance goes... Two mirrored 7200 RPM drives do not provide the equivalent of a 14.4K RPM drive. Minimum latency is limited by the speed of a single drive regardless of RAID of any type. Here's what a two drive RAID1 gets you: one redundant copy of your data. Twice as many read operations at the same latency as a single disk. And the write performance of a single disk. Because you can do twice as many read operations, you get double the read bandwidth. Yes you can add more drives to your mirror, but there comes a point where the rest of your storage subsystem becomes less redundant than your drives. RAID5 (or RAID6, or RAIDZ, or RAIDZ2, etc.) gets you redundancy to the level of however many disks worth of parity your system implements. For a standard RAID5 that is a single disk failure, for RAID6 it's two disks, etc. Read performance increases as a multiple of the number of drives in your raid group. Write performance is a read and a write of your data block plus a read and write for each parity block. For RAID5 that means that each write will do four IO operations into the raid group. So an eight drive raid group should get double the write performance of a single drive. Of course any array that one would use in an enterprise environment will have at least two battery backed up caches, which makes any write penalties moot as all writes are cached.
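The read/write scaling described above can be written down as a small model. The penalties follow the rule given in the post (one read and one write for the data block, plus one read and one write per parity block; two writes for a mirror). The 90 IOPS per spindle figure is the earlier 7200 RPM estimate, the names are hypothetical, and controller caches and full-stripe writes are ignored.

```python
# Back-end IO operations caused by one small random write, per the rule above.
# Caching and full-stripe writes are ignored (a simplifying assumption).
WRITE_PENALTY = {
    "raid1": 2,  # write both copies
    "raid5": 4,  # read+write data, read+write one parity block
    "raid6": 6,  # read+write data, read+write two parity blocks
}


def random_read_iops(drives: int, per_drive_iops: float) -> float:
    """Reads scale with the number of spindles in the group."""
    return drives * per_drive_iops


def random_write_iops(drives: int, per_drive_iops: float, level: str) -> float:
    """Writes scale with spindles divided by the per-write IO penalty."""
    return drives * per_drive_iops / WRITE_PENALTY[level]


if __name__ == "__main__":
    PER_DRIVE = 90  # rough 7200 RPM figure from earlier in the thread
    print("2-drive RAID1 reads: ", random_read_iops(2, PER_DRIVE))
    print("2-drive RAID1 writes:", random_write_iops(2, PER_DRIVE, "raid1"))
    print("8-drive RAID5 writes:", random_write_iops(8, PER_DRIVE, "raid5"))
```

This reproduces the two claims in the post: a mirror writes at single-disk speed while doubling reads, and an eight-drive RAID5 group writes at about twice single-disk speed.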
As far as reliability goes, that's an interesting question. The fact is hard drives die. However the premium I pay buying hard drives from my storage or server vendor includes 4 hour replacement SLAs in western countries and in less developed areas it's typically 24 hours. I don't know what Costco's policy is, but I'm sure it doesn't involve bringing the HDD to me today and I'd be surprised if I could show up three years later and have them replace my HDD with a matching device. Additionally consumer grade drive
Good story. But has nothing to do with whether enterprise drives are the same as consumer drives.
However if you want to compare penis sizes, how about when a drive fails in either our Hitachi VSP or AMS arrays, they automatically rebuild using a hot spare, phone home and order a replacement, and there is no impact on performance and none of the users are even aware it happened. All this while carrying 40,000+ IOPS, over 99% of which are serviced at 10 ms latency.
I am baffled as to why people think their one -- often special purpose -- use case is definitive for all storage situations the world over.
An Enterprise RAID array isn't strictly about redundancy (although it sounds like that was the point Score Whore was trying to make). It is also about performance. Let's say you are trying to make a 100TB SAN. You can do this using the strategy you outlined, by using 3TB drives and doing a RAID 1 on them. So, 100TB / 3TB = 34 drives * 2 (RAID 1) = 68 drives. Each spindle on a 7200 RPM SATA drive only delivers about 75 IOPs, so that gives you 5100 IOPs Total.
In an Enterprise environment, you are probably going to need a lot more than 5100 IOPs in a 100TB SAN. So, let's say you decide to use 300GB 15k SAS drives. Those give you about 175 IOPs per spindle. If you use the RAID 6 strategy you outlined, which I am fond of myself, (6+2, or 2 failures out of every 8 drives), that would put you around 448 disks total (448 / 8 disks per RAID set = 56 sets * 6 usable drives per set = 336 usable drives * 300 GB = 100800GB). With the 448 spindles, 448 * 175 IOPs = 78400 IOPs. That's a little bit closer to what we're looking for. Throw in a few spares at 30:1 (15 drives), to put you to 463 drives.
How many SATA drives would it take to match the IOPs in a RAID 1 configuration? 78400 IOPs / 75 IOPs per drive = 1046 drives. Spares at around 30:1 means another 35 disks, for 1081 total.
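The sizing walk-through above can be checked with a few lines. All the inputs (3TB SATA at 75 IOPs, 300GB SAS at 175 IOPs, 6+2 RAID 6, RAID 1 for SATA, spares at 30:1) are the figures from the post; the function names are mine, and like the post this counts raw spindle IOPs without any RAID write penalty.

```python
from math import ceil

SPARE_RATIO = 30  # one hot spare per 30 drives, as in the post


def with_spares(drives: int) -> int:
    """Drive count including hot spares, rounded up."""
    return drives + ceil(drives / SPARE_RATIO)


def raid1_drives_for_capacity(capacity_gb: float, drive_gb: float) -> int:
    """Mirrored drives needed to reach the usable capacity."""
    return 2 * ceil(capacity_gb / drive_gb)


def raid6_drives_for_capacity(capacity_gb: float, drive_gb: float,
                              data: int = 6, parity: int = 2) -> int:
    """Drives needed when capacity is built from whole data+parity sets."""
    sets = ceil(capacity_gb / (data * drive_gb))
    return sets * (data + parity)


def drives_for_iops(target_iops: float, per_drive_iops: float) -> int:
    """Spindles needed to reach a raw IOPs target."""
    return ceil(target_iops / per_drive_iops)


if __name__ == "__main__":
    sata_capacity_only = raid1_drives_for_capacity(100_000, 3_000)
    print("SATA sized for capacity:", sata_capacity_only, "drives,",
          sata_capacity_only * 75, "IOPs")

    sas = raid6_drives_for_capacity(100_000, 300)
    sas_iops = sas * 175
    print("SAS 6+2:", sas, "drives,", sas_iops, "IOPs,",
          with_spares(sas), "with spares")

    sata_matched = drives_for_iops(sas_iops, 75)
    print("SATA sized for IOPs:", sata_matched, "drives,",
          with_spares(sata_matched), "with spares")
```

The point the numbers make is that once the array is sized for IOPs rather than capacity, the SATA build needs more than twice as many spindles.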
Next we factor power into that. With a Google search, I averaged typical power consumption from 8 7200 RPM 3TB SATA drives (8.6875 W), giving you 9391.188 W for the SATA array. For the 15k 300GB 3.5" SAS drive, it seemed like the most common Google results came back to the Seagate Cheetah, and the data sheet for that one says 7.92 W typical, or 3666.96 W for the SAS array, which means that the SATA array would require 2.5 times the power. More drives, more power means more cooling (and obviously more space as well).
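The power comparison is a straight multiplication, sketched here with the drive counts (1081 SATA, 463 SAS, spares included) and the per-drive wattages quoted in the post. Nothing else is modelled: no controllers, enclosures, or cooling overhead, and the names are mine.

```python
# Drive counts and typical per-drive wattages as quoted in the post.
SATA_DRIVES, SATA_WATTS_EACH = 1081, 8.6875
SAS_DRIVES, SAS_WATTS_EACH = 463, 7.92


def array_watts(drives: int, watts_per_drive: float) -> float:
    """Total typical draw of the drives alone (no enclosure or cooling)."""
    return drives * watts_per_drive


if __name__ == "__main__":
    sata_w = array_watts(SATA_DRIVES, SATA_WATTS_EACH)
    sas_w = array_watts(SAS_DRIVES, SAS_WATTS_EACH)
    print(f"SATA array: {sata_w:.2f} W")
    print(f"SAS array:  {sas_w:.2f} W")
    print(f"ratio:      {sata_w / sas_w:.2f}x")
```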
It all depends on what you are trying to accomplish. In an enterprise environment, space, cooling and power are often big concerns. Depending on your environmental limitations and other factors (i.e. regulations, compliance, etc), money isn't always the primary motivator - that all depends on the nature of the business. If you work in a business that is heavily regulated, then you will likely not bet your job on a bunch of SATA disks to store your 5 (or 7, 10, or more) years worth of data that must be searchable, discoverable, highly available, etc (ok, you might bet your job on it, but I'm not going to bet mine on it). Most likely, you are going to tell your company that to protect that data (and potentially your job, depending on your responsibilities), they need to shell out for a costly SAN. Perhaps even 2 geo-redundant SANs that are replicated. Then, you might put a bunch of SATA disks behind that with a backup agent for another layer of protection. Then you might also dump that data to tapes. Which you then ship offsite. Because if things get ugly, you don't want to be the decision maker or recommender who proposed the SATA disks because they were the good enough solution.
Or maybe you do want to be in that position. But I sure don't want to be there. I'm a big fan of well-developed DR/BC plans and highly available infrastructure. When things are working, there are many solutions that can work well. However, when things stop working, you have to have a well-formed plan in mind to recover from the failure. And "we'll just get replacement drives from COSTCO" isn't a particularly well-formed plan (in fact, where I work, even suggesting that would probably result in termination). If you have to wait more than 4 hours to get replacement drives from HP, you should probably look at another storage vendor. Besides, your array should have enough hot spares for the array to rebuild itself even in the event that you don't get those drives in a timely manner.
TL;DR Higher performance disks may be required over cheap disks. It's not always just about redundancy. The same shoe doesn't fit everyone!
Dude, you're giving Godwin's law a bad case of blueballs. I hear you loud and clear. Genocide is good for business. Or they wouldn't be doing it. This view that rationality is whatever people do is tantamount to Godwin's Stargate.
Pretty much the whole of where we need to focus our attention in this messy world is the universal patina in human affairs of lapsed rationality.
The economic premise of "expressed preference" (that the discrepancy between stated goals and actions lies in the verbal blather) was a viable (if narrow and naive) hypothesis so long as the brain remained an inscrutable black box. We have fMRI now. The underpinnings of a consistency of expressed preference are barely detectable under the junk heap of cognitive artefact.
Yet your little sermon remains a popular sermon. I'd be very interested to know what mental glee button it activates in the sermonizer. I'd also be interested to know why my mental "sigh" reflex is more powerful than most. Long ago I was attracted to Brouwer's intuitionism, where reasoning from contradiction is prohibited (this makes the enterprise less brittle to error, but also greatly impedes progress).
Hilbert famously retorted "Taking the Principle of the Excluded Middle from the mathematician ... is the same as ... prohibiting the boxer the use of his fists."
I guess my fists don't give me a hard-on.
Yev from Backblaze here -> since we've always used consumer level gear and actually designed our storage pods around them, we factored higher failure rates and inconsistencies into the equations. We know that drives fail (this is why we started an online backup company), and we designed around it so that even if we lose drives or entire arrays, it isn't fatal to the storage. You can read the philosophy here (http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/).