Domain: backblaze.com
Stories and comments across the archive that link to backblaze.com.
Comments · 162
-
Re:Why?
-
Re:Averages with how much deviation?
This is the correct answer. We are about to get broadband competition.
BTW: all a network provider has to do to put Netflix's datacenter closer to their customer and improve their score is to call up Netflix and get some of these cool cache boxes modeled after the BackBlaze box. They're FREE.
-
Re:interesting.
Backblaze's pods cost $7,500 for 135TB or $55 per TB.
Let's be generous and say one employee needs to maintain a Pod once per month for one hour (way higher than our server management). Let's assume they consume 1KW of power each. Even if the pod burned out every other year, unlikely but we'll be generous, to Amazon that works out to ($3750) + (12 months*$100/hr) + ($0.10/KWh * 24hr * 365 days) = $5826/12 = $485 per month / 135 TB = $3.59 per TB per month. That's assuming a 50% failure rate in HDDs per year.
Let's also not forget bandwidth. Backblaze is looking to upgrade to 10gbps for their datacenter (OC-192), which runs about $200k per month from what limited data I could find. They currently have 40 petabytes, which works out to $200k/40kTB = $5 per month per TB.
So our total is about $9 per TB per month for Backblaze
vs
$30 per TB per month for AWS.
But that's just my estimate. You could easily cut almost all of those support time costs. You could very easily get 4 years not 2 years out of a pod and you could probably use green drives that use less than 200 watts, cutting your power by 80%. In other words $9 per TB is generous and assumes that they actually need 10gbps right now.
Backblaze's official numbers are $100k per 3 years for 1PB.
http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/
$100k / 36 months / 1,000TB = $2.78 per TB per month.
Most people probably have 1TB of backup needs.
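The estimate above can be re-run as a short script. This is only a sketch of the commenter's arithmetic: every constant is copied from the comment (none is an official Backblaze or AWS figure), and the function names are made up for illustration.

```python
# All inputs are the commenter's assumptions, not measured or official values.
POD_COST_USD = 7_500            # one 135 TB pod
POD_CAPACITY_TB = 135
POD_LIFETIME_YEARS = 2          # pessimistic: pod burns out every other year
LABOR_USD_PER_HOUR = 100        # one maintenance hour per month
POWER_KW = 1.0                  # assumed draw per pod
POWER_USD_PER_KWH = 0.10
BANDWIDTH_USD_PER_MONTH = 200_000
FLEET_CAPACITY_TB = 40_000      # 40 PB


def hardware_cost_per_tb_month() -> float:
    """Amortized pod + labor + power, per TB per month."""
    yearly = (
        POD_COST_USD / POD_LIFETIME_YEARS
        + 12 * LABOR_USD_PER_HOUR
        + POWER_KW * POWER_USD_PER_KWH * 24 * 365
    )
    return yearly / 12 / POD_CAPACITY_TB


def bandwidth_cost_per_tb_month() -> float:
    """Monthly bandwidth bill spread over the whole fleet."""
    return BANDWIDTH_USD_PER_MONTH / FLEET_CAPACITY_TB


def official_cost_per_tb_month() -> float:
    """The quoted $100k per petabyte over three years."""
    return 100_000 / 36 / 1_000


if __name__ == "__main__":
    hw = hardware_cost_per_tb_month()
    bw = bandwidth_cost_per_tb_month()
    print(f"hardware+labor+power: ${hw:.2f} per TB per month")
    print(f"bandwidth:            ${bw:.2f} per TB per month")
    print(f"total:                ${hw + bw:.2f} per TB per month")
    print(f"official figure:      ${official_cost_per_tb_month():.2f} per TB per month")
```

Changing POD_LIFETIME_YEARS to 4 or POWER_KW to 0.2 shows how much of the roughly $9 total comes from the pessimistic assumptions rather than from the hardware itself.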
-
Re:All on consumer grade drives.....
I do apologize. There were two other links as well:
http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/
http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/
A quick search on "consumer" and "enterprise" on those two pages will help. You'd probably like the bit where they discovered that enterprise grade Western Digital drives fail at a higher rate.
-
They Open Sourced their Hardware
If you're interested, they open sourced their hardware a few years ago.
http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/
-
3.96$ a month...
... is pretty cheap (5$ is for a family account). But as BB itself says, you can only upload 2 to 4 GB per day.
They should be making a mint on that service! They use home-brew storage pods and are very open about it, too!
http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/
Anyway, be careful to read all the gotchas:
http://www.backblaze.com/remote-backup-everything.html (hint: 'everything' for a certain definition of everything. No virtual machines, ISO's and NAS storage by default.)
http://www.backblaze.com/internet-backup.html (hint: not all OSes are treated equally.)
(Full disclosure: I work for a storage manufacturer that sells de-duping storage so I think I understand their cost model a bit better than most.)
-
Skip the blogspam
Hear the story direct from Backblaze (bonus: goes into more detail).
-
You can't afford $4 per month?
I use backblaze -- $47.50/year for a two year term and unlimited storage. For the mathematically challenged among us that's $3.96/month. Skip a couple cups of coffee a month and sleep better in more ways than one. As a bonus they show you how to build one of their 135TB storage pods here http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/ .
-
Backblaze
I use backblaze http://www.backblaze.com/ for off-site backups. $50/year for unlimited storage is more than reasonable. I currently have about 2.5TB backed up there.
-
BackBlaze
A cloud backup service released information on how they build their own disk based backup servers. Maybe something that would help with your endeavor? http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/
-
Re:Lots more than just CPU and transfer resistors.
3: Backups.
I have a Time Capsule and use Backblaze. I'm set in that department (I'd like to think).
-
Re:Why?
My Sun x4540 box laughs at you. And if you believe NetApps and EMC use RAID "cards", well, I guess, pass the pipe. A better computer does not mean the most expensive computer. It just means you don't buy the cheapest thing from walmart.
And your motherboards using FakeRaid? And you are using that as an argument? Shows how much enterprise shit you get to play with.
Anyone building a serious enterprise storage capability knows better than to waste money on raid cards.
You think Google, Facebook, and those cloud providers use RAID cards? Seriously?
http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/ sure don't show any RAID cards in there.
-
Re:old timers look here
If the temps are in "operating ranges" which run higher than you might think (check with the hard drive manufacturer for specs), temperature doesn't correlate to drive failure:
http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/#more-337
Look for the "lessons learned" section in that link.
-
Re:Tape never died or lost its supremacy
A backblaze box. 1PB for about $55k.
ZZZZZAAAP.
That was the lightning strike that wiped out your $55K cheap solution where you're storing the data SOX requires you to keep.
Ooops.
Now you get to explain to the execs who now risk jail time why you were SOOOO fucking smart.
Sometimes it really is about covering your ass with the legally-acceptable conservative approach.
Nevermind all the money you wasted paying to keep those disks spinning....
Know how much electricity 50 or 100 petabytes of tape use?
None.
-
Re:Tape never died or lost its supremacy
A backblaze box. 1PB for about $55k.
-
Re:Keep a spare blank drive around
Link contains a referral ID, so Shikaku is earning from this, but not willing to say so.
Eventually, it ends up at http://www.backblaze.com/
-
How small is your basement?
Internet Archive's last published generation Petabox (now more than a year old, so they were using smaller drives) would take two racks ... which is still reasonable, but you could probably fit it in a single rack with today's drives. A Backblaze Pod is 45 disks in 4U, so you could do it yourself and, assuming you can get enough large disks after that whole flooding thing, be able to get a PB in a single rack easily. The Sun Thumper took 48 disks in 4U ... I don't know if the X4540 ever supported larger than 1TB disks, though.
My department just got a Nexsan E60 in yesterday ... 60 3TB disks in 4U, so you can squeeze 1.8PB raw in a 42U rack. (usable space ... still more than a PB, even with spares.)
So, space isn't the issue ... power and cooling may be, though.
-
Just cuz they have the coolest looking equipment..
http://www.backblaze.com/ - I always wanted to build one of those 200TB units for $8k too
-
Re:It'd better happen quick then
Do you have failure rates for spinning drives? A little research might just show you that they are basically on par with the numbers you are quoting.
http://www.pcworld.com/article/131168/harddrive_failures_surprisingly_frequent.html
http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/
"All told, Sean replaces approximately 10 drives per week, indicating a 5 percent per year drive failure rate across the entire fleet, which includes infant mortality and also the higher failure rates of previous drives. (We are currently seeing failures in less than 1 percent of the Hitachi Deskstar 5K3000 HDS5C3030ALA630 drives that we’re installing in pod 2.0.)"
For reference, when comparing A and B you should remember to actually quote the data for both A and B and not just one or the other as your proof that one is better than the other.
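As a sanity check on the quoted figures, the two numbers (10 replacements per week, 5 percent per year) imply a fleet size. The sketch below only inverts the annual-failure-rate formula; the resulting drive count is an inference, not a number Backblaze published, and the function name is hypothetical.

```python
WEEKS_PER_YEAR = 52


def implied_fleet_size(weekly_replacements: float, annual_failure_rate: float) -> float:
    """Drives in service, given replacements per week and the annual failure rate."""
    return weekly_replacements * WEEKS_PER_YEAR / annual_failure_rate


if __name__ == "__main__":
    # 10 drives per week at 5 percent per year, as quoted from the blog post.
    print(round(implied_fleet_size(10, 0.05)))
```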
-
Re:Who uses tape any more?
There's no need for a robot when all the "tapes" (aka HDDs) are all accessible while stored: http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/
(IIRC this was on Slashdot before).
But if an org only needs 25 drives or a similar magnitude, you can go Dell (e.g. stuff like MD1000 or MD1220) for many times the price/capacity. Most orgs with those needs can afford it.
If I were running my own company I'd go with Dell/etc and HDDs first, but if my required storage capacity curve goes up steeply, I'd do the backblaze thing.
Not tape. I don't trust tape, in my experience tapes fail way more than hard drives. Both the tapes and tape drives wear out quite fast - there is physical contact between the heads and the tape, and the drives make many passes per backup cycle. So you'll actually need a higher number of tapes compared to HDDs which are mostly fine spinning at thousands of rpms for years.
Maybe I've been unfortunate? So far the HDDs seem to do better for: total number of HDDs, number of failures per year vs total number of tapes (or tape drives) and failures per year.
-
Re:I know this isn't what you asked but...
Yeah, if it were me, instead of a RAID10 with exotic hardware, I'd split it across a few cheap servers and run software RAID6 for a more hardware-agnostic approach. Then use something like OpenAFS (which I unfortunately have 0 actual experience with) to make those servers look like one filesystem to clients. That should get you a good bang for the buck, since motherboards and tower chassis that can fit 6 disks and gigabit networking hardware is relatively cheap compared to JBODs and junk.
Lustre and OCFS2 are more suited for homogenous cluster performance, so accessing the data wouldn't be very convenient. At least with OpenAFS you can run clients on Windows and OSX as well. With the cluster FS's I don't think it's even safe to run different kernels on the nodes.
I've read unflattering things about the performance of GlusterFS, even if you do have exotic multi-homed SAN fabrics to run it on. Never heard of subby's last option. I had also tried to get CODA working for the longest time, but it still seems too complicated and experimental compared to *AFS.
On my own home system, for most of the past 8 years I've been running a hybrid software RAID of 4x 250GB disks, with one set of partitions in RAID0 for /tmp, one set in RAID10 for performance, and one set in RAID5 for maximum storage. (And my important dirs rsync'd offsite to a friend's server which I donated hardware towards). This setup has survived about 2 disk failures over the years. The oldest file in my home dir goes back to 1998 or so.
But if he really insists on lots of on-line storage, check out this custom box linked from slashdot a few months back:
http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/
-
Petabytes on a budget: 67 TB 4U Server for <$8k
See here: Backblaze Blog: Petabytes on a budget.
They use JFS on Debian. You can easily add the fileserver software of your choice (Samba, Netatalk, NFS, etc.)
On the hardware side they use an Intel mainboard with an Intel Core 2 CPU, a PCIe SATA 2 controller and 45 SATA 2 discs (each 1.5 TB). They put it in a custom enclosure; the 3D model is available here (25 MB ZIP archive). This all costs less than $8000 for 67 TB (discs included!).
There is also an update, where they get 135 TB for less than $8000. In this model they still use Debian, but as a filesystem they run ext4 on LVM with RAID 6.
-
Use a combination of methods
You should use a combination of methods and be prepared to move your backups to a new place every few years.
For off-site backups I use Backblaze, which is just $3.96 a month for unlimited storage if you buy two years (these are the guys that build the custom half-petabyte servers). I also back up to a removable drive every once in a while that I keep in the office.
Maybe once a year, I pick the very best pictures and print them and build an album. Even cheap photo paper lasts at least 50 years. Archival quality lasts over 100 years if stored properly.
-
backblaze
-
Re:Constant failures?
http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/
Backblaze provides some metrics about their drive failure rates. It's surprisingly low (1-5% per year). If they had 200k drives, they would need to replace 39-192 per week. I'm sure the cluster is built with lots of redundancy that doesn't require a person to immediately replace a failed drive. They'll probably need a full time staff of at least 3 to maintain it.
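The 39-192 range comes straight from the annual failure rate. A minimal sketch of that arithmetic, using the hypothetical 200k-drive fleet from the comment (the function name is made up for illustration):

```python
WEEKS_PER_YEAR = 52


def weekly_replacements(fleet_size: int, annual_failure_rate: float) -> float:
    """Expected drive replacements per week for a fleet at a given annual failure rate."""
    return fleet_size * annual_failure_rate / WEEKS_PER_YEAR


if __name__ == "__main__":
    for rate in (0.01, 0.05):
        per_week = weekly_replacements(200_000, rate)
        print(f"{rate:.0%} per year -> {per_week:.1f} drives per week")
```

This treats failures as spread evenly over the year; infant mortality and ageing drives would make real weeks lumpier than the average suggests.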
-
Re:The AES-128 "crack" requires 2^88 bytes of stor
135TB in a 4U Backblaze storage pod, 280 rack units in a 20' x 8' [... x 8' high?] shipping container, gives 9.5PB, and log2(135 * 8 * 10^12 * 280 / 4) is about 56, so roughly 2^56 bits of raw online storage.
So now you *only* need 4 billion (2^32) shipping containers... yeah right. Stacking them 8 high, with no space for walkways or roads, would cover an area at least 55 miles on each side.
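The container math can be checked with a few lines. This is a sketch under the comment's own assumptions (135TB per 4U pod, 280 rack units per container, a 20' x 8' footprint, stacks of 8); the function names are illustrative. Note that dividing 2^88 by 2^56 treats the requirement as 2^88 bits; if it is 2^88 bytes, as the subject line says, that is 2^91 bits and the container count is eight times higher.

```python
import math

POD_TB = 135            # capacity of one 4U pod, from the comment
RACK_UNITS = 280        # rack units per shipping container, from the comment
POD_HEIGHT_U = 4
FOOTPRINT_SQFT = 20 * 8  # 20' x 8' container footprint
STACK_HEIGHT = 8
FEET_PER_MILE = 5280

PODS_PER_CONTAINER = RACK_UNITS // POD_HEIGHT_U
CONTAINER_BITS = POD_TB * 10**12 * 8 * PODS_PER_CONTAINER


def containers_needed(total_bits: int) -> int:
    """Containers required to hold total_bits, rounded up."""
    return -(-total_bits // CONTAINER_BITS)


def square_side_miles(containers: int) -> float:
    """Side of a square field holding the containers in stacks, with no aisles."""
    footprints = containers / STACK_HEIGHT
    return math.sqrt(footprints * FOOTPRINT_SQFT) / FEET_PER_MILE


if __name__ == "__main__":
    print(f"bits per container: about 2^{math.log2(CONTAINER_BITS):.1f}")
    for label, bits in (("2^88 bits", 2**88), ("2^88 bytes", 2**91)):
        n = containers_needed(bits)
        print(f"{label}: {n:.2e} containers, {square_side_miles(n):.0f} miles per side")
```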
-
Re:Again?
i wish they sold or someone sold the setup sans drives (or just the bare case)
TFA says the case is available from Protocase for $875 in single unit quantities.
A "pod" is just a standard x86 PC in this custom 4U case. Sure, it has a few specific extras, but all are standard, off-the-shelf hardware that you can easily buy. Appendix A in the Backblaze blog post gives every detail you need.
If you start with just 15 hard drives (for a total of 45TB), then the price would be about $3300. You probably only save about $500 by using a standard case, because a decent one with room for 15 or more drives will set you back at least $300.
-
Re:Lots of goofy options
Backblaze, which runs 16PB of disk storage, has tested drives. Here is their recommendation:
We are constantly looking at new hard drives, evaluating them for reliability and power consumption. The Hitachi 3TB drive (Hitachi Deskstar 5K3000 HDS5C3030ALA630) is our current favorite for both its low power demand and astounding reliability. The Western Digital and Seagate equivalents we tested saw much higher rates of popping out of RAID arrays and drive failure. Even the Western Digital Enterprise Hard Drives had the same high failure rates. The Hitachi drives, on the other hand, perform wonderfully.
Newegg has them for $130 http://www.newegg.com/Product/Product.aspx?Item=N82E16822145490R
-
Re:7K for software raid? and why a low end cpu?
No. Hardware controllers are the right solution in this context. These pods are not designed for individual users, but for corporations that can afford stockpiles of spare parts, so replacing a board can be done easily. Using hardware controllers allows many more drives per box, and thus per CPU. A populated 6-CPU motherboard is going to be less reliable, dissipate more heat, and require more memory than the special-purpose hardware approach that allows for a single CPU.
Software RAID makes sense when you have a balance of storage bandwidth requirements to CPU capacity that is heavy on the CPU side. This box is designed for the opposite scenario, as the highly informative blog describes:
http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/
(Yes, I know, expecting someone to read the blog would mean that they would have to read the linked article and then click through to the original post, a veritable impossibility. Still, it is recommended reading, especially the part about their experience with failure rates and how they have *one* guy replacing failed drives *one* day per week.)
-
Re:The price is too high..
You might want to read the actual blog where they explain what they use in a bit of detail. This isn't my area of expertise either, but I do know that running 10 servers is very different from running 100 servers, which is also different from running 1000 servers. There are many questions that crop up that you really don't have to consider when you're down in the smaller arenas. (E.g. patch management - manually patching 10 servers is feasible and more cost effective than having an OTS solution; manually patching 1000 servers, not so much.)
They do also state at the outset:
In this post, we'll share how to make a 2.0 storage pod, and you're welcome to use the design. We'll also share some of our secrets from the last three years of deploying more than 16 petabytes worth of Backblaze storage pods. As before, our hope is that others can benefit from this information and help us refine the pods.
My reading - they definitely know more about this than I do, and they're not too proud to admit there could be lessons they can learn from the community.
-
Re:Feelin' HOT HOT HOT
Something about all those drives being packed in there like hot metal sardines gives me a bad feeling...
apparently it is not an issue as their blogpost says:
We monitor the temperature of every drive in our datacenter through the standard SMART interface, and we’ve observed in the past three years that: 1) hard drives in pods in the top of racks run three degrees warmer on average than pods in the lower shelves; 2) drives in the center of the pod run five degrees warmer than those on the perimeter; 3) pods do not need all six fans—the drives maintain the recommended operating temperature with as few as two fans; and 4) heat doesn’t correlate with drive failure (at least in the ranges seen in storage pods).
-
Original blog post
Here is a link to Backblaze's actual blog entry for the new 135TB pods, and here is the original 67TB pods. The blog article is actually quite fascinating. Apparently they are employee owned, use entirely off-the-shelf parts (except for the case, looks like), and recommend Hitachi drives (Deskstar 5K3000 HDS5C3030ALA630) as having the lowest failure rate of any manufacturer (less than 1% they say).
I found it kinda amusing that ext4's 16TB volume limit was an "issue" for them. Not because it's surprising, but because... well, it's 16TB. The whole blog post is actually recommended reading for anyone looking to build their own data pods like this. It really does a good job showing their personal experience in the field and problems/not problems they have. For instance: apparently heat isn't an issue, as 2 fans are able to keep an entire pod within the recommended temperature (although they actually use 6). It'll be interesting to see what happens as some of their pods get older, as I suspect that their failure rate will get pretty high fairly soon (their oldest drives are currently 4 years old; I expect when they hit 5-6 years failures will start becoming much more common). All in all, pretty cool. Oh, and it shows how much Amazon/Dell price gouges, but that shouldn't really shock anyone. Except the amount. A petabyte for three years is $94,000 with Backblaze, and $2,466,000 with Amazon.
P.S. I suspect they use ext4 over ZFS because ZFS, despite the built in data checks, isn't mature enough for them yet. They mention they used to use JFS before switching to ext4, so I suspect they have done some pretty extensive checking on this.
-
Re:For those that are confused
Using custom hardware, they could store about 120TB for ~$8,000.
I base that on this article, assuming that they use 3 TB drives instead of the 1.5 TB drives they used a few years ago.
http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/
Let's say they put two of these in an ISP; that's 240TB. Netflix streams at about 2GB/hour. That means they could store 120,000 hours of content for ~$16K per ISP. That's not their whole library by far, but I would be willing to bet that's enough to store the top 95% of requested media at least.
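For anyone who wants to vary the assumptions, the cache sizing works out as below. This is a sketch using only the figures in the comment (two boxes of about 120TB at about $8,000 each, 2GB per streamed hour, decimal units); the function name is illustrative and nothing here is a Netflix specification.

```python
GB_PER_TB = 1_000  # decimal units, as drive vendors count them


def cache_hours(boxes: int, tb_per_box: float, gb_per_stream_hour: float) -> float:
    """Hours of video that fit in the cache at a given stream size."""
    return boxes * tb_per_box * GB_PER_TB / gb_per_stream_hour


if __name__ == "__main__":
    boxes, tb_per_box, usd_per_box = 2, 120, 8_000
    hours = cache_hours(boxes, tb_per_box, gb_per_stream_hour=2)
    print(f"{hours:,.0f} hours of content for ${boxes * usd_per_box:,} per ISP")
```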
-
Re:Dangerous
You may remember them from this post on their blog where they provide a detailed description of their technical setup. Their services are excellent, IMO.
-
Re:lack of information
Indeed. The critical thing is almost certainly the backups and network connection. They've presumably already got the software for doing their jobs picked out and everybody knows how to use it, at least partially.
However, it's almost certainly the case that they haven't gotten their backup system in order and finalized the network.
Asking them what they want should guide things along the way. It might be acceptable to use a service like backblaze to handle the backup process or more likely they'll need to keep it in house for reasons related to regulatory requirements. Without knowing more information it's hard to know what sort of advice to give.
-
Re:Short term CD-R
In the modern era with large disks, it makes a lot of sense to keep backups in the cloud, with a second copy that you personally maintain. I'm using this article as an excuse to make backups of all my important discs. And to verify that the backups I do have are properly maintained. So far I have yet to find a bad disc, and most of them are at least 7 years old on no name discs. I'm going to be keeping a spare disc and storing the disc image on a local filesystem which gets backed up to the cloud. Backblaze at the present.
The advantage to doing it that way is that when a new storage medium comes out it's relatively easy to migrate up. And in the meantime, I can use something like PAR or dvdisaster to verify that the discs haven't gone bad and recover them as needed.
-
What do YOU recommend?
The discussion thus far: civvies use CD-R's and businesses use tape.
...HOWEVER...
As a new professional to the field, I am unsure what I should be recommending to my family and friends. CD's and even DVD's aren't bad options, but their size becomes problematic when storing volumes of family photographs and video, in addition to the personal detritus of an online presence: funny photos, music, recipes, chat logs, etc. Tape is noted for its capacity, and longevity under the correct circumstances, but is expensive and susceptible to the same troubles as cassettes. I have also used active hard drives, but have found trying to keep data long-term on a spinning disk is just begging for a head-crash. Flash media is expensive, of limited size, and untested in long-term storage (I have lost most of my data stored on early flash drives).
So, what do I recommend to my family and friends? Should I continue to recommend quality CD's, DVD's, and correct storage procedures? Should I set up a http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/ (with a RAID setup) like service for them and be prepared to transfer files to a new system every 7-10 years? What do I do about changing file types?
-
And if you want a big SSD
And if you do need a big SSD Kingston has had a laptop 512GB SSD out since May with huge performance, and this month Toshiba and Samsung will both step up to compete and bring the price down. We're getting close to retiring mechanical media in the first tier. Intel's research shows failure rates of SSD at 10% that of mechanical media. Google will probably have a whitepaper out in the next six months on this issue too.
This is essential because for server consolidation and VDI the storage bottleneck has become an impassable gate with spinning media. These SSDs are being used in shared storage devices (SANs) to deliver the IOPs required to solve this problem. Because incumbent vendors make millions from each of their racks-of-disks SANs, they're not about to migrate to inexpensive SSD, so you'll see SAN products from startups take the field here. The surest way to get your startup bought by an old-school SAN vendor for $Billions is to put a custom derivative of OpenFiler on a dense rack of these SSDs and dish it up as block storage over the user's choice of FC, iSCSI or Infiniband as well as NFS and SAMBA file based storage. To get the best bang for the buck, adapt the BackBlaze box for SFF SSD drives. Remember to architect for differences in drive bandwidths or you'll build in bottlenecks that will be hard to overcome later and drive business to your competitors with more forethought. Hint: When you're striping in a Commit-on-Write log-based storage architecture it's OK to oversubscribe individual drive bandwidths in your fanout to a certain multiple because the blocking issue is latency, not bandwidth. For extra credit, implement deduplication and back the SSD storage with supercapacitors and/or an immense battery powered write cache RAM for nearly instantaneous reliable write commits.
I should probably file for a patent on that, but I won't. If you want to then let me suggest "aggregation of common architectures to create synergistic fusion catalysts for progress" as a working title.
That leaves the network bandwidth problem to solve, but I guess I can leave that for another post.
-
Here is 67 Terabytes for $7867
Do it yourself. Get 67 terabytes for $7867:
http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/
My answer:
67 terabytes is 67000 Gigabytes.
$7867 / 67000 = about 12 cents per gigabyte.
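The same division as a tiny script, in case you want to plug in other prices or capacities. Only the $7867 and 67 TB figures come from the post; the function name is made up, and decimal gigabytes are assumed.

```python
def cents_per_gb(price_usd: float, capacity_tb: float) -> float:
    """Hardware cost in US cents per decimal gigabyte."""
    return price_usd * 100 / (capacity_tb * 1_000)


if __name__ == "__main__":
    print(f"{cents_per_gb(7867, 67):.1f} cents per gigabyte")
```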
Your mileage may vary.
-
Re:Intentional?
Not really. Google designs and builds their own servers.
The "super expensive storage solutions" are for suckers.
http://news.cnet.com/8301-1001_3-10209580-92.html
These expensive solutions are probably the reason why the analyst mentions saving $1M for each 100TB removed.
With 4U enclosures like backblaze's, you get 90TB for $11K of hardware and $6K (45 disks @ 8WH) of power usage per year.
An IT operator can control dozens of such enclosures, let's say a conservative 2 dozens. So $160K salary / 24 enclosures is $7K.
Add $7K for a full time dev and custom storage management software, and $14K for management (still for 24 enclosures).
That's still about $45K for 90TB all included, exactly 20 times less than the mentioned $1M for 100TB.
http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/
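Summing the line items above, per enclosure and per year, looks like this. It is a sketch of the comment's own estimate: all dollar figures are the commenter's guesses, the names are illustrative, and the operator share is left unrounded, which is why the ratio lands a little above 20 instead of exactly on it.

```python
ENCLOSURES_PER_OPERATOR = 24
CAPACITY_TB = 90

# Per-enclosure costs in USD, as estimated in the comment.
COSTS_USD = {
    "hardware": 11_000,
    "power": 6_000,
    "operator": 160_000 / ENCLOSURES_PER_OPERATOR,  # about $7K after rounding
    "developer_and_software": 7_000,
    "management": 14_000,
}

REFERENCE_USD_PER_TB = 1_000_000 / 100  # the analyst's $1M per 100TB


def total_cost() -> float:
    """All-in cost of one 90TB enclosure under the comment's assumptions."""
    return sum(COSTS_USD.values())


def savings_ratio() -> float:
    """How many times cheaper per TB than the analyst's reference figure."""
    return REFERENCE_USD_PER_TB / (total_cost() / CAPACITY_TB)


if __name__ == "__main__":
    print(f"total: ${total_cost():,.0f} for {CAPACITY_TB}TB")
    print(f"per TB: ${total_cost() / CAPACITY_TB:,.0f} vs ${REFERENCE_USD_PER_TB:,.0f}")
    print(f"ratio: {savings_ratio():.1f}x")
```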
I replaced the 1.5TB disks with Seagate Barracuda XT SATA 6Gb/s 2TB disks at $200 on newegg in this computation.
Seagate's other models built in China have lots of problems that the XT doesn't seem to have.
-
Re:Something like this
Do something like this. Put it in a case / box / cabinet of your own design since you don't need the rackmount capability.
http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/
While that is gorgeous, if he's maxed out his boxes at ten 3TB drives each already, replacing his current setup doesn't gain him much (7TB) and costs an extra $3000 over the cost of two file servers and 20 3TB drives.
Supplementing his current setup that way would also be more expensive. It would be about $4800 to add two more machines as file servers and dump ten more drives in them each - saves roughly $3000 while only being 7TB less space.
Don't get me wrong... if I had the money, I'd go the route you suggest. Far nicer looking and far more elegant.
-
Re:Something like this
Do something like this. Put it in a case / box / cabinet of your own design since you don't need the rackmount capability.
http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/
FTFA:
The small stack of six pods in the rack I’m working on contains just under half a petabyte of storage.
... But obviously the one in the picture is at least seven pods.
-
Re:Something like this
Do something like this. Put it in a case / box / cabinet of your own design since you don't need the rackmount capability.
http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/
If possible use something like ZFS (or btrfs if you feel confident about it) so that you get checksumming data protection.
If you're going to put all your eggs in one basket, you better watch that basket very carefully.
The creators of that kit don't use any kind of redundancy within the box because their custom software stack handles replication (kind of like Google FS / Hadoop FS).