The Amazing $5k Terabyte Array
An anonymous reader writes: "Running out of space on your local disk? How about a Terabyte array for only a few thousand dollars? This article at KCGeek.com shows how to put together 1000 Gigs of hard drive space for the cost of a few desktop computers."
I could rip my entire anime collection for instant access! Rip all my
CDs and still have .9 Terabytes left! Maybe Mirror Usenet! I guess
the simple truth is that now that 100 gig drives are a couple hundred
bucks, we now have the ability to store anything we reasonably could
need (unless you define "Reasonable" as "I need to store DNA Sequences").
Yeah, with 160 gig ATA drives out now,
you can do it with 6 drives vs. 10 drives,
and a lot of motherboards come with onboard
RAID, and if you use software RAID via
win2k or a Volume Manager type app for Linux
it would rock.
Cheap too, at $260 per drive per Pricewatch.
Peace out...
Actually a DNA sequence is only about 3GB for a human - your anime DVDs might take more space, at least until you compress them. Then again, DNA should be fairly trivial to compress highly. Let Z = CA, Y = TG, .....
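A rough sketch of the simplest version of that idea: with only four bases, each one fits in 2 bits, so four bases pack into a byte. This assumes the 3GB figure means about 3 billion bases stored one byte each, and that the sequence contains only A, C, G and T (real data files also carry ambiguity codes, which this ignores). The function names are made up for illustration.

```python
# Sketch: pack DNA bases into 2 bits each (4 bases per byte).
# Assumes the input contains only the letters A, C, G, T.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq: str) -> bytes:
    """Pack a base string into bytes, 4 bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        # Left-align a short final chunk so unpack reads every byte the same way.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


def packed_size_bytes(num_bases: int) -> int:
    """Bytes needed to hold num_bases at 2 bits per base."""
    return (num_bases + 3) // 4
```

Under those assumptions, 3 billion bases come to 750 million bytes before any smarter compression is applied.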
"Computer Science is no more about computers than astronomy is about telescopes."
-E. W. Dijkstra
A terabyte isn't anything special. But it's cool to see someone doing it. I was bored one night. For a mere 36K you could, assuming you already own a Thunder K7 w/ the on-board SCSI plus the needed components, put together yourself some really big storage, using those 181GB Seagate SCSI drives.
;p
U160 and all of it churning at 10,000RPM. For a grand total of a few GB short of 5.5 Terabytes.
But assuming you can afford thirty $1200 drives, you should be able to spring for a nice U160 SCSI RAID card with an external connector.
I couldn't even find a case with enough room for 30 hd's.... and I don't want to even think about cooling.
But I won't have to worry about that. I can't even afford a 9GB SCSI drive at this point.
Computational Madness in a round package.
I've been using these for a long time (6200 dual-port in hardware-mirror, up to the 8-port cards for large disk configs), and they're very fast and reliable. Cheap, too.
$500 for an 8-port 64-bit RAID controller, looking to the host like a single scsi device per logical volume, seems like the best deal available. Along with a motherboard with sufficient slots for gig-e and these cards (easy to get 4 64-bit slots...maybe you can get more with 3-4 buses), and a 4U rackmount case with 16 drive bays, and you can have 4U of rackmount storage for $5k, too.
I've been using setups like this for clients, as well as for private file storage (divx, mp3, backups, etc.), and know of people using them for USENET news servers (one of the most demanding unix apps for reasonably priced hardware).
It goes without saying you want a journaled file system or softupdates when you have disks this size, and ideally you keep them mounted read-only and divided into smaller partitions whenever possible. e2fsck on a 300GB partition with hundreds of open files is painful.
I'm sure some poor fool will do something like this, fill it up with data, then have ONE hard drive go bad, making everything practically useless.
What we need isn't larger hard drive storage (not that it's a bad thing); we need more speed, and a cheap, gigantic & ultrafast tape backup system to back up all the data. Some PC designs that use better cooling methods would be very nice as well.
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
Inspired by Slashdot's earlier story that was nearly identical, and with the help of Peter Ashford from ACCS, we built two servers, both with capacities well over a TB, for around $8000 each. They have the capacity to expand to 3TB if need be.
Story here
As far as performance:
(from my memory)
ext3: about 16MB/sec block write, 45MB/sec block read.
ReiserFS: about 20MB/sec block write, 130MB/sec block read (that's no typo).
XFS: about 30MB/sec block write, 85MB/sec block read.
It seems that file system plays a large role in performance. The arrays are three RAID5 in hardware using Linux software RAID0 on top of the RAID5 arrays to tie them together.
IDE RAID controllers are 3ware Escalade 7810. Write performance can be greatly increased by using 7850 cards that have more cache.
We stuck with XFS. ReiserFS had a big-file bug: creating files over 2GB would basically lock up the computer. XFS in general seemed much more mature; ReiserFS seems more like someone's college thesis project that they never cleaned up to be production grade.
We experimented with different RAID0 stripe sizes. The hardware RAID5 stripe size is fixed at 64k, and there are 7 active disks in each array plus one hot spare. Stripe size tweaking seemed mostly to trade off read for write speed within a certain range of values, with performance tapering off at either extreme (down around 8k stripes, or over 1024k stripes).
We eventually went with 1024k stripes. That is what the benchmarks above reflect. The variance in file system performance could very well be due to interactions with stripe size, but there seemed to be common themes (ReiserFS always read fastest no matter what the stripe size, XFS was always better at writes).
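For anyone trying to picture the layout described above (three hardware RAID5 arrays, each with 7 active disks and one hot spare, striped together with software RAID0), here is a rough capacity sketch. The post doesn't state the drive size, so the 160GB figure below is purely an assumption borrowed from drives mentioned elsewhere in the thread, and the function names are made up.

```python
# Sketch: usable capacity of several RAID5 arrays striped with RAID0.
# drive_gb is an ASSUMPTION; the post does not say which drives were used.


def raid5_data_disks(active_disks: int) -> int:
    """RAID5 spends one disk's worth of space on parity."""
    if active_disks < 3:
        raise ValueError("RAID5 needs at least 3 active disks")
    return active_disks - 1


def striped_raid5_capacity_gb(arrays: int, active_per_array: int, drive_gb: int) -> int:
    """RAID0 across the arrays simply adds their usable capacities."""
    return arrays * raid5_data_disks(active_per_array) * drive_gb


arrays, active, spares = 3, 7, 1
data_disks = arrays * raid5_data_disks(active)      # disks' worth of usable space
physical_disks = arrays * (active + spares)         # disks actually installed
usable_gb = striped_raid5_capacity_gb(arrays, active, drive_gb=160)

print(f"{data_disks} of {physical_disks} disks hold data; "
      f"{usable_gb} GB usable with the assumed 160 GB drives")
```

The hot spares hold no data until a rebuild, so they count toward the disk total but not toward capacity.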
I have been in so many arguments with SCSI zealots on here over this RAID... I wish people would understand what price/performance ratio means. IDE isn't a superior technology, but every now and then, it is the right tool for the job, when price is a goal too.
I've had enough abrasive sigs. Kittens are cute and fuzzy.
Is this any more special than the last time Slashdot announced an amazing terabyte array (here)?
Seriously though.. People's numbers are pretty far off. This can be done for about 3000.. Pricewatch has 160 gig drives for $259.. 10 of these would give you over 1 terabyte in useable space in raid 1.. Or if you just cared about write performance, 6 of them for $1554 would give you a terabyte of useable storage.. another $600 to throw together a cheap pc and cheap ide raid cards.. you get it for under $2500.. big deal.
Lately I'm realizing how awful IDE really is.. I finally got around to throwing 2 36 gig ultra 160 drives on my box with an adaptec scsi card, running ext3 on top of a raid mirror.. more space than I need (I just keep all my mp3s on an IDE raid.. since my dragon motherboard has ide raid built in).. Since I've gone to scsi life has been happy. I can do things while compiling, while vacuuming my db, etc..
Funny how mac used scsi before the rest of us, huh?
"And how can this be? For he is the
Aren't these types of systems more for archiving massive amounts of data than actively working on it? I mean, how much data can a computer actively process anyway? Wouldn't a 100GB drive meet just about any processing demands (genome tracking, video editing, etc)?
Why not use slower but MUCH cheaper offline storage? I really like the design goal of
http://www.dvdchanger.com/
You can easily get 1TB of storage with such a device for less than $1000. True, only one person can access it at a time, but that is only because PowerFile wants to charge more for a so-called "networked version".
In theory, if someone could figure out how to build one of these things, you could throw in two or three CD/DVD drives for access and a 20GB hard drive to buffer images. Boom. Now you have the perfect storage backbone for a house-wide media center. I just wish Linksys or someone would throw a Linux thin server on top of the PowerFile hardware and get me something cheap and network-ready.
- JoeShmoe
-- I wonder which will go down in history as the bigger failure: the War on Drugs or the War on Filesharing
Does anyone out there actually use IDE drives like this? It seems a pretty obvious thing to do.
Paul.
You are lost in a twisty maze of little standards, all different.
Video is the bulkiest thing people would store. How much would people want to save for re-viewing? First you have the time-shifting stuff like TiVo/Replay, perhaps a few tens of hours at most. Then there would be your favorite movies and TV series. As video-phone improves you might be saving some hours of video conversations with friends and relatives. With infinite storage, the constraint becomes the need and the time to view all that stuff. And you'll probably want to spend your time looking at new stuff. So I'd guess most people's real needs would be hundreds to a thousand hours. At 1-2 GB per hour, you're talking about a terabyte or two.
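The estimate is easy to check. A minimal sketch, taking the 1-2 GB per hour figure from the post; reading "hundreds" of hours as 300 is just an assumption for the low end.

```python
# Sketch: storage needed for a personal video library.
# The 1-2 GB/hour rates come from the post; 300 hours is an assumed low end.


def video_storage_tb(hours: float, gb_per_hour: float) -> float:
    """Terabytes needed, using 1 TB = 1000 GB."""
    return hours * gb_per_hour / 1000


low = video_storage_tb(300, 1)     # modest library at the low bitrate
high = video_storage_tb(1000, 2)   # a thousand hours at the high bitrate

print(f"between {low} TB and {high} TB")
```

So the "terabyte or two" figure is the top of the range; a few hundred hours fits in well under a terabyte.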
I don't include the argument that you'd have trouble finding old stuff. Computer software is more clever at organizing things - far better than material storage. A good recent example of this is Apple's "iPhoto", which is much more convenient for organizing thousands of photos than physical albums.
At most, people's genomes differ by 0.1% from each other, and by much less than that if you are relatives. Therefore you'd record only the differences, sort of like several of the MPEG algorithms do.
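A toy sketch of that "store only the differences" idea, assuming a shared reference sequence and assuming the differences are single-base substitutions (real genomes also have insertions and deletions, which this deliberately ignores). The function names are hypothetical.

```python
# Sketch: store a genome as a list of differences from a reference.
# Only substitutions are handled; equal lengths are an assumption of this toy.


def diff_against_reference(reference: str, genome: str) -> list:
    """Return (position, base) pairs where genome differs from reference."""
    if len(reference) != len(genome):
        raise ValueError("this sketch only handles substitutions, so lengths must match")
    return [(i, g) for i, (r, g) in enumerate(zip(reference, genome)) if r != g]


def apply_diff(reference: str, diff: list) -> str:
    """Rebuild the genome from the reference plus the recorded differences."""
    bases = list(reference)
    for pos, base in diff:
        bases[pos] = base
    return "".join(bases)


reference = "ACGTACGTAC"
person = "ACGTTCGTAA"
diff = diff_against_reference(reference, person)
print(diff)
```

If only about one base in a thousand differs, the diff list is a small fraction of the full sequence, which is the whole point.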
I had a similar problem when I bought a house last year. I had a converted garage that I wired for ethernet, and even ran ethernet into the basement. However, I didn't want to install ethernet jacks in the house, as it's about 100 years old, and I didn't want the hassle.
I settled on using 802.11b wireless to communicate between the house and the office. I know all about the security problems (my address is....) but maybe the newer 802.11g or 802.11a might work for you.
I have some workbenches in the basement that are about 4-5 feet off the floor. I'm going to install a file server and leave it on one of these benches.
It's cold and damp down there in the winter. I don't know how well the equipment will take to the humidity. I guess I'll find out!
With RAID5 a single drive can fail without causing data loss.
How do you know WHEN a drive has failed?
With the low end IDE RAID cards your notification comes when the 2nd drive fails......
3ware's website describes an SNMP monitoring utility for Windows, but doesn't specifically mention Linux support. Ditto for Adaptec.
If the raid is done in software, is there a linux program to monitor and notify when a single drive goes down?
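For Linux software RAID (md), one way to spot a dead member is to read /proc/mdstat, where each array's status line ends in something like [UU_U], with an underscore marking a missing or failed device. Here is a rough sketch of a checker built on that; the function name is made up, the parsing is deliberately simple, and the notification step (mail, pager, whatever) is left out. The mdadm tool also has a monitor mode meant for the same job, if it is available on your system.

```python
# Sketch: find degraded Linux md arrays from /proc/mdstat-style text.
# The parsing is simplified and assumes array names of the form mdN.
import re


def degraded_arrays(mdstat_text: str) -> dict:
    """Map array name -> member status string for arrays with a missing member."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status looks like [UU_U]: U = member up, _ = member missing or failed.
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded[current] = status.group(1)
    return degraded


# Typical use on a live system (run from cron and send mail on a non-empty result):
#     with open("/proc/mdstat") as f:
#         print(degraded_arrays(f.read()))
```

Run from cron every few minutes, something like this would at least tell you about the first failure before the second one arrives.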
I've wanted a terabyte of storage since the mid-1970s, when I realized that there were approximately a trillion square meters on the Earth's surface. Store one byte of grayscale image for each square meter and that's a terabyte of data right there.
Of course these days I'd want 3TB so I could store color images.
We (my company) designed a very similar system using a Tyan Tiger200 with dual GHz Cel's etc. The problem is that the drives he lists (the 160GB Maxtors) aren't addressable by the RAID controller he is using (the Promise TX). The Promise card will only address up to 127GB per drive. You have to use an ATA-133 spec controller to get the full capacity out of those drives. We did an array using the TX and WD 120GB 7200RPM drives (with 8MB cache - mmmmmmmmm.....) that flat smokes anything that you can put together with the Maxtor drives. Oh well....
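A per-drive ceiling in this range is usually explained by 28-bit sector addressing (LBA) with 512-byte sectors. Whether that is exactly what limits the Promise card is an assumption on my part; the arithmetic itself is a quick sketch:

```python
# Sketch: largest drive addressable with n-bit LBA and 512-byte sectors.
# That the controller's limit comes from 28-bit LBA is an ASSUMPTION.
SECTOR_BYTES = 512


def lba_limit_bytes(address_bits: int) -> int:
    """Maximum addressable bytes for a given LBA width."""
    return (2 ** address_bits) * SECTOR_BYTES


limit_28 = lba_limit_bytes(28)
limit_48 = lba_limit_bytes(48)
drive_160gb = 160 * 10 ** 9   # a drive sold as "160GB" (decimal gigabytes)

print(f"28-bit LBA: {limit_28 / 10**9:.1f} decimal GB ({limit_28 // 2**30} binary GB)")
print(f"160GB drive fits in 28-bit LBA: {drive_160gb <= limit_28}")
print(f"160GB drive fits in 48-bit LBA: {drive_160gb <= limit_48}")
```

The 28-bit ceiling works out to 128 binary gigabytes, or roughly 137 decimal ones, which is in the same neighborhood as the 127GB quoted above and well short of a 160GB drive.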
> Its only a matter of time 'til video becomes as
> commonplace as MP3's on our drives. 100 Gigs is
> I don't see my appetite for
> disk space slowing down any time soon.
True enough. I disagree with Cmdr Taco's comment:
"we now have the ability to store anything we reasonably could need"
I used to say the same thing a while back, thinking I could never fill a disk. That was a 5M Sider drive for an Apple II...
I just wish the stupid BIOS and drive manufacturers would get their act together on drive limits...
Nobody will ever need more than 500M...
Nobody will ever need more than 2G...
Nobody will ever need more than 8G...
Nobody will ever need more than 32G...
How many times can you shoot yourself in the same foot with the same gun?
> logfiles that don't roll over - ever; online
That is a terrible architecture for storing log files... Makes them very hard to search, modify, ... You'd be better off creating a tool to iterate over a set of files for you. :)
> network backup... I'm sure to figure out a way
> to fill that terabyte.
No problem there.
A terabyte just isn't that much when you start to think of volumetric data, CFD, physics calculations, FEA, ... :-)
Personally, I'd really like to stop seeing all of this spinning media and start seeing solid state stuff with much higher densities...
Frustrates me seeing people talk about 500 terabytes in a test tube. Forget that, just get the stuff working and tell me where to place my order for something I can use.
Storage solution: 1TB RAID5 storage array (prices are from Pricewatch)
Item: Quantity x Price = Subtotal
Intel Celeron 700 MHz w/ Socket 370 MB, UDMA 100, AGP VIDEO 8~64MB shared only, Sound, 56K AMR Modem, 10/100 Network in MidTower case w/Powersupply: 1 x $135.00 = $135.00
Power Magic PCI IDE U/ATA100 RAID Controller w/Cable: 4 x $22.00 = $88.00
Maxtor 4G160J8 5400/133: 8 x $259.00 = $2,072.00
60.0GB EIDE Ultra DMA 5400: 1 x $85.00 = $85.00
Total: $2,380.00
- Mangoless
[a mango-free monkey]
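A quick sanity check of that parts list, as a sketch. The prices and quantities are the ones quoted; treating the Maxtor 4G160J8 as a 160GB drive and putting all eight in one RAID5 set are assumptions, and the shortened part names are mine.

```python
# Sketch: total the quoted parts list and estimate RAID5 cost per usable GB.
# Prices/quantities are from the post; 160 GB per Maxtor drive is assumed.
parts = [
    ("Celeron 700 MHz system", 1, 135.00),
    ("PCI IDE ATA100 RAID controller", 4, 22.00),
    ("Maxtor 4G160J8 drive", 8, 259.00),
    ("60GB EIDE drive", 1, 85.00),
]

total = sum(qty * price for _, qty, price in parts)

raid_drives = 8
drive_gb = 160
# RAID5 gives up one drive's worth of space to parity.
usable_gb = (raid_drives - 1) * drive_gb
cost_per_gb = total / usable_gb

print(f"total ${total:,.2f}, {usable_gb} GB usable, ${cost_per_gb:.3f} per GB")
```

The subtotals do add up to the quoted total, and seven drives' worth of data space clears a terabyte with a little room to spare.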
Get a 3ware Escalade card. In March they'll support 48-bit LBA in the new firmware, and you'll be able to hook up those 160GB monsters in RAID-0 (or RAID-5) with a tenfold increase in performance, without taking up all the PCI slots.
The TX2 is a nice little card, but you can only use 2 drives per board if you want "full speed" (if you use master/secondary, 4 drives will give you the RAID speed of 2 in a stripe), and then you'd have to stripe your RAID-0 drives in software. Instead of wasting PCI slots and using an underperforming card, you pay a couple of bucks more and you get the real thing with full speed and hardware RAID5.
There are a lot of raid benchmarks at storagereview.com as well. IDE raid is so damn cheap.
--- Metamoderating abusive downgraders since my 300th post.
Sam's Club in the area recently had a WD 100GB IDE hard drive on sale for $120 after rebate. 1TB at that rate is ~$1320, plus a few hundred for the motherboard, processor, memory and extra controller cards, and a TB server is within reach of an 18-year-old who saved his paper-route money.
The real question is: how long will it take to listen to all those mp3s? At some point, extra storage just isn't practical because you can't fill it fast enough.