The Amazing $5k Terabyte Array
An anonymous reader writes: "Running out of space on your local disk? How about a Terabyte array for only a few thousand dollars. This article at KCGeek.com shows how to put together 1000 Gigs of hard drive space for the cost of a few desktop computers."
I could rip my entire anime collection for instant access! Rip all my
CDs and still have .9 Terabytes left! Maybe Mirror Usenet! I guess
the simple truth is that now that 100 gig drives are a couple hundred
bucks, we now have the ability to store anything we reasonably could
need (unless you define "Reasonable" as "I need to store DNA Sequences").
It's only a matter of time 'til video becomes as commonplace as MP3s on our drives. 100 Gigs is what... 20 movies? I don't see my appetite for disk space slowing down any time soon.
Hmmm...video; logfiles that don't roll over - ever; online network backup... I'm sure to figure out a way to fill that terabyte. :)
BRENT ROCKWOOD, EST'd 1975
Actually a DNA sequence is only about 3GB for a human - your anime DVDs might take more space, at least until you compress them. Then again, DNA should be fairly trivial to compress highly. Let Z = CA, Y = TG, .....
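To put a number on that: the 3GB figure assumes one byte per base, but there are only four bases, so two bits each is enough even before any real compression. A minimal sketch in Python (the function names are made up for illustration, and it assumes the sequence contains only A, C, G and T):

```python
# Pack a DNA string into 2 bits per base (A, C, G, T only).
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"

def pack(seq):
    """Return the sequence packed four bases to a byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        # Left-align a short final chunk so unpack() can read every byte the same way.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)

def unpack(data, length):
    """Inverse of pack(); length is the original number of bases."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])

def packed_size_bytes(n_bases):
    """Bytes needed for n_bases at 2 bits per base, rounded up."""
    return (n_bases + 3) // 4

print(packed_size_bytes(3_000_000_000))  # bytes for roughly one human genome
```

So a 3-billion-base genome fits in about 750MB with nothing cleverer than a lookup table.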
"Computer Science is no more about computers than astronomy is about telescopes."
-E. W. Dijkstra
"Nobody should ever have need for more than 640 kB of RAM" - Bill Gates
Similarities anyone?
Sig (appended to the end of comments I post, 54 chars)
1 Terabyte = 1024 GB = 1048576 MB
$5,000 / 1048576 is a price of $0.0047 a MB.
Or put another way, $4.88 for a GB.
Now who remembers when hard disks were more than $10 a MB?
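The arithmetic is easy to check. A quick sketch using the $5,000 figure and binary units (1 TB = 1024 GB), as above:

```python
GB_PER_TB = 1024
MB_PER_TB = 1024 * 1024

def cost_per_unit(total_dollars, units):
    """Dollars per unit of storage."""
    return total_dollars / units

per_gb = cost_per_unit(5000, GB_PER_TB)
per_mb = cost_per_unit(5000, MB_PER_TB)
print(f"${per_gb:.2f} per GB, ${per_mb:.4f} per MB")
```

The per-GB figure matches; the per-MB figure comes out nearer $0.0048 if you round instead of truncating.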
Cruise TT
I've been using these for a long time (6200 dual-port in hardware-mirror, up to the 8-port cards for large disk configs), and they're very fast and reliable. Cheap, too.
$500 for an 8-port 64-bit RAID controller, looking to the host like a single scsi device per logical volume, seems like the best deal available. Along with a motherboard with sufficient slots for gig-e and these cards (easy to get 4 64-bit slots...maybe you can get more with 3-4 buses), and a 4U rackmount case with 16 drive bays, and you can have 4U of rackmount storage for $5k, too.
I've been using setups like this for clients, as well as for private file storage (divx, mp3, backups, etc.), and know of people using them for USENET news servers (one of the most demanding unix apps for reasonably priced hardware).
It goes without saying you want a journaled file system or softupdates when you have disks this size, and ideally keep them mounted read-only, and divided into smaller partitions, whenever possible. e2fsck on a 300GB partition with hundreds of open files is painful.
Yes, this is a groovy/geeky/cool solution for under your desk, but at least spend the extra dollars for a SCSI card and tape backup unit. You could fit the whole thing on a few DLT's. You can also keep incremental backups to keep the tape swapping to a minimum.
Check out this article referenced by slashdot on July 20 2001.
The nice thing about this article is that the people building it at SDSC really took extreme care in getting quality components that would work together to build a reliable, solid system, and still didn't spend more than $5K for a terabyte file server. In particular, the tradeoff of disk speed vs. power consumption was extremely insightful.
I built one of these to their spec for my company, and I couldn't be happier. It's worked flawlessly since then. It's not clear if the Escalade boards are still available -- 3ware had said that they were discontinuing them, but they still appear to be for sale.
thad
I love Mondays. On a Monday, anything is possible.
1) "Compress" at a higher rate than the CD uses (I've seen this)
2) Use POV Ray to render Lord of the Rings for the cinema
3) Keep every src and every
4) Set the Linux swap space to be "500Gb" because you've upgraded the Kernel to the new VM stuff and it looks cool
5) Install Windows XP+ in two years time, with Office XP+.
Imagine that "Minimum Reqs: 1TB of available disk space"
It will happen
An Eye for an Eye will make the whole world blind - Gandhi
Inspired by Slashdot's earlier story that was nearly identical, and with the help of Peter Ashford from ACCS, we built two servers, both with capacities well over a TB, for around $8000 each. They have the capacity to expand to 3TB if need be.
Story here
As far as performance:
(from my memory)
EXT3: About 16MB/Sec block write, 45MB/sec block read
ReiserFS: About 20MB/sec block write, 130MB/Sec block read (that's no typo).
XFS: About 30MB/sec block write, 85MB/sec block read.
It seems that file system plays a large role in performance. The arrays are three RAID5 in hardware using Linux software RAID0 on top of the RAID5 arrays to tie them together.
IDE RAID controllers are 3ware Escalade 7810. Write performance can be greatly increased by using 7850 cards that have more cache.
We stuck with XFS. ReiserFS had a bigfile bug: creating files over 2GB would basically lock up the computer. XFS in general seemed much more mature; ReiserFS seems more like someone's college thesis project that they never cleaned up to be production grade.
We experimented with different RAID0 stripe sizes, the hardware RAID5 stripe size is fixed at 64k, there are 7 active disks in each array and one hot spare. Stripe size tweaking seemed to mostly trade off read for write speed, within a certain range of values, with a taper off in performance at either extreme, (down around 8k stripes, or over 1024k stripes)
We eventually went with 1024k stripes. That is what the benchmarks above reflect. The variance in file system performance could very well be due to interactions with stripe size, but there seemed to be common themes (reiser always read fastest no matter what stripe, XFS was always better at writes)
I have been in so many arguments with SCSI zealots on here over this RAID... I wish people would understand what price/performance ratio means. IDE isn't a superior technology, but every now and then, it is the right tool for the job, when price is a goal too.
I've had enough abrasive sigs. Kittens are cute and fuzzy.
Is this any more special than the last time slashdot announced an amazing terabyte array? (Here)
Seriously though.. People's numbers are pretty far off. This can be done for about 3000.. Pricewatch has 160 gig drives for $259.. 10 of these would give you over 1 terabyte in useable space in raid 1.. Or if you just cared about write performance, 6 of them for $1554 would give you a terabyte of useable storage.. another $600 to throw together a cheap pc and cheap ide raid cards.. you get it for under $2500.. big deal.
Lately I'm realizing how awful IDE really is.. I finally got around to throwing 2 36 gig ultra 160 drives on my box with an adaptec scsi card, running ext3 on top of a raid mirror.. more space than I need (I just keep all my mp3s on an IDE raid.. since my dragon motherboard has ide raid built in).. Since I've gone to scsi life has been happy. I can do things while compiling, while vacuuming my db, etc..
Funny how mac used scsi before the rest of us, huh?
"And how can this be? For he is the
Aren't these types of systems more for archiving massive amounts of data than actively working on it? I mean, how much data can a computer actively process anyway? Wouldn't a 100GB drive meet just about any processing demands (genome tracking, video editing, etc)?
Why not use slower but MUCH cheaper offline storage? I really like the design goal of
http://www.dvdchanger.com/
You can easily get 1TB of storage with such a device for less than $1000. True, only one person can access it at a time but that is only because PowerFile wants to charge more for so-called "networked version".
In theory, if someone could figure out how to build one of these things, you could throw in two or three CD/DVD drives for accessing and a 20GB hard drive to buffer images. Boom. Now you have the perfect storage backbone for a house-wide media center. I just wish Linksys or someone would throw a linux thinserver on top of the PowerFile hardware and get me something cheap and network-ready.
- JoeShmoe
.
-- I wonder which will go down in history as the bigger failure: the War on Drugs or the War on Filesharing
In case you didn't notice, it's RAID5. One hard disk could go bad with no issues other than slowdown.
They could also do what we did with our IDE TB. We used three RAID5s in hardware, each with hot swap. In theory, if they failed just right, we could lose up to 6 drives without losing any data.
The three hardware RAID5s are RAID0ed together. The worst case scenario is a simultaneous failure of two drives on the same array. But we saved so much money using IDE that we just built two complete systems for less than the price of SCSI. So really, we would have to hit the worst case scenario twice at nearly the same time to have a total loss. It gets less and less likely.
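The failure rule for RAID5 sets striped together is simple enough to write down. A rough sketch that only models simultaneous failures (the function name and the list-of-array-indices input are made up for illustration):

```python
def survives(failed, arrays=3):
    """failed: one array index per drive that is dead at the same moment."""
    per_array = [0] * arrays
    for a in failed:
        per_array[a] += 1
    # The RAID0 layer has no redundancy of its own, so every RAID5 set must
    # stay alive, and a RAID5 set stays alive with at most one dead member.
    return all(n <= 1 for n in per_array)

print(survives([0, 1, 2]))  # one dead drive in each array
print(survives([0, 0]))     # two dead drives in the same array
```

This doesn't model rebuilds. The "up to 6 drives" figure presumably assumes each array rebuilds onto its hot spare (one per array was mentioned earlier in the thread) before its next failure, which is my reading rather than something stated outright.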
I've had enough abrasive sigs. Kittens are cute and fuzzy.
I figure this is the easiest way to add as you grow without having to break open the case and try to figure out how to add another damn drive in there. For backup, just have two systems with identical capacities and rsync between the two nightly.
RAID is nice, but for home use, it's not as nice as a nightly mirror. Why? I've seen RAID controllers fail and take out an entire RAID set. RAID also doesn't deal with the "Holy shit, I just accidentally typed `rm * ~` instead of `rm *~`" problem.
I believe that Promise makes the SuperTRAK Pro series of ATA RAID cards that support up to 6 drives and RAID 5. I haven't used them personally but they do exist.
I agree that on a server or a professional workstation SCSI is the way to go for speed and reliability. But for the home consumer who wants to work with digital video the cost of a SCSI RAID set up is extremely prohibitive.
FIRE!
Any serious data store needs to include a backup system which allows for copies off-site. Fire is the obvious risk of course, but floods, vandalism and lightning strikes are all possibilities.
AFAIK the only generally available tape backup for something this big is DLT, which IIRC can now do around 40GB per tape before compression. With the 2:1 compression usually quoted that's 80GB per tape, or around 13-14 tapes for a full backup. So you really need about 30 tapes for a double cycle, and maybe more if lots of the data is non-compressible (like movies). But this stuff ain't cheap. DLT drives start at around £1000 and the tapes cost £55 each. So that's around £2500 = $4200 to back this beastie up.
Having said that, the possibility of using hot-swappable IDE drives as backup devices is intriguing. Just point your backup program at /dev/hdx3 or whatever. One big advantage is that if your tape drive gets cooked in the server-room fire you don't have the risk of tapes that can only be read on the drive that wrote them. A Seagate 5400RPM 60GB drive costs £110, which is only a third more per megabyte than a bare DLT tape. Two cycles-worth of backup (34 drives) would be £3,700. And you can probably do better by shopping around. For servers with only a few hundred GB on line this might well be more cost-effective than buying a DLT drive.
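Here is the same sum as a rough sketch, in case anyone wants to plug in their own prices. The tape and drive counts (30 and 34) and the prices are the ones quoted above; the function names are made up:

```python
import math

def tapes_needed(data_gb, tape_gb, compression=2.0):
    """Tapes for one full backup, assuming the quoted compression ratio holds."""
    return math.ceil(data_gb / (tape_gb * compression))

def media_cost(units, unit_price, drive_price=0):
    """Total cost in pounds: media plus an optional tape drive."""
    return drive_price + units * unit_price

full = tapes_needed(1024, 40)         # one full backup of 1TB at 2:1
dlt_total = media_cost(30, 55, 1000)  # 30 tapes plus a DLT drive
ide_total = media_cost(34, 110)       # 34 hot-swap 60GB IDE drives

# Price per native GB: IDE drive versus bare DLT tape.
ratio = (110 / 60) / (55 / 40)

print(full, dlt_total, ide_total, round(ratio, 2))
```

With exact prices the DLT route comes to a little over the "around £2500" estimate, and the drive-to-tape price ratio comes out at about a third more per megabyte, as stated.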
We use Amanda to do backups here. It's a useful program, but it can't back up a partition bigger than a tape. So you need to think carefully about your partition strategy. (Side note: you can use tar rather than dump to break up over-large partitions, but it's still a pain).
Suddenly that terabyte starts looking a bit more expensive.
Paul.
You are lost in a twisty maze of little standards, all different.
Does anyone out there actually use IDE drives like this? It seems a pretty obvious thing to do.
Paul.
You are lost in a twisty maze of little standards, all different.
With tapes, you just get a new drive.
Okay... I'll do the stupid things first, then you shy people follow.
[Zappa]
What do you mean "if you don't need redundancy"? The only RAID level that doesn't offer redundancy is RAID-0. RAID-5 can tolerate single disk failures, and if you do multiple levels of RAID-5 you can tolerate more failures (depending on how you configure it). The common configuration of RAID-5 with available hot-spares is quite sufficient in all but the most critical configurations, especially if it is a system that is closely monitored. Sure, you can build RAID-1 arrays of N drives where you can tolerate up to N-1 drive failures without problems, but for one thing space is used a lot less efficiently, and for another write performance decreases for every extra level of redundancy you add. That is overkill for most situations: the chance that multiple drives will fail simultaneously (or within a few hours of each other) is remote compared to the probability of a single drive failure.
XML is like violence. If it doesn't solve the problem, use more.
Actually, if you did read the article, you would find that the proposed system is built on ATA100 with software RAID5... which means that the last of the 8 160GB drives would be used for parity, and that leaves *ONLY* (7*160GB)/1024 = 1.09375TB! Now, I know that hardware RAID5 is expensive, but just think for a second: you would have a hot-swappable, secure-as-long-as-only-1-hard-drive-fails, personal, massive-and-fast storage system... A dream system :)
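The same calculation as a small sketch, using the 8 x 160GB figures from the article (the function name is made up):

```python
def raid5_usable_gb(drives, drive_gb):
    """RAID5 gives up one drive's worth of space to parity."""
    if drives < 3:
        raise ValueError("RAID5 needs at least 3 drives")
    return (drives - 1) * drive_gb

usable_gb = raid5_usable_gb(8, 160)
usable_tb = usable_gb / 1024
print(usable_gb, usable_tb)
```

Strictly speaking the parity is spread across all the drives rather than living on the last one, but the usable space works out the same.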
I live in Soviet Canuckistan you insensitive clod!
Video is the most bulky storage people would save. How much would people want to save for re-viewing? First you have the time-shifting stuff like TiVo/Replay - perhaps a few tens of hours at most. Then there would be your favorite movies and TV series. As video-phone improves you might be saving some hours of friends' and relatives' video conversations. With infinite storage, the constraint becomes need and time to view all that stuff. And you'll probably be wanting to spend your time looking at new stuff. So I'd guess most people's real needs would be hundreds to a thousand hours. At 1-2 GB per hour, you're talking about a terabyte or two.
I don't include the argument that you'd have trouble finding old stuff. Computer software is more clever at organizing things - far better than material storage. A good recent example of this is Apple's "iPhoto", which is much more convenient for organizing thousands of photos than physical albums.
Ironically, I just built something very similar to this a few weeks ago (it runs great BTW), but I spent <$1500US on all the components. The biggest thing you have to watch out for is the hard drives. I went for the ones with the best bang/buck ratio at the time (Maxtor 80GB 5400RPM drives). This let me build a system with well over 1/2 a Terabyte of usable space at a fraction of the cost. Additionally, the slower drives require less power and less cooling, making them easier to fit in a standard full tower case with a merely beefy (as opposed to server-class) power supply. I think the processor requirements he stated were a little overboard as well. I've found that disk access tends to be limited by the PCI bus (it doesn't help that I used an older motherboard with 33 MHz 32-bit PCI), especially on writes where you can spread data across the write cache on the drives. Be careful when you build an array like this: ATA *hates* having access to both a master and a slave drive at the same time. Be sure to avoid having two disks of the same plex on the same controller. This was natural for me fortunately, since I was building two plexes, a "backup" and a "media" plex.
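For what it's worth, the PCI ceiling is easy to put a number on: peak bandwidth is just clock rate times bus width. A rough sketch (theoretical peaks only, real throughput is lower; the function name is made up):

```python
def pci_peak_mb_s(clock_mhz, width_bits):
    """Peak rate in MB/s = million transfers per second * bytes per transfer."""
    return clock_mhz * width_bits / 8

classic = pci_peak_mb_s(33.33, 32)  # plain 32-bit/33 MHz PCI
wide = pci_peak_mb_s(66.66, 64)     # 64-bit/66 MHz slots on server boards
print(f"32-bit/33MHz: {classic:.0f} MB/s, 64-bit/66MHz: {wide:.0f} MB/s")
```

A handful of modern drives streaming at once will reach the 32-bit/33 MHz figure, since every controller on that bus shares it.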
A final word of warning: Promise ATA100 TX2 controllers may look like a natural choice for a server like this, but they only support UDMA on up to 8 drives at once, and Promise's tech support only supports a maximum of 1 (one!) of their cards in any system.
I read the internet for the articles.
"Draco dormiens nunquam titillandus."
please tell me how you get 6 IDE drives on a pc that gives you any performance in a RAID function...
I don't know how he does it, but I have personal experience in doing it two different ways:
1) 3ware IDE RAID controller, has 1 IDE controller per drive on the card (i.e. 8 ide controllers), which the firmware maps to a RAID Device. Depending on the RAID configuration the drives appear as one large SCSI drive to the system.
Performance is on par with SCSI.
2) External IDE-SCSI Raid chassis. Again, 1 IDE controller per hot-swap drive, appearing to the system as one or more big SCSI drives, controlled by a standard SCSI controller. Speed and reliability have surpassed that of a $60,000 SCSI solution sold by Sun I happen to have lying around.
U160 SCSI drives will give you at least a 70% speed increase and a 80% increase in reliability....
If I had to store a terabyte of information I'd be an idiot to use consumer level storage (IDE).
Nonsense, see above. This is simply SCSI bigotry (I know, I was once a SCSI bigot too). What you say is only true if you are using low end cards, with more than one device on each IDE bus, which is untrue for mid- and high-level IDE-SCSI solutions such as 3ware and various external chassis systems. We run our entire enterprise on one, and have done so for well over a year, with much better reliability and performance than an older, very expensive SCSI solution provided.
But yes, if people are plugging drives into el cheapo IDE "raid" cards like Promise and the like, or worse, into their onboard IDE controllers (most of which are inexpensive knockoffs anyway) then performance will be very suboptimal, and reliability problems (one device taking down the entire IDE bus, etc.) abound.
The Future of Human Evolution: Autonomy
using a tb array for anime is like having one of your turds bronzed.
> He's trying to use software raid, but he has 4
> Promise FastTrak 100TX2 raid controllers. WTF?
> First off, each of those cards supports 4
> drives on 2 channels... Why does he need 4
> cards when he only has 8 drives? He only needs
> 2 cards.
I'm a firmware engineer for Maxtor... if you're going for performance, you want 1 drive on each bus, and you don't want to use the motherboard connectors. With 2 drives on each bus, you are limiting the average transfer rate out of cache to 50% of the max transfer rate. On a modern drive with their 60-65MB/sec channel rates, you cannot stream sequentially off of 2 drives without saturating an ATA-100 cable. Even running ATA-133 won't help starting a year from now.
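To make the saturation point concrete, here is a back-of-the-envelope sketch. The 60-65MB/sec figures are the ones above, and the bus rates are the nominal ATA-100 and ATA-133 numbers, ignoring protocol overhead; the function name is made up:

```python
def bus_saturated(drive_rates_mb_s, bus_mb_s):
    """True if the drives together can stream faster than the cable can carry."""
    return sum(drive_rates_mb_s) > bus_mb_s

print(bus_saturated([60], 100))      # one drive per cable
print(bus_saturated([60, 60], 100))  # master + slave on ATA-100
print(bus_saturated([65, 65], 133))  # master + slave on ATA-133, today's drives
```

On paper ATA-133 still has a sliver of headroom for two current drives, which is why it stops helping as soon as channel rates climb a little further.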
Additionally, every bios I have looked at sucks in terms of performance. In most cases they have small DMA FIFOs which stutter the pipe during high speed transfers -- they literally hang the DMA lines while they empty their fifo into memory, then come back and grab another 8 words or something sad. They also tend to be very poor managers of the IRQ line. This causes delays at times when your hard drive could be giving you more data, but the host hasn't gotten around to asking for it yet.
All the 3rd party cards have like 2Kbyte FIFOs which prevents any overrun from occurring, which alone is quite helpful in high bandwidth applications.
The cards we include with our drives are in the lower end of Promise's spectrum... you can spend more and get more performance if you want to, which is what I suspect the author of the original article did.
--eric
More data, damnit!
I've wanted a terabyte of storage since the mid-1970s, when I realized that there were approximately a trillion square meters on the Earth's surface. Store one byte of grayscale image for each square meter and that's a terabyte of data right there.
Of course these days I'd want 3TB so I could store color images.
> They really really need to design a IDE-II
> specification that gives the SCSI performance
> traits to IDE.
They already have it -- tag command queueing has been in the ATA spec for years, since ATA-5 I think. Most vendors either have command queueing IDE drives, or are coming out with them soon.
http://www.t13.org for more info on the various ATA specifications
--eric
More data, damnit!
> Last time I looked at IDE in any technical
> depth, I only saw four addresses "reserved" for
> IDE controller use. I guess you can have any
> address, but the BIOS couldn't boot off any
> address, it has to know where to look for the
> controller. Predetermined list of 4 seems to
> ring a bell.
There are 4 addresses, but you can only boot off the first 2 in most operating systems. There are ways to get more than 4 up and running to expand to lots of drives, but not sure what OSs it works with.
> Secondly, IDE seems to REALLY hit the brakes
> when you do two independent operations on two
> drives on the same channel (say, a read on
> drive 1 and a write on drive 2).
The issue is that most ATA implementations don't support command queueing, therefore there is no bus release. Each command runs to completion before the bus is released, while the other drive sits idle. Upcoming drives will be implementing queueing and won't have this performance limitation.
> If my 4 controller addresses educated guess is
> right, and performance does crawl, you'd
> probably want to have 4 drives on 4
> controllers, one each.
The secondary port isn't inherently slower than the primary port. However, each port uses a controller address. (0x178 or something for the first, can't remember offhand)
Best performance is achieved with one drive per cable.
> If all the above is correct, this guy is plain
> wrong. He's published, I'm not, I'm willing to
> admit defeat - where am I wrong? Do the raid
> controllers emulate being scsi hosts, run off
> OS drivers (=likely windows ones), etc?
Everything except ATA hard drives is emulated as SCSI. ATAPI (the CDROM protocol) is simply packetized SCSI over an ATA cable. The raid controllers also just use the built-in SCSI layer in the OS.
eric
http://www.t13.org for the real ATA specs if you're curious
More data, damnit!
1 Terabyte solution - $2500
All the pr0n you could ever watch - $1,000,000
The look on your Mom's face when she clicks on AsianDogAssRape10.mpg - Priceless
This
Moore's law has been in effect for some time since then, and the human genome hasn't gotten any bigger in the meantime.
In fact, the EMBL database (all known DNA + protein sequences) nearly tripled in size within the 11 months of Nov. 1999 - Aug. 2000 [Stoesser, 2001]. Shake your Moore's law at that figure, matey.
I once built a similar kind of raid system (half a TB) using the Antec case. Their case is nice, but not for IDE raid. The problem is that the IDE cables need to be within a certain length in order to get DMA 5. The case is designed for scsi, which has a longer cable length limit. Hooking up all the IDE drives in that case is really a pain in the butt.
For IDE raid, this case is good except it's a bit expensive:
http://www.rackmountnet.com/rackmountchassis/rackmountchassis_4ud.htm
It can hold up to 16 drives with hot swappable trays. There should be no cable length problem.
On a side note, I once plugged 5 Promise Ultra100TX2 cards into one computer. All cards were recognized but only 8 drives were recognized correctly (I plugged in 12 drives altogether). I remember seeing somewhere (either in the linux kernel source or the FreeBSD sys source) that Promise has a limit of 12 drives per system, with 8 of them in DMA mode and the remaining 4 in PIO mode with some tweak (burst?). So for a big raid like that, an ide raid card (either 3ware's or HighPoint's) is recommended. Using a hardware raid ide card also has the benefit of being able to hot swap the drives with the case mentioned above.
gd
Storage solution: 1TB RAID5 storage array (Prices are from Pricewatch)
Item: Quantity x Price = Subtotal
Intel Celeron 700 MHz w/ Socket 370 MB, UDMA 100, AGP VIDEO 8~64MB shared only, Sound, 56K AMR Modem, 10/100 Network in MidTower case w/Powersupply: 1 x $135.00 = $135.00
Power Magic PCI IDE U/ATA100 RAID Controller w/Cable: 4 x $22.00 = $88.00
Maxtor 4G160J8 5400/133: 8 x $259.00 = $2,072.00
60.0GB EIDE Ultra DMA 5400: 1 x $85.00 = $85.00
Total: $2,380.00
- Mangoless
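The parts list above adds up as claimed. A quick sketch for anyone who wants to swap in current prices (descriptions shortened, quantities and prices as listed):

```python
# (description, quantity, unit price in dollars)
parts = [
    ("Celeron 700 system in midtower case", 1, 135.00),
    ("Power Magic PCI IDE U/ATA100 RAID controller", 4, 22.00),
    ("Maxtor 4G160J8 5400/133", 8, 259.00),
    ("60.0GB EIDE Ultra DMA 5400", 1, 85.00),
]

def total_cost(items):
    """Sum of quantity * unit price over the whole list."""
    return sum(qty * price for _, qty, price in items)

print(f"Total: ${total_cost(parts):,.2f}")
```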
[a mango-free monkey]
Get a 3ware Escalade card. In March they'll support 48-bit LBA in the new firmware, so you'll be able to hook up those 160GB monsters in raid-0 (or raid-5) with a tenfold increase in performance, without taking up all the PCI slots.
the TX2 is a nice little card, but you can only use 2 drives per board for getting the "full speed" (else if you use master/secondary, 4 drives will give you the raid speed of 2 in stripe) and then you'd have to stripe your raid-0 drives in software. Instead of wasting PCI slots and using an underperforming card, you pay a couple of bucks more and you get the real thing with full speed and hardware raid5.
There are a lot of raid benchmarks at storagereview.com as well. IDE raid is so damn cheap.
--- Metamoderating abusive downgraders since my 300th post.
Ok. This is just inane. Why build this when someone has already done it better for cheaper?
http://www.raidweb.com
We purchase their 8 disk IDE RAID arrays. They are hot swap, support RAID 0, 0+1, 1, 3, 5, and hot spare, have dual failover power supplies, come with 64MB cache, which can be upgraded. Configurable via the EZ front LCD display, or via serial console. They support ATA-100, and ATA-133 coming shortly. Software upgradable, and it runs Linux.
The array (sans disks) runs us $3200. They even have versions that have dual fiber ports out the back.
WARNING - DO NOT purchase these with IBM GXP75 (75GB) disks like we did... we have about 80 of them that failed.