The Amazing $5k Terabyte Array
An anonymous reader writes: "Running out of space on your local disk? How about a Terabyte array for only a few thousand dollars? This article at KCGeek.com shows how to put together 1000 Gigs of hard drive space for the cost of a few desktop computers."
I could rip my entire anime collection for instant access! Rip all my
CDs and still have .9 Terabytes left! Maybe Mirror Usenet! I guess
the simple truth is that now that 100 gig drives are a couple hundred
bucks, we now have the ability to store anything we reasonably could
need (unless you define "Reasonable" as "I need to store DNA Sequences").
I love calculus so much, I want to give it to everyone! Come, get some integration!
I wonder if it has the capacity for any RAID form, or if it already has a RAID system built in.
Kyle "DotCom" Lynch
...I need some cheeze-its...
It's only a matter of time 'til video becomes as commonplace as MP3s on our drives. 100 Gigs is what...20 movies??? I don't see my appetite for disk space slowing down any time soon.
Hmmm...video; logfiles that don't roll over - ever; online network backup... I'm sure to figure out a way to fill that terabyte. :)
BRENT ROCKWOOD, EST'd 1975
Now that just roxxors my boxxors! That's a lot of space! I would still end up filling it with useless junk; my crap tends to expand to fill the HD space. Strange, that...
Yeah, with 160 gig ATA drives out now,
you can do it with 6 drives vs. 10 drives,
and a lot of motherboards come with onboard
RAID, and if you use software RAID via
Win2k or a volume manager type app for Linux
it would rock.
Cheap too, at $260 per drive per Pricewatch.
Peace out...
Actually a DNA sequence is only about 3GB for a human - your anime DVDs might take more space, at least until you compress them. Then again, DNA should be fairly trivial to compress highly. Let Z = CA, Y = TG, .....
"Computer Science is no more about computers than astronomy is about telescopes."
-E. W. Dijkstra
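The compression idea above can be sketched in a few lines. This is only an illustration, not anything the poster wrote: instead of substituting letters for base pairs it packs each base into 2 bits, and the bit assignment in `BASE_TO_BITS` and the function names are assumptions.

```python
# Hypothetical 2-bit code for the four bases; the assignment is arbitrary.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def pack(seq: str) -> bytes:
    """Pack a string of A/C/G/T into 4 bases per byte (last byte zero-padded)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | BASE_TO_BITS[base]
        # Left-align a short final chunk so unpacking reads from the high bits.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Reverse pack(); `length` is the original number of bases."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BITS_TO_BASE[(byte >> shift) & 0b11])
    return "".join(bases[:length])
```

The original length has to be stored alongside the bytes, since the padding in the last byte is indistinguishable from a run of A's.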
human = 3 billion base pairs
= 6 billion bits of data
= 7.5e8 bytes
= 7.3e5 kilobytes
= 715 megabytes
< 1 gigabyte
Sure, lots of other life forms have been sequenced too, but most of these have much smaller genomes than humans.
So how would you need a terabyte to store DNA sequences?
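For anyone who wants to check the chain of conversions above, here is a small sketch. It assumes, as the post does, 2 bits per base and 1024-based kilobytes and megabytes; the variable names are mine.

```python
BASE_PAIRS = 3_000_000_000   # figure quoted in the post
BITS_PER_BASE = 2            # four possible bases -> 2 bits each

bits = BASE_PAIRS * BITS_PER_BASE
size_bytes = bits // 8
size_kb = size_bytes / 1024
size_mb = size_kb / 1024

print(f"{bits:.1e} bits = {size_bytes:.2e} bytes "
      f"= {size_kb:.1e} KB = {size_mb:.0f} MB")
```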
"Nobody should ever have need for more than 640 kB of RAM" - Bill Gates
Similarities, anyone?
Sig (appended to the end of comments I post, 54 chars)
1 Terabyte = 1024 GB = 1048576 MB
$5,000 / 1048576 MB is a price of about $0.0048 per MB.
Or, put another way, $4.88 per GB.
Now who remembers when hard disks were more than $10 a MB?
Cruise TT
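The price-per-unit arithmetic above is easy to reproduce. A minimal sketch, assuming the $5,000 headline price and binary (1024-based) units as in the post:

```python
PRICE_USD = 5000
GB_PER_TB = 1024
MB_PER_TB = 1024 * 1024      # 1048576 MB

usd_per_mb = PRICE_USD / MB_PER_TB
usd_per_gb = PRICE_USD / GB_PER_TB

print(f"${usd_per_mb:.4f} per MB, ${usd_per_gb:.2f} per GB")
```

That works out to a bit under half a cent per MB and a bit under five dollars per GB.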
A terabyte isn't anything special. But it's cool to see someone doing it. I was bored one night. For a mere 36K you could, assuming you already own a Thunder K7 w/ the on-board SCSI plus the needed components, put together some really big storage for yourself, using those 181GB Seagate SCSI drives.
;p
U160 and all of it churning at 10,000RPM. For a grand total of a few GB short of 5.5 Terabytes.
But assuming you can afford thirty $1200 drives, you should be able to spring for a nice U160 SCSI RAID card with an external connector.
I couldn't even find a case with enough room for 30 HDs... and I don't even want to think about cooling.
But I won't have to worry about that. I can't even afford a 9GB SCSI drive at this point.
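A quick sanity check of the totals in this post, as a sketch. The drive count, size and price are the ones quoted; the cost per GB is my own derived figure, and raw capacity is counted with no RAID overhead.

```python
DRIVES = 30
DRIVE_GB = 181
DRIVE_USD = 1200

total_gb = DRIVES * DRIVE_GB        # raw capacity, no redundancy
total_usd = DRIVES * DRIVE_USD      # drives only, no controller or case
usd_per_gb = total_usd / total_gb

print(f"{total_gb} GB for ${total_usd} (${usd_per_gb:.2f}/GB)")
```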
Computational Madness in a round package.
I've been using these for a long time (6200 dual-port in hardware-mirror, up to the 8-port cards for large disk configs), and they're very fast and reliable. Cheap, too.
$500 for an 8-port 64-bit RAID controller, looking to the host like a single scsi device per logical volume, seems like the best deal available. Along with a motherboard with sufficient slots for gig-e and these cards (easy to get 4 64-bit slots...maybe you can get more with 3-4 buses), and a 4U rackmount case with 16 drive bays, and you can have 4U of rackmount storage for $5k, too.
I've been using setups like this for clients, as well as for private file storage (divx, mp3, backups, etc.), and know of people using them for USENET news servers (one of the most demanding unix apps for reasonably priced hardware).
It goes without saying you want a journaled file system or softupdates when you have disks this size, and ideally keep them mounted read-only, and divided into smaller partitions, whenever possible. e2fsck on a 300GB partition with hundreds of open files is painful.
Unless the Human Genome project re-invented the CD...
Yes, this is a groovy/geeky/cool solution for under your desk, but at least spend the extra dollars for a SCSI card and tape backup unit. You could fit the whole thing on a few DLT's. You can also keep incremental backups to keep the tape swapping to a minimum.
Step 1) 1 x Promise 6 channel PCI ide raid controller, 99$US.
Step 2) 12 x Maxtor 160gb ata133, 270$ each.
Step 3) 1920gb of Pr0n and other goodies.
Check out this article referenced by slashdot on July 20 2001.
The nice thing about this article is that the people building it at SDSC really took extreme care in getting quality components that would work together to build a reliable, solid system, and still didn't spend more than $5K for a terabyte file server. In particular, the tradeoff of disk speed vs. power consumption was extremely insightful.
I built one of these to their spec for my company, and I couldn't be happier. It's worked flawlessly since then. It's not clear if the Escalade boards are still available -- 3ware had said that they were discontinuing them, but they still appear to be for sale.
thad
I love Mondays. On a Monday, anything is possible.
I hate to rain on everyone's parade (I really do). But this is just a typical IDE RAID 5 setup with bigger disks. Not exactly slashdot worthy IMHO. If you're thinking about doing something like this, RAID level 5 is not a bad choice if you don't need more than single-disk redundancy. For more RAID info check out:
http://www.acnc.com/04_01_00.html
Oh well... it'll get the Opera users, maybe.
1) "Compress" at a higher rate than the CD uses (I've seen this)
2) Use POV Ray to render Lord of the Rings for the cinema
3) Keep every src and every
4) Set the Linux swap space to be "500Gb" because you've upgraded the Kernel to the new VM stuff and it looks cool
5) Install Windows XP+ in two years time, with Office XP+.
Imagine that "Minimum Reqs: 1TB of available disk space"
It will happen
An Eye for an Eye will make the whole world blind - Gandhi
I'm sure some poor fool will do something like this, fill it up with data, then have ONE hard drive go bad, making everything practically useless.
What we need isn't larger hard drive storage (not that it's a bad thing) we need more speed, and a cheap, gigantic & ultrafast tape backup system to backup all the data. Some PC designs that use better cooling methods would be very nice as well.
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
it's in my head
In fact I remember reading somewhere about a year ago on the linux terminal page about how they put a TB server together for right around 4K. I can't find the link, but if someone does, please post. But grabbing the third largest drive (100GB) out there will save you a bundle and you still only need 10.
Do you change clothes while making the "chee-chee-cha-cha-choh" transformation sound?
That reminds me, I don't know where the hell the tape manufacturers think they're marketing to, but with 80 GB hard drives common now, it's rare to find a tape backup solution that is affordable for a consumer that can handle that much. By affordable I mean drives around $250 and tapes under $10/piece for at least 50GB of storage. I've seen some of the proprietary drives but the tapes cost almost as much as the drive! 5 or 6 years ago the backup drives available to consumers could handle backing up the entire average hard drive of the time onto a $15 tape (Travan), but now people are probably just doing without backups which is a disaster waiting to happen.
pfft, these days people are demanding a terabyte of RAM.
How we know is more important than what we know.
It seems to me that having the ability to throw together a low cost TB storage array while novel, is not that big of a surprise. As most of you probably do, I remember a time when hard drives were a novelty themselves coming in sizes of like 10 MB.
The natural progression of storage space seems to be one where the price is constantly dropping while size is constantly increasing. Is it really going to be all that long before you can buy TB sized storage devices in a single unit?
How about another terabyte array and rdiff? While Joe Average User probably isn't going to be able to afford to do that, he's probably not going to want to build the first one either. If you're a small to medium size company, it'd probably be worth considering. I think by the time you start talking this price tag, you'd be considering some of the mainframe storage companies for DASD and backup though. IBM's 2105 "Shark" machine will go larger than 11TB now, IIRC, and I'm sure the other "big iron" shops have similar solutions.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
Am I the only one around struggling along with a 10 GB hard drive?? What could someone need 100 GB for, someone tell me.
Inspired by Slashdot's earlier story that was nearly identical, and with the help of Peter Ashford from ACCS, we built two servers, both with capacities well over a TB, for around $8000 each. They have the capacity to expand to 3TB if need be.
Story here
As far as performance:
(from my memory)
EXT3: About 16MB/Sec block write, 45MB/sec block read
ReiserFS: About 20MB/sec block write, 130MB/Sec block read (that's no typo).
XFS: About 30MB/sec block write, 85MB/sec block read.
It seems that file system plays a large role in performance. The arrays are three RAID5 in hardware using Linux software RAID0 on top of the RAID5 arrays to tie them together.
IDE RAID controllers are 3ware Escalade 7810. Write performance can be greatly increased by using 7850 cards that have more cache.
We stuck with XFS. ReiserFS had a big-file bug: files created over 2GB would basically lock up the computer. XFS in general seemed much more mature; ReiserFS seems more like someone's college thesis project that they never cleaned up to be production grade.
We experimented with different RAID0 stripe sizes, the hardware RAID5 stripe size is fixed at 64k, there are 7 active disks in each array and one hot spare. Stripe size tweaking seemed to mostly trade off read for write speed, within a certain range of values, with a taper off in performance at either extreme, (down around 8k stripes, or over 1024k stripes)
We eventually went with 1024k stripes. That is what the benchmarks above reflect. The variance in file system performance could very well be due to interactions with stripe size, but there seemed to be common themes (reiser always read fastest no matter what stripe, XFS was always better at writes)
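To make the layout concrete, here is a rough sketch of three RAID5 arrays (7 active disks each) tied together with RAID0. Two things are assumptions on my part: the per-drive size is never stated, so it is a parameter, and the "1024k stripe" is treated as the per-member chunk size with the textbook round-robin mapping. It is not the actual md or 3ware code.

```python
ARRAYS = 3                 # hardware RAID5 arrays striped together
ACTIVE_DISKS = 7           # per array; the hot spare holds no data
CHUNK = 1024 * 1024        # assumed RAID0 chunk size in bytes ("1024k")


def usable_gb(drive_gb: int) -> int:
    """RAID5 gives up one disk's worth of space per array to parity."""
    return ARRAYS * (ACTIVE_DISKS - 1) * drive_gb


def locate(offset: int, members: int = ARRAYS, chunk: int = CHUNK):
    """Map a byte offset on the RAID0 volume to (member array, offset in it)."""
    chunk_no, within = divmod(offset, chunk)
    member = chunk_no % members                    # chunks rotate across arrays
    member_offset = (chunk_no // members) * chunk + within
    return member, member_offset
```

With a hypothetical 80 GB drive, `usable_gb(80)` gives 1440 GB, which is at least consistent with "well over a TB". A large sequential read touches all three arrays once it spans more than a couple of chunks, which is one reason chunk size trades off against access pattern.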
I have been in so many arguments with SCSI zealots on here over this RAID... I wish people would understand what price/performance ratio means. IDE isn't a superior technology, but every now and then, it is the right tool for the job, when price is a goal too.
I've had enough abrasive sigs. Kittens are cute and fuzzy.
Is this any more special than the last time slashdot announced an amazing terabyte array? Here
Seriously though.. People's numbers are pretty far off. This can be done for about 3000.. Pricewatch has 160 gig drives for $259.. 10 of these would give you over 1 terabyte in useable space in raid 1.. Or if you just cared about write performance, 6 of them for $1554 would give you a terabyte of useable storage.. another $600 to throw together a cheap pc and cheap ide raid cards.. you get it for under $2500.. big deal.
Lately I'm realizing how awful IDE really is.. I finally got around to throwing 2 36 gig ultra 160 drives on my box with an adaptec scsi card, running ext3 on top of a raid mirror.. more space than I need (I just keep all my mp3s on an IDE raid.. since my dragon motherboard has ide raid built in).. Since I've gone to scsi life has been happy. I can do things while compiling, while vacuuming my db, etc..
Funny how mac used scsi before the rest of us, huh?
"And how can this be? For he is the
Can users of other browsers give me some feedback here please???
Ta...
The first thing that runs through your mind when you see the above headline is: "Wow, imagine a Beowulf cluster..."
Argh.
And remember kids: Never trust a computer you can actually lift.
Why not snap in a Promise SX6000 for like $250?
This neat piece DOES hardware RAID5, so you don't need a fast cpu&mobo, less RAM, and since it can only manage up to 6 drives you can even have 2 as pseudo hot spare...
The only drawback is the ability of "only" storing 800GB which is nice at this even cheaper price...
We're using rsync over ssh... These IDE TB's were so cheap, we just built two for redundancy. Every night the second one backs up the first one.
We still have a tape robot, but we will only be backing up the most critical of data, our tape robot is only 1.2TB and cost many times what the TB RAIDs did.
I've had enough abrasive sigs. Kittens are cute and fuzzy.
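A nightly mirror along these lines could be driven from a small script. This is a sketch only: the poster does not say which rsync options they use, so the flags (`-a`, `--delete`, `-e ssh`), the host name `backup2` and the paths are all assumptions.

```python
import subprocess


def build_rsync_command(source_dir: str, dest_host: str, dest_dir: str) -> list:
    """Build an rsync-over-ssh command line that mirrors source_dir to dest_host."""
    # Trailing slash on the source means "copy the contents", not the directory itself.
    source = source_dir.rstrip("/") + "/"
    return [
        "rsync",
        "-a",            # archive mode: recurse, keep permissions and times
        "--delete",      # remove files on the mirror that are gone from the source
        "-e", "ssh",     # use ssh as the transport
        source,
        f"{dest_host}:{dest_dir}",
    ]


def run_nightly_mirror() -> None:
    """Run the mirror once; meant to be called from cron. Hypothetical host and paths."""
    subprocess.run(build_rsync_command("/raid", "backup2", "/raid/"), check=True)
```

Note that `--delete` makes the second box an exact mirror, so an accidental deletion on the first box is copied over on the next run. Dropping that flag, or keeping dated snapshots, trades disk space for protection against that.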
Aren't these types of systems more for archiving massive amounts of data than actively working on it? I mean, how much data can a computer actively process anyway? Wouldn't a 100GB drive meet just about any processing demands (genome tracking, video editing, etc)?
Why not use slower but MUCH cheaper offline storage? I really like the design goal of
http://www.dvdchanger.com/
You can easily get 1TB of storage with such a device for less than $1000. True, only one person can access it at a time but that is only because PowerFile wants to charge more for so-called "networked version".
In theory, if someone could figure out how to build one of these things, you could throw in two or three CD/DVD drives for accessing and a 20GB hard drive to buffer images. Boom. Now you have the perfect storage backbone for a house-wide media center. I just wish Linksys or someone would throw a linux thinserver on top of the PowerFile hardware and get me something cheap and network-ready.
- JoeShmoe
.
-- I wonder which will go down in history as the bigger failure: the War on Drugs or the War on Filesharing
Gotta Love slashdot Humor!
Sony makes a few. Here are a couple of links. http://www.sel.sony.com/SEL/corpcomm/news/consumer/625.html
http://63.224.30.17/NASApp/HE/browser/dvd.jsp?GXHC_gx_session_id_=GXLiteSessionID--2660157468645438653&
I think with the price of this unit you could easily get two of them and mirror each other.
If you're going to spend $4K on a DLT drive, spend $8K on a DLT tape library that holds 10 DLT's plus 1 cleaning tape and forget about it. Sure it's only 700Gig of backup but you can always compress.. Otherwise upgrade to a 20 DLT tape library box and call it done.
Do not look at laser with remaining good eye.
Hi all, quick question for the brighter bunch.
Last time I looked at IDE in any technical depth, I only saw four addresses "reserved" for IDE controller use. I guess you can have any address, but the BIOS couldn't boot off any address, it has to know where to look for the controller. Predetermined list of 4 seems to ring a bell.
Secondly, IDE seems to REALLY hit the brakes when you do two independent operations on two drives on the same channel (say, a read on drive 1 and a write on drive 2). If my educated guess of 4 controller addresses is right, and performance does crawl, you'd probably want to have 4 drives on 4 controllers, one each.
If all the above is correct, this guy is plain wrong. He's published, I'm not, I'm willing to admit defeat - where am I wrong? Do the raid controllers emulate being scsi hosts, run off OS drivers (=likely windows ones), etc?
Thanks - inquiring minds.. yadda yadda.
1. SCSI can handle between 15-30 devices on one good controller, *not* including support for multi-disk changers via LUNs. Most IDE can handle 2-4.
2. SCSI drives don't turn into slugs when you access more than one at a time. IDE does. Want to see it REALLY screw up, access an ATAPI CD-ROM slave the same time as a HD Master on the same controller.
Learning HOW to think is more important than learning WHAT to think.
I figure this is the easiest way to add as you grow without having to break open the case and try to figure out how to add another damn drive in there. For backup, just have two systems with identical capacities and rsync between the two nightly.
RAID is nice, but for home use, it's not as nice as a nightly mirror. Why? I've seen RAID controllers fail and take out an entire RAID set. RAID also doesn't deal with the "Holy shit, I just accidentally typed `rm * ~` instead of `rm *~`" problem.
1 Promise 6 channel PCI ide raid controller, 99$US.
12 Maxtor 160gb ata133, 270$ each
1920gb of Pr0n and other goodies, priceless!
A little dated but still contains useful insight.
Linux IDE-RAID Notes
Never ascribe to malice what can be adequately attributed to ignorance. -Napoleon
I can think of plenty of innovations that might start using up your terabyte rather fast.
One interesting idea is a transactional file system which logs all file operations. Great advantage of this would be that you could "roll back" to previous versions of files if you needed them. Couple that with some redundant error-correcting storage and your days of losing files will be a thing of the past.
Another great use is for indexes. Indexes generally speed up computing operations, at the cost of significantly increasing storage needs. Chuck in some more disk space, and you could effectively have a Google on your own hard drive.... no more time consuming searches, just enter "1999 tax avoidance scheme" and get your hits in seconds.
p.s. I guess it should be "indices" for all the grammar nazis out there, but I actually think "indexes" works better here, as it is a technical term.....
FIRE!
Any serious data store needs to include a backup system which allows for copies off-site. Fire is the obvious risk of course, but floods, vandalism and lightning strikes are all possibilities.
AFAIK the only generally available tape backup for something this big is DLT, which IIRC can now do around 40GB per tape before compression. With the 2:1 compression usually quoted that's 80GB per tape, or around 13-14 tapes for a full backup. So you really need about 30 tapes for a double cycle, and maybe more if lots of the data is non-compressible (like movies). But this stuff ain't cheap. DLT drives start at around £1000 and the tapes cost £55 each. So that's around £2500 = $4200 to back this beastie up.
Having said that, the possibility of using hot-swappable IDE drives as backup devices is intriguing. Just point your backup program at /dev/hdx3 or whatever. One big advantage is that if your tape drive gets cooked in the server-room fire you don't have the risk of tapes that can only be read on the drive that wrote them. A Seagate 5400RPM 60GB drive costs £110, which is only a third more per megabyte than a bare DLT tape. Two cycles-worth of backup (34 drives) would be £3,700. And you can probably do better by shopping around. For servers with only a few hundred GB on line this might well be more cost-effective than buying a DLT drive.
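The tape-versus-disk comparison can be worked through as a rough sketch. The prices and capacities are the ones quoted above; treating the array as a round 1000 GB and rounding media counts up are my assumptions, so the totals land near, not exactly on, the posted figures.

```python
import math

DATA_GB = 1000             # assumed: roughly one terabyte to protect
CYCLES = 2                 # two full sets of media

DLT_DRIVE_GBP = 1000
DLT_TAPE_GBP = 55
DLT_NATIVE_GB = 40
DLT_COMPRESSED_GB = 80     # with the usually quoted 2:1 compression

IDE_DRIVE_GBP = 110
IDE_DRIVE_GB = 60


def media_per_cycle(data_gb: int, media_gb: int) -> int:
    """Number of tapes or disks needed for one full backup."""
    return math.ceil(data_gb / media_gb)


tapes = media_per_cycle(DATA_GB, DLT_COMPRESSED_GB)
disks = media_per_cycle(DATA_GB, IDE_DRIVE_GB)

tape_total = DLT_DRIVE_GBP + CYCLES * tapes * DLT_TAPE_GBP
disk_total = CYCLES * disks * IDE_DRIVE_GBP

# Price per GB of a bare disk relative to a bare (uncompressed) tape.
cost_ratio = (IDE_DRIVE_GBP / IDE_DRIVE_GB) / (DLT_TAPE_GBP / DLT_NATIVE_GB)

print(tapes, disks, tape_total, disk_total, round(cost_ratio, 2))
```

If the data is mostly incompressible, swap `DLT_COMPRESSED_GB` for `DLT_NATIVE_GB` in the tape count and the tape side roughly doubles its media cost.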
We use Amanda to do backups here. It's a useful program, but it can't back up a partition bigger than a tape. So you need to think carefully about your partition strategy. (Side note: you can use tar rather than dump to break up over-large partitions, but it's still a pain).
Suddenly that terabyte starts looking a bit more expensive.
Paul.
You are lost in a twisty maze of little standards, all different.
Amen to that. I was looking for a backup solution for my 60 gig server a few weeks ago. Know what the most cost effective solution turned out to be??? Another damn harddrive!
Have a Happy.
As already stated you could mirror the whole system with another array, but also how much do you actually need to back up? If you are putting your DVD collection on it then you already have permanent backups in the form of the original DVDs.
Does anyone out there actually use IDE drives like this? It seems a pretty obvious thing to do.
Paul.
You are lost in a twisty maze of little standards, all different.
With tapes, you just get a new drive.
Okay... I'll do the stupid things first, then you shy people follow.
[Zappa]
FYI, the DNA sequence isn't that big. The National Human Genome Research Institute has their 90% complete draft burned on a single CD.
Why aren't we told when editors moderate our posts?
"Absolutely. And to those who say "Just build another one" / "RAID doesn't need backup", I have only one thing to say:
FIRE!"
Oh sure.... your house is burning down and you're there thinking "Oh no... my P0RN! I should've gone co-lo!"
The more drives you have the more likely the failure?
Sean D.
"Hmm. I am to metaphor cheese as metaphor cheese is to transitive verb crackers!"
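In short, yes. A back-of-the-envelope sketch, assuming drives fail independently and each has the same chance `p` of dying in a given year (the 3% used in the example is a made-up number, not a measured failure rate):

```python
def p_any_failure(n_drives: int, p_single: float) -> float:
    """Probability that at least one of n independent drives fails."""
    return 1.0 - (1.0 - p_single) ** n_drives


if __name__ == "__main__":
    for n in (1, 6, 10, 30):
        print(n, f"{p_any_failure(n, 0.03):.1%}")
```

Without redundancy (plain striping), any single failure loses the whole array, so this is also the chance of losing everything. RAID5 or a mirror changes the question to "two failures before the rebuild finishes", which is much less likely.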
True. But one thing I haven't seen yet is the fact that most backups aren't full backups. You do a full backup maybe once a month or once a year. Every other backup is a diff only. So while the initial backup may take several tapes, the nightly backups shouldn't. At least on the type of system where the data is basically the same from day to day, which was the point of the article.
Plus, as described in the article, where the point was to have a single hard drive based storage for dvd's and cd's, if there was a drive failure, you could just take the original media and do the rip again. Annoying yes, but doable. You haven't lost data unless the fire burned down your house and melted the cd's at the same time it took out your storage. That's why companies buy fire safes and use off-site storage.
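The "diff only" nightly backup boils down to picking the files changed since the last run. A minimal sketch of that selection step, assuming modification time is a good enough change signal (real backup tools such as dump or Amanda track this differently); the function name is hypothetical:

```python
import os


def files_changed_since(root: str, since_timestamp: float) -> list:
    """Return paths under root whose mtime is newer than since_timestamp."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > since_timestamp:
                changed.append(path)
    return sorted(changed)
```

On an archive of ripped CDs and DVDs, where almost nothing changes from day to day, this list stays short, which is the point being made above.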
You can do it for a lot cheaper than $5k. About 8 hard drives at 160 GB will only cost about $2,100; that makes a little over a TB of storage. Four drives can be put into one computer, so you need two basic computers, no need for sound, video, peripherals, or any other extras. They will only be for file storage so you can easily get those two computers for less than $300 apiece. Just hook up the video so that you can install the needed software. So 1 TB of storage should only cost about $2,700. Wow, almost half the price. If you have more to spare, add another four drives and one more computer and you have almost 2 TB for only about $4,140.
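Here is the same costing as a sketch. The per-drive price is derived from the quoted "$2,100 for 8 drives" and the $300 per computer is the post's upper bound; both are taken at face value. With those inputs the 8-drive build matches the post, while the 12-drive build comes out a little under the posted $4,140.

```python
import math

DRIVE_GB = 160
DRIVES_PER_PC = 4
PC_USD = 300                 # "less than $300 apiece", taken as the ceiling
DRIVE_USD = 2100 / 8         # derived from the quoted total for 8 drives


def build(num_drives: int):
    """Return (computers needed, raw capacity in GB, total cost in USD)."""
    pcs = math.ceil(num_drives / DRIVES_PER_PC)
    capacity_gb = num_drives * DRIVE_GB
    cost = num_drives * DRIVE_USD + pcs * PC_USD
    return pcs, capacity_gb, cost


print(build(8))
print(build(12))
```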
Question everything.
Picture this: your home PCs no longer have disk drives. In fact, any device in your house that has a need for data storage (like your VCR/DVD/CD/Game console/Toaster, etc...) has no local storage.
They all connect to this mammoth central storage unit stuck in a closet or down in your basement using either a wireless device or some sort of networking that is built into every home.
You get on the 'net using your PC console and order a video/cd/game and it miraculously shows up on your home storage device ready for use by whatever device is best suited to using it (i.e. use your stereo to listen to music, not your computer and use your game console to play games, not your computer). You use this "product" as if you own it (or maybe rent it?).
The point being you will need massive storage to pull this off - and I do believe this scenario will play out.
Video is the most bulky storage people would save. How much would people want to save for re-viewing? First you have the time-shifting stuff like TiVo/Replay - perhaps a few tens of hours at most. Then there would be your favorite movies and TV series. As video-phone improves you might be saving some hours of friends' and relatives' video conversations. With infinite storage, the constraint becomes need and time to view all that stuff. And you'll probably be wanting to spend your time looking at new stuff. So I'd guess most people's real needs would be hundreds to a thousand hours. At 1-2 GB per hour, you're talking about a terabyte or two.
I don't include the argument that you'd have trouble finding old stuff. Computer software is more clever at organizing things - far better than material storage. A good recent example of this is Apple's "iPhoto", which is much more convenient for organizing thousands of photos than physical albums.
To back up the parent, fire DOES happen. Once upon a time I worked at a small software company in Waltham. I got a phone call one Sat. morning:
"Hey, did you hear about the fire?"
"No, what fire?"
"The office building burned down last night"
It was a shock, to say the least. Sure enough, a squatter had been evicted from the building, so he got revenge by torching the place. We were one of about a dozen businesses - including the local newspaper and an Armed Forces recruiting center - that was in the building. Almost nothing survived.
We had tape backups of all our servers. Guess what - tapes are made of plastic. Plastic melts. We found a few tapes after the fire was put out, but they're not so useful after they've been torched.
Like it or not, fire happens.
Thanks mate.
...this kind of thing was easier. i mean it's great and all: using cheap (as in cost) components, some intelligent construction teams, etc, but in a lot of places (even towns that are SMALLER) this is still impossible.
where I live, I can't run new cables everywhere. I can't even purchase the right to do this, and even if I can get the zoning permits I STILL can't do this.
[and not can as in a question of ability, but can as in the whiny i-don-t-want-to-face-the-consequences]
this is because we have a town charter that specifically keeps us technologically backwards (we didn't start getting streetlights until there was the closest this town has seen to a riot since the civil war) - our own telco office continue to use the same wiring that was installed in the 1950's.
Our local cable company was only allowed to "service" existing cable- and they've been VERY VERY Slowly "servicing it" with fibre. I'm 58 kilofeet from the CO (about 12 more from the cable company) and I still can't get a cable modem [i wouldn't want to - but more on that another time]
an interesting project that happened recently: called network maryland was bid-won to Level 3 networking (gloriously known as crap thanks to the business practices of most of their clients) - they ran fibre through most of maryland and stopped less than 30 miles from my town -- and have no intention of continuing (btw: i'm still quite a ways from the shoreline).
i can't tap into that because it's their bandwidth, and i can't purchase it from them because they're not selling (and don't have to - it's already sold to the government) - which as you'll remember won't let me run my own wires because of stupid charters.
now if i wanted to run cross-LATA cables (outside of town boundaries) i COULD do that- but it wouldn't do me any good because i couldn't bring a line INTO town.
so i'm stuck with a maximum bandwidth purchase of "only a T1 at a time" (and our little cove will run out soon -- there isn't even a full DS3 running into this entire area) from telco, or I can purchase some of the cable company (who actually has less bandwidth than my company) -
or
or
or nothing. we've tried to convince our town "hall" of doing something like this (gigabit ethernet), and we've tried offering to pay for it.
i suppose we could move, but that's a lot of hassle too -- and we'd be giving up our local business.
or we could get some kind of FEDERAL responsibility -- get the whole fucking nation up.
[at this point: you should realize that i, like many of my peers like to pretend that only the US matters... in fact: i actually hate it here, but that should only be so apparent]
so what do we do? how can we sell "this" to our town? how do we get our bureaucratic slugs to take something like this.
more importantly, how was this sold in NZ? and how was it sold in other places it was used?
This would be great for a home file server. Many new homes are being built pre-wired with CAT5 (alas not my old house). Just add a big file server in the basement. With proper wiring, it can act as an answering machine / PBX, personal video recorder, music (MP3) repository, mail server, file server, etc. With RAID, you have less worries about a drive crash wiping you out (though you'll need a disaster recovery plan - flooded basements would be real bad). I've always wanted to do this! Main stumbling block is getting CAT5 wiring from the second floor (where my computers reside) to the basement.
[Insert pithy quote here]
I once heard a rumor that Sony created a tape backup system using Beta tapes that held 100+ terabytes per tape.
This was in the late 80's or early 90's, and at the time that was just too silly a number, so the technology was abandoned.
drawback: read times
At the most, people's genomes differ by 0.1% from each other - much less than that if you are relatives. Therefore you'd record just the differences, sort of like several of the MPEG algorithms do.
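Recording only the differences might look something like the sketch below. It is deliberately simplified: it assumes the two sequences have the same length and differ only by single-base substitutions, which real genomes do not (insertions and deletions would need a proper diff format). The function names are hypothetical.

```python
def diff_against_reference(reference: str, sample: str) -> list:
    """List (position, base) pairs where sample differs from reference."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles equal-length sequences")
    return [(i, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


def apply_diff(reference: str, diff: list) -> str:
    """Rebuild the sample from the reference plus its recorded differences."""
    bases = list(reference)
    for position, base in diff:
        bases[position] = base
    return "".join(bases)
```

At 0.1% of roughly 3 billion bases, that is on the order of 3 million recorded positions per person, on top of one shared reference copy.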
Ironically, I just built something very similar to this a few weeks ago (it runs great BTW), but I spent <$1500US on all the components. The biggest thing you have to watch out for is the hard drives. I went for the ones with the best bang/buck ratio at the time (Maxtor 80GB 5400RPM drives). This let me build a system with well over 1/2 a Terabyte of usable space at a fraction of the cost. Additionally, the slower drives require less power and less cooling, making them easier to fit in a standard full tower case with a merely beefy (as opposed to server-class) power supply. I think the processor requirements he stated were a little overboard as well. I've found that disk access tends to be limited by the PCI bus (it doesn't help that I used an older motherboard with 33 MHz 32-bit PCI), especially on writes where you can spread data across the write cache on the drives. Be careful when you build an array like this: ATA *hates* having access to both a master and a slave drive at the same time. Be sure to avoid having two disks on the same plex on the same controller. This was natural for me, fortunately, since I was building two plexes, a "backup" and a "media" plex.
A final word of warning: Promise ATA100 TX2 controllers may look like a natural choice for a server like this, but they only support UDMA on up to 8 drives at once, and Promise's tech support only supports a maximum of 1 (one!) of their cards in any system.
I read the internet for the articles.
"Draco dormiens nunquam titillandus."
I doubt that video will ever become as commonplace as music, for the simple reason that I can listen to music while doing something else, on the computer or in the room/house/proximity.
With video, I sorta have to pay attention to the moving pictures, and that keeps me from getting other things done.
The REAL jabber has the user id: 13196
What you do today will cost you a day of your life
Actually, I assembled a 600 gig storage device using the aforementioned 3ware controller.
First, there were hardware bugs and they recalled the controller
Second, 3ware dropped the product line, but vendors were still telling me it was available.
Third, they brought it back, and I had to get a drop ship
I lost about 3 months on design phase due to this little tidbit.
Now don't get me wrong, it's working now and seems reliable... but... there's always this nagging suspicion that something is going to go wrong and I'll lose all that data.
Is there some way to stripe actual systems? Rather than having four or five drives striped in a single system, could I have four or five systems each with one drive, but mounted as one physical volume? How could that be configured?
While this would be a little more expensive, it would be much more fault tolerant. Nothing short of the switch breaking or two systems breaking simultaneously could bring down the system.
I would be afraid to put a system like the one described in the article in use for fear that a power supply or RAM chip would go bad, and all of my data would be inaccessable until I could replace it.
Actually DNA sequence files are not all that large. At least not what most scientists actually deal with on a regular basis. Most of the time you are just dealing with fairly small 500-900 base pair segments, since that is what you can get reliable sequence data back on. Of course sometimes you need to probe this sequence against others' findings, via some search engine such as BLAST or the like. If you are talking about whole genome sequences, most still are not that large. Considering that most of the genomes sequenced are of bacteria and archaea, this isn't that hard to see. E. coli for instance has a genome size of roughly 4.6 million bases and this is fairly average for bacteria, at least within the same order of magnitude. Currently there are 74 completed genome sequences listed on NCBI's Genome page. So, while this is a substantial amount of biological data, it doesn't amount to that many megabytes of data with respect to a Terabyte storage system. At least not today.... with more eukaryotic organism sequences being completed the size will of course jump dramatically.
Can ANYONE actually find this quote??? I have never seen any actual evidence of Billy saying this. Gates has denied this, and say what you want about his business practices, but he has always been a smart guy. Surely he would never think that 640K would always suffice.
Bashing Gates/Microsoft = +5 funny
Acknowledging the truth = -5 troll
sounds about right
Get rid of it.
everybody likes to complain about how expensive macs are, so apple decided to skip the scsi. my dell laptop has gone through two hard drives and a battery in two years. rarely does a piece of apple hardware fail to perform decently, and powerbook batteries never seem to die (within reason of course). 2c.
extend that to safe-as-long-as-only-one-hd-fails-and-you-never-e
Always remember: data that is not backed up might as well not be there in the first place!
The illegal we do immediately. The unconstitutional takes a little longer.
--Henry Kissinger
i prefer them in the rack in the living room.
With Raid5 a single drive can fail without causing dataloss.
How do you know WHEN a drive has failed?
With the low end IDE RAID cards your notification comes when the 2nd drive fails......
3Ware's website describes a SNMP monitoring utility for windows, but didn't specifically mention Linux support. Ditto for Adaptec.
If the raid is done in software, is there a linux program to monitor and notify when a single drive goes down?
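For Linux software RAID (md), the kernel reports array state in /proc/mdstat, where a status string such as [UU_] marks a missing member with an underscore, and mdadm also has a monitor mode that can send mail. A minimal sketch of a check that could be run from cron, assuming that mdstat format; the function name and the sample text are made up for illustration, and the notification step is left out:

```python
import re

def degraded_arrays(mdstat_text):
    """Return names of md arrays whose member-status string contains '_'."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status strings look like [UUU] (healthy) or [UU_] (one member down).
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
    return degraded

# Hypothetical sample; on a real system read /proc/mdstat instead.
SAMPLE = """\
Personalities : [raid5]
md0 : active raid5 hdg1[2] hde1[1] hda1[0]
      160086272 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
md1 : active raid5 hdk1[1] hdi1[0]
      160086272 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

Anything returned by the function is an array running without full redundancy, which is the moment you want the email, not when the second drive goes.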
I was curious if there was onboard video on the motherboard, or AGP, or if he was going headless. So I STFW for "Asus A7B266-D Motherboard Specs" and narrowed further and further, realising that "A7B266" is not out there. I headed over to Asus's MB section, where that model number seems to look correct but I can't find a match. I'm assuming it has integrated n-force. Just thought it was a little odd to not be able to find this board. I'm sure if I searched harder I could find it at something like Pricewatch, but I wonder why it's not featured at Asus's site?
I would like some milk from the milkman's wife's tits
Two things... "off site" and "fire proof safe". Even Interpath wasn't stupid enough to keep all the data in one place. There were three sets of tapes. One set (two months old) were off site (at someone's house in a fire box.) One set (last month's full backups) were in the fire proof safe in the basement. And the current backup set was at my desk in two fire boxes. When the alarm goes off, you grab those two boxes. (At the time, someone was always there.)
instead of the 4 TX100's go with Promises new ATA100 6 channel IDE controller SuperTrak SX6000 which will chop about 300 bucks off your final price..
- what is the definition of simultanagnosia?! I've been meaning to look it up!
please tell me how you get 6 IDE drives on a pc that gives you any performance in a RAID function...
I don't know how he does it, but I have personal experience in doing it two different ways:
1) 3ware IDE RAID controller, has 1 IDE controller per drive on the card (i.e. 8 ide controllers), which the firmware maps to a RAID Device. Depending on the RAID configuration the drives appear as one large SCSI drive to the system.
Performance is on par with SCSI.
2) External IDE-SCSI Raid chassis. Again, 1 IDE controller per hot-swap drive, appearing to the system as one or more big SCSI drives, controlled by a standard SCSI controller. Speed and reliability have surpassed that of a $60,000 SCSI solution sold by Sun I happen to have lying around.
U160 SCSI drives will give you at least a 70% speed increase and a 80% increase in reliability....
If I had to store a terabyte of information I'd be an idiot to use consumer level storage (IDE).
Nonsense, see above. This is simply SCSI bigotry (I know, I was once a SCSI bigot too). What you say is only true if you are using low end cards, with more than one device on each IDE bus, which is untrue for mid- and high-level IDE-SCSI solutions such as 3ware and various external chassis systems. We run our entire enterprise on one, and have done so for well over a year, with much better reliability and performance than an older, very expensive SCSI solution provided.
But yes, if people are plugging drives into el cheapo IDE "raid" cards like Promise and the like, or worse, into their onboard IDE controllers (most of which are inexpensive knockoffs anyway) then performance will be very suboptimal, and reliability problems (one device taking down the entire IDE bus, etc.) abound.
The Future of Human Evolution: Autonomy
It seems to be just an urban legend.
AFAIK the only generally available tape backup for something this big is DLT, which IIRC can now do around 40GB per tape before compression.
Um...AIT. As fast or faster than DLT, same storage volume per media unit as DLT, media is cheaper than DLT, and because the media is smaller than DLT's, libraries/autoloaders tend to be less expensive than their DLT counterparts.
"They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety" - BF
I think a good hardware controller, like the 3ware
escalade will perform better than the 4 promise
cards in the box. You probably don't need such a
high powered CPU then. Not that 1 GHz is much
anymore.
How's this for a twisted scenario -- we built a terabyte server (actually a 0.8 terabyte server) to use as "backup" for CDROMs we send to clients! It's data that becomes useless over time, so it's discarded 6 months to a year in the future.
Absurdity: A statement or belief manifestly inconsistent with one's own opinion. -- Ambrose Bierce
As seen on the Simpsons:
...get it (ha)rd(ha)r(ha)r
Take the integral of 3d(r^2) (where d is a constant)
The answer is d(r^3) or rdrr
Store more sequences.
60 million sequences for a nice datamining job, and you need a really BIG raid.
Don't answer me. Moderate. Slashdot is about moderation, not discussion.
I use rsync between two large filesystems across a 4 block 802.11b link.
There is still a possibility of an intruder deleting both, but highly unlikely because one of the backups is a laptop that doesn't share resources.
[I still need a versioning filesystem, like VMS though.]
If both buildings are consumed in an explosion, then data recovery will be the least of my worries.
Joe
Joe Batt Solid Design
Let me see... they are telling us that if we put together enough off-the-shelf hard drives with off-the-shelf RAID controllers, we get a terabyte. Maybe it's a math question? I don't see the "innovation."
I guess it has a "wow" factor, but as far as innovation or news? Nah. If you can show me how I can put together a 1TB raid10 array of 10k SCSI drives for under $5k, now that's interesting!
Where would you use it? In a business environment? Maybe for file storage, because the IDE drives are useless for database or other heavy work. In the home? If you have a separate room just for the setup, otherwise the hard drive noise and especially the cooling fans will drive you crazy. Anybody who's been around a real drive array knows what types of fans it takes to cool down an array, and how loud they are!
...what's the performance like? Please, don't compare it to PC-class performance; compare it to real terabyte-level storage systems. How many hosts (of how many types) can you connect to it concurrently? What about instant snapshots, transparent replication, LUN masking, path management? Does it do background memory/disk scrubbing to catch errors before they affect anything outside the box? Can it handle online component upgrades? What kind of service and support comes with it?
There's nothing wrong with a cheap terabyte. Just don't think it's really enough to put EMC/IBM/Hitachi/etc. out of business.
How about the following config? 1.28TB @ $6000
Asus A7M266-D Dual Athlon MB ($.3k)
AMD MP *2 ($.4k)
2GB RAM ($.8k)
Adaptec 29160LP U160 SCSI card ($.3k)
Promise UltraTrak TX8 ($1.7k)
8* Maxtor Diamondmax 160GB drive ($0.3k each)
Gigabit Ethernet Network card ($0.1K)
Others (.5k)
Backup (.3k)
1.28 TB with hotswap etc with respectable performance.
8 160GB drives (1120GB of storage) - $1900
Motherboard with Duron 800MHz CPU and onboard 100BaseTX ethernet - $100
2 512MB SDRAM - $150
2 Promise FastTrak/66 IDE RAID controllers $110
Full tower ATX case - $110
Total price: $2370
(Prices are from stores on www.pricewatch.com, rounded up to include shipping.)
You can stuff 8 60 gb disks into an antec server case. With a pair of 1600 XP processors, the total cost is 2 promise cards = $50, 8 drives = $720, 2 xp processors = $220, mobo = $220, memory = $200, case = $150, total is about $1500 for .5 tb and $3000 for the full tb. Further, you have a bit more i/o bandwidth with 6 ide controllers, and 2 pci busses than with the single. Also when one of them craps out, the other is still going in all probability. Going to 80 gb drives gives you about the same cost per gb of drive space and lets you put .6 tb into a case. When you are paying for floor space and cooling, the 160 gb drives make sense, but when you are running these in your basement, going for two boxes makes it a cheaper and more robust solution.
So you have a backup tape. If everything was torched in a fire, where are you going to load that tape to???
I really hate when people jump all over the backup to tape and not really look at offsite locations to load the tape information. This is true backup for businesses and guess what it is not cheap.
Also, RAID isn't fault tolerant to human or software error (rm -rf /).
With 120 gig drives, your total cost for a 1 TB array would be about $2500. With 4 IDE ports and a large enough case, you could get all that into one box, then network the beastie.
Now I just need to find $2500. I know I won't have a problem filling it.
-Restil
Play with my webcams and lights here
For those that choose to go the "fire proof box" route, please be careful that you buy a unit that's certified to protect media. A fireproof box that will protect papers from catching fire isn't necessarily sufficient to keep tapes and disks from being destroyed by the heat. Make sure you buy one that's appropriate for your intended contents.
Easy way to turn facts into falsehoods: just put whatever you don't want people believing on an urban legends site.
using a tb array for anime is like having one of your turds bronzed.
Stop whining!
I've wanted a terabyte of storage since the mid-1970s, when I realized that there were approximately a trillion square meters on the Earth's surface. Store one byte of grayscale image for each square meter and that's a terabyte of data right there.
Of course these days I'd want 3TB so I could store color images.
The other problems with your scheme are:
It's not a bad idea, but certainly not something that can be done for $5k. I'd think there must be a breakpoint somewhere where it makes sense to build stuff in multiple machines (instead of cramming tons of disks into a single machine), but I think it's not at 1 disk/machine.
How much uptime you need is purely dependent on you. Since my array is for personal use, I don't mind a bit of downtime when a component fails (since I'm working on the problem myself anyway, it's not like I'd get much use out of it when it was partially down!). If you really really need multi-9 uptime, $5k IDE storage solutions really aren't the way to go.
I read the internet for the articles.
But can you find any source citing him saying that? I've believed it to be true for a long time too, but there's no proof of it anywhere. I want to believe it, but you can't say it's a fact without proof to back it up.
Linear Tape Open tapes do 100GB uncompressed right now, and the roadmap has them going up to 1.6 TB by generation 4. SuperDLT is also out there.
Two words...
9
11
gotta CoLo a terabyte
Not trying to be a database snob, but anyone can build a piece of hardware. It takes a lot of diligent work and planning to implement a data warehouse.
Remember, most of the breathless prose about the huge, enormous, gigantic, [favorite-bigness-adjective] amount of information in DNA was written years ago, by biologists. Moore's law has been in effect for some time since then, and the human genome hasn't gotten any bigger in the meantime.
To a Lisp hacker, XML is S-expressions in drag.
Unless Taco is storing DNA sequences from aliens, I don't know what he's talking about. I downloaded the human genome project last year and if I remember correctly it was definitely under a gigabyte.
I totally agree. In fact, I've just been researching backup solutions. A summary of my findings so far:
/tape, $1400, 3:45 to write. Tapes are $55 each.
DDS-4 - 5 tape changer, 20 GB / tape = 100 GB for $2500. 9+ hours to write 100 GB. Tapes are $10 each.
DLT - 40 GB
Super DLT - 110 GB / tape, $4700, 2:46 to write. Tapes are $115.
LTO (Ultrium) - 100 GB / tape, $3500, 1:51 to write. Tapes are $100.
All sizes are native, uncompressed, and times given assume no compression, so if the data set is compressed 2:1, then capacity doubles, as does throughput, and write time doesn't change.
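A rough sketch of that arithmetic, using the LTO line from the list: the native rate is derived from the listed capacity and full-tape write time, and the 2:1 ratio is just the assumption stated above, not a measured figure. The function names are made up for illustration.

```python
def native_rate_mb_s(capacity_gb, hours, minutes):
    """Native throughput implied by a full-tape write time (1 GB = 1000 MB)."""
    seconds = hours * 3600 + minutes * 60
    return capacity_gb * 1000 / seconds

def with_compression(capacity_gb, rate_mb_s, ratio):
    """Effective capacity (GB), rate (MB/s) and full-tape write time (hours)."""
    eff_capacity = capacity_gb * ratio
    eff_rate = rate_mb_s * ratio
    hours = eff_capacity * 1000 / eff_rate / 3600
    return eff_capacity, eff_rate, hours

if __name__ == "__main__":
    rate = native_rate_mb_s(100, 1, 51)      # LTO: 100 GB in 1:51
    print(round(rate, 1))                    # native MB/s
    print(with_compression(100, rate, 2.0))  # doubled capacity and rate, same hours
```

Both capacity and rate scale by the same ratio, so the time to fill a tape cancels out; the time to back up a fixed amount of data, on the other hand, would halve.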
1 Terabyte solution - $2500
All the pr0n you could ever watch - $1,000,000
The look on your Mom's face when she clicks on AsianDogAssRape10.mpg - Priceless
[I still need a versioning filesystem, like VMS though.]
I hate to say it, but SCO (yes, SCO) had a versioning filesystem in OSR5. HTFS (High Throughput File System) had versioning support.
Fascism starts when the efficiency of the government becomes more important than the rights of the people.
Yea cause we all know only really cool guys like anime.
Honestly if anime is not the gayest of gayest cartoons I don't know what is.
You've just stumbled across one of the main concepts behind the Storage Area Network [snia.org]. The biggest problem you have is bandwidth.
Dude, that's why most SANs are made out of Fibre Channel. FC is a 1Gb/s transport that has a SCSI protocol on top (FCP-SCSI). 2Gb/s Fibre Channel is available, and work is currently under way on 10Gb/s. In addition, FC is full duplex.
Fascism starts when the efficiency of the government becomes more important than the rights of the people.
Yeah, it sucks. I'm gonna have to wait until 2005 before I can map my family's DNA and search for anomalies.
Shit. I forgot to put the closing tag after the first "Fibre Channel". Next time I'll remember to use the "preview" button!
Fascism starts when the efficiency of the government becomes more important than the rights of the people.
We (my company) designed a very similar system using a Tyan Tiger200 with dual GHz Cel's etc. The problem is that the drives he lists (the 160GB Maxtors) aren't addressable by the RAID controller he is using (the Promise TX). The Promise card will only address up to 127GB per drive. You have to use a ATA-133 spec controller to get the full capacity out of those drives. We did an array using the TX and WD 120GB 7200RPM drives (with 8MB cache - mmmmmmmmm.....) that flat smokes anything that you can put together with the Maxtor drives. Oh well....
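The ceiling described here is consistent with 28-bit LBA addressing of 512-byte sectors. That cause is my assumption (the post does not name it), but the arithmetic is easy to check:

```python
SECTOR_BYTES = 512  # standard ATA sector size

def lba_limit_bytes(address_bits):
    """Largest capacity addressable with the given number of LBA bits."""
    return (2 ** address_bits) * SECTOR_BYTES

limit28 = lba_limit_bytes(28)
print(limit28)            # bytes
print(limit28 / 2**30)    # in binary gigabytes
print(limit28 / 10**9)    # in decimal gigabytes, as drive vendors count
print(160 * 10**9 > limit28)  # does a 160GB drive exceed the limit?
```

A controller stuck at 28 bits would see only the first 128 binary gigabytes of a 160GB drive, which is in line with the roughly 127GB reported above.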
I think Usenet is underestimated here. I remember reading on the site of one of the larger ISPs, specialized in good usenet access (ie. 30000+ groups & week+ retention even on binaries groups) that they have significantly more than 1 TB of storage space (don't remember how much, but several TB). So mirroring Usenet might be a tight fit.
beauty is only a light switch away
That's our way to read DNA. The human body can read and start duplicating an entire strand of DNA almost instantaneously. That will beat the hell out of your 15000RPM SCSI drive :).
Yeah, when you spend 250,000 for a TB of disk your backup problems magically disappear!
Seriously, I compiled a list of reasons to back up a RAID storage solution:
-- Accidental deletes and over-writes.
-- Total catastrophic recovery. (Fire/Flood/War, etc).
So it *may* not be the solution for critical data that needs to be shadowed off-site, nor the only location for difficult to reproduce data.
The IDE disk arrays discussed by other posters use a single controller per disk, so the arguments about multi-disk controller screwups don't apply unless you've built a bad IDE array.
I could build a bad SCSI array if I used wrong design principles.
i believe that there is more problem in performance rather than capacity.
a typical configuration that cheap will use an ide hdd (and to make it cheaper software raid).
the main problem (for us in this case) is the performance. how do you increase the data transfer? for the past few years, the storage space has increased tremendously but the transfer rate of the drives are out of proportion with the space.
ide is usually placed in a 33mhz/32bit bus which will give a burst transfer of 133mbyte/sec. that is the max whatever you do. but if you will place a nic card, they will share the bandwidth unless it is placed in a different bus.
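The 133mbyte/sec figure is just clock rate times bus width. A small sketch of that calculation; the 64-bit/66MHz row and the 8-device split are added only for comparison and are not from the post:

```python
def pci_peak_mb_s(clock_mhz, width_bits):
    """Theoretical burst peak: one transfer of width_bits per clock cycle."""
    return clock_mhz * width_bits / 8

def per_device_share(bus_mb_s, devices):
    """Even split of the bus among devices that are all busy at once."""
    return bus_mb_s / devices

print(pci_peak_mb_s(33.33, 32))   # plain 32-bit/33MHz PCI
print(pci_peak_mb_s(66.66, 64))   # 64-bit/66MHz PCI, for comparison
print(per_device_share(pci_peak_mb_s(33.33, 32), 8))  # 8 drives sharing one bus
```

These are burst ceilings; sustained throughput is lower, and a NIC on the same bus eats into the same budget.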
for the interface itself, scsi can handle more i/o operations/sec and fc even more. technologies today can implement raid5 at almost no performance hit.
so given 1tb of data, definitely many people will be accessing it (unless you really plan to use it for your insane storage space). so if people will be able to store much, they can access it at a much slower rate.
so you won't see the scsi and fc being obsolete even though the serial ata gets through. it will remain in the low end segment of the storage market.
and besides, if you want to backup your data, the best way is to store it to tape and that will cost big (since mirroring the info in another server will not give you the reliability compared to tape)
Live your life each day as if it was your last.
There is an interesting problem though...do we have the ability (assuming we can hook up however many storage devices we want to one system) to span huge files across them? (DNA, DNA+metadata references, etc...) I wonder when the first terabyte file will be made. I'd speculate it'd be a [.wad|.mpq|.u|*] file for Uber-QuakenHalflifeTournament XP. ;)
I just found a new sig.
http://www.3ware.com/
No, you still can't build really big servers.
But you can slap 8 160GB drives in a box and drive them all at full speed, or just as close to full speed as SCSI controllers manage with 8 drives on a controller (PCI bus speeds being what they are).
I've been using 3Ware gear for a year and a half - they work. For any system that requires 8 drives or less, there really is no reason to pay SCSI prices.
I wonder how long it takes that sucker to defrag...
Shoulda used a SCSI Controller...
I'm a 2000 man.
I pitty the fool!
Sapere Aude - Homer
This guy totally went the wrong way for expandability and speed. You can get the Promise SuperTrakSX 6000 for $480 and that has hardware raid 5 and supports 6 drives. I'd throw one of those in with 6 drives to start and take my 800Gig and be happy. That would save me at least a $1000 up front. I wouldn't need 2 of the harddrives, the second processor or so much ram. Plus it would be faster and much more reliable. Then later on I could add another one for about $2500 and have 1.6 TB of space to store my huge collection of pornography... err rather mp3's, software and G-rated dvd movies.
If your not cheating your not trying. If your not trying your not winning and if your not winning why play?
Buy 2 cheap PCs £500.
Add 6 extra ide drives £1000
Add cat5e cable x 2 £4
Add hub £40
Add Linux £?
Your network will easily be filled with data from the disks for a total cost £1544 - much less than $5000 and you get some cdrom and floppy drives to use as cup mats.
L0ts 0 pr0n!
Ummm... isn't the whole point of this for use at home? WHY THE HELL would a home user need OFF SITE backup? How many of you guys have off site backup at home? Huh? How many??!! I'm waiting...
-"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o
I used to build a similar kind of raid system (half a TB) using the Antec case. Their case is nice, but not for the IDE raid. The problem is that the IDE cables need to be within certain length in order to get DMA 5. The case is designed for scsi, which has a longer cable length limit. To hook up all the IDE drive in that case is really a pain in the butt.
c km ountchassis_4ud.htm
For IDE raid, this case is good except it's a bit expensive:
http://www.rackmountnet.com/rackmountchassis/ra
It can hold up to 16 drives with hot swappable trays. There should be no cable length problem.
On a side note, I once plugged 5 Promise Ultra100TX2 cards into one computer. All cards were recognized but only 8 drives were recognized correctly (I plugged in 12 drives altogether). I remember seeing somewhere (either in the Linux kernel source or the FreeBSD sys source) that Promise has a limit of 12 drives per system, with 8 of them in DMA mode and the remaining 4 in PIO mode with some tweak (burst?). So for a big RAID like that, IDE RAID cards (either 3ware's or HighPoint's) are recommended. Using a hardware RAID IDE card also has the benefit of being able to hot swap the drives with the case mentioned above.
gd
Seeing video being able to go to HDs in decent quality has upped my standard as to whats acceptable for my MP3 Jukebox. Before with the high cost of harddrives, most of my saved songs were ripped with 128. But once I hooked my second PC to my stereo, to act as my sole method of listening to recorded music, I was disappointed by the quality of some songs at 128. So now, with cheaper HD prices, I will have to re-rip my collection at 160 or 192 or maybe even 256.
I take all my... uh... backup copies of original legal program media to friends' houses in case mine burns down.
Would be to replace the 4 controllers, and the monster case. Use a more "standard" chassis. Slap a regular SCSI card in it. And then for the drives themselves, use an UltraTrak100 TX8
to hold the drives.
It just seems like a far cleaner solution. Not to mention FAR more expandable. And works out to be about the same price.
"Politicians are interested in people. Not that this is always a virtue. Fleas are interested in dogs." P.J. O'Rourke
$355 - 3ware Escalade 7810 8-port RAID Controller
$2072 - 8x Maxtor 160GB IDE Drives ($259 each)
You could hook these up in a 7+1 RAID5 array, and you'd have a 1.018TB Array.
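The 1.018TB figure follows if the drives are counted in decimal gigabytes and the array in binary terabytes, which is my reading of the number rather than something the post states. A quick sketch:

```python
def raid5_usable_bytes(drives, drive_bytes):
    """RAID5 spends one drive's worth of space on parity: usable = (n - 1) * size."""
    if drives < 3:
        raise ValueError("RAID5 needs at least 3 drives")
    return (drives - 1) * drive_bytes

usable = raid5_usable_bytes(8, 160 * 10**9)   # 7+1 array of 160GB drives
print(usable / 10**12)   # decimal terabytes
print(usable / 2**40)    # binary terabytes
```

So the same array is 1.12TB by the drive makers' counting and a little over 1.018TB by the operating system's.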
Storage solution: 1TB RAID5 storage array (Prices are from Pricewatch)
Item: Quantity x Price = Subtotal
Intel Celeron 700 MHz w/ Socket 370 MB, UDMA 100, AGP VIDEO 8~64MB shared only, Sound, 56K AMR Modem, 10/100 Network in MidTower case w/Powersupply: 1 x $135.00 = $135.00
Power Magic PCI IDE U/ATA100 RAID Controller w/Cable: 4 x $22.00 = $88.00
Maxtor 4G160J8 5400/133: 8 x $259.00 = $2,072.00
60.0GB EIDE Ultra DMA 5400: 1 x $85.00 = $85.00
Total: $2,380.00
- Mangoless
[a mango-free monkey]
How about an IBM Ultrium? 100GB per tape, or 200GB compressed. 10 tapes per backup. And that would be a full backup.
In my experience, backing up 300GB of JPEG's, it took approximately 3 hours per 100GB. Connected via a Adaptec 160 card.
Is that a real poncho? I mean, is that a Mexican poncho or is that a Sears poncho?
That way, the powerpoint presentations that your 20 middle managers are making instead of doing real work would be effectively indestructible...
Get a 3ware escalade card in march they'll support 48bits-LBA in the new firmware, you'll be able to hookup those 160GB monsters in raid-0 (or raid-5) with a tenfold increase in performance, without taking up all the PCI slots.
the TX2 is a nice little card, but you can only use 2 drives per board for getting the "full speed" (else if you use master/secondary, 4 drives will give you the raid speed of 2 in stripe) and then you'd have to stripe your raid-0 drives in software. Instead of wasting PCI slots and using an underperforming card, you pay a couple of bucks more and you get the real thing with full speed and hardware raid5.
There are a lot of raid benchmarks at storagereview.com as well. IDE raid is so damn cheap.
--- Metamoderating abusive downgraders since my 300th post.
They also spec'd the motherboard as an "A7B266-D". I'm guessing this is the A7M266-D, as there is no A7B266-D (no one else is even considering manufacturing an SMP Athlon chipset besides the forthcoming Micron Scimitar)
It seems to me like this is a rather poorly thought out spec. Why are they using 4 FastTrak100 TX2s when they could use 2 FastTrak100 TX4s? Which of course brings up another point, why are they even using FastTraks? Under Linux the FastTrak driver is quite immature, and last time I used it, it only worked with 2.2 kernels, which hinders the ability to use filesystems like XFS. Also, the FastTrak cards are essentially software RAID as they offload the work of calculating the stripe locations onto the host CPU. There's no point in using md to combine multiple FastTrak arrays.
Many people were mentioning the 3Ware Escalade. It is a relatively good card, but for a home storage array Linux md + XFS might be a better choice. (Also note that the advantages of 64-bit PCI couldn't be had with the A7M266-D as it doesn't include any 64-bit PCI slots. Perhaps the Tyan Tiger would be a better choice for a 3Ware solution.) My recommendation would be 3 Promise Ultra133 TX2 controllers. The read and write performance on an Escalade 7410/7810 is appalling. With the embedded processor on the 7450/7850 (R5Fusion Technology, as 3Ware calls it) the performance exceeds that of software RAID, but at a much higher price, of course. I think the goal here is bulk storage and not performance, and the ATA133 controllers are by far the cheapest solution.
For more information on IDE RAID under Linux, check out this site. Its information is a bit dated at this point, but I used it for my home storage server and haven't regretted it. With 5 7200RPM drives on Promise Ultra100 controllers and Linux md RAID-5 w/ XFS, my bonnie++ scores are 90/30MB/s for sequential read and write, respectively. I couldn't be happier. This site also has benchmarks showing the superior performance of software RAID over a hardware solution with a 3Ware card.
And there were a few other things people seemed confused about. No one in their right mind would put more than one drive per channel for the purposes of a performance RAID. That's just foolish. As for the limitation of being unable to access both the primary and secondary IDE channels simultaneously, this limitation was removed years ago with the introduction of EIDE.
In as far as everything else goes, I'm a SCSI bigot. I have SCSI drives in my workstations and I couldn't be happier. However, IDE RAID is a very economical solution for a home user, often with performance on par with that of more expensive SCSI RAID solutions.
To conclude, this article seems very poorly researched and documented. Had they actually attempted to build this beast and failed, perhaps I would've been more amused. However, as stands it's an overpriced specification which uses incompatible parts, and little research has been done on the optimum parts for the configuration.
Actually, when the Human Genome first got online, I downloaded the thing as an 800MB zip file. Because I could. It was only a few gigs uncompressed. Unless you needed to store the whole genome for a couple of people (rather than, say, diffs) current tech works fine. Hrm, a little odd knowing that the whole Human Genome is only about four or five times the size of a DivX movie.
autopr0n is like, down and stuff.
This would be great to run with FreeBSD 4.5.
A huge fileserver on the world-famous proven
BSD FFS filesystem.
The bandwidth is pretty good, but it's the latency that'll kill you.
autopr0n is like, down and stuff.
Ok. This is just inane. Why build this when someone has already done it better for cheaper?
http://www.raidweb.com
We purchase their 8 disk IDE RAID arrays. They are hot swap, support RAID 0, 0+1, 1, 3, 5, and hot spare, have dual failover power supplies, come with 64MB cache, which can be upgraded. Configurable via the EZ front LCD display, or via serial console. They support ATA-100, and ATA-133 coming shortly. Software upgradable, and it runs Linux.
The array (sans disks) runs us $3200. They even have versions that have dual fiber ports out the back.
WARNING - DO NOT purchase these with IBM GXP75 (75GB) disks like we did... we have about 80 of them that failed.
For God's sake, do not kill us! We surrender!
quoting from rw: Before dawn in Afghanistan last Thursday, US Green Berets launched a surprise attack on their unarmed allies, storming a disarmament depot with indiscriminate fire, then rounding up survivors only to tie their hands behind their backs with plastic bands and execute them. This according to that America-hating, propaganda-strewn leftist rag, The New York Times. God bless America.
(yes, I'm very much abusing my 50 karma account and spamming this message all over the place with a +1 bonus. People need to read it, ok? Read the fucking nytimes link. Thanks for your time.)
___
The way to see by faith is to shut the eye of reason. --Ben Franklin
I could be mistaken, but I didn't know that one could hot swap IDE devices. I thought they didn't really take kindly to you pulling them out of a running system. That means that you end up having to power down your system each time you want to take a backup home.
Therefore 57MB required per human
You still need indexing information. You need to spec where those differences occur.
autopr0n is like, down and stuff.
One base always matches up with the same one. Cytosine with Guanine (CG), Adenine with Thymine (AT) and the reverse (GC, TA). So you only need to record half of the pair.
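Recording one strand leaves four possible symbols per position, which fits in 2 bits. A minimal sketch of that packing, assuming a clean A/C/G/T string with no ambiguity codes; the encoding table is arbitrary and the names are made up for illustration:

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}   # arbitrary 2-bit assignment
BASES = "ACGT"
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def pack(seq):
    """Pack a string of A/C/G/T into bytes, four bases per byte."""
    out = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq):
        out[i // 4] |= CODE[base] << (2 * (i % 4))
    return bytes(out)

def unpack(data, length):
    """Recover the first `length` bases from packed bytes."""
    return "".join(BASES[(data[i // 4] >> (2 * (i % 4))) & 3] for i in range(length))

def other_strand(seq):
    """The paired strand follows from the one that was stored."""
    return "".join(PAIR[b] for b in seq)

def packed_size_mb(bases):
    return bases * 2 / 8 / 10**6

print(unpack(pack("GATTACA"), 7), other_strand("GATTACA"))
print(packed_size_mb(3 * 10**9))   # 3 billion bases at 2 bits each, in MB
```

At 2 bits per base, 3 billion bases come to 750MB before any real compression, which squares with the downloads people mention elsewhere in the thread.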
autopr0n is like, down and stuff.
I just built a similar setup -- 500GB for less than $2,900. However, I made some different design choices.
First of all, I wasn't too impressed with the Promise controller, so the choice for me was between the 3Ware 7850 and the Adaptec 2400A. The Adaptec had the best overall performance, but the 3Ware is close and can support 8 devices. For the hard drives, I wanted to come reasonably close to SCSI performance, so I chose the WD1000JB drive with the on-board 8MB buffer. I used a Tyan Tiger K7 with 64-bit PCI for the motherboard with dual Athlon XP (not MP) 1700+ CPU's plus 1GB ECC registered PC2100 DDR RAM. Put them all in a nice aluminum rackmount case.
I'll probably replace the motherboard with the newer Tyan with 66MHz PCI bus in the near future and use the current one in a workstation. I'll also drop in more RAM if/when prices drop.
It's been pretty sweet so far with LVM + XFS. My backup solution is a 33GB tape drive, so I spend most of every Sat. backing up the array. Time and money permitting, I'll build a second one and look for a DLT tape library on ebay.
Don't forget to sign up for the linux-ide-arrays mailing list. Just send a blank email to
linux-ide-arrays-subscribe@lists.math.uh.edu
Andrew Klaassen
I have my entire anime collection ripped for instant access, and I don't intend to ever stop this practice. (maybe if I get a hundred disc DVD changer.. but then what about VHS?)
I suspect that in coming years this will not be such an uncommon practice as to be called 'unreasonable'
just remember that sloppy, bulky, huge code, that's the wave of the future, and the only way some tasks will ever be carried out. 1TB-per-disc can't bee too far off, the public demands otherwise, it seems.
-- 'The' Lord and Master Bitman On High, Master Of All
and if you use software RAID via win2k
PLEASE do not ever use software RAID on a production file server! Especially Win2k's implementation of software RAID!
We used to run software RAID on a file server (serving only 10 Macs, mind you!), both with 4 x 9 gig SCSI drives (a while ago) and with 4 x 30 gig IDE drives.
Everything runs OK until you need to replace one of the drives; then the performance while rebuilding absolutely sucks!
I've seen the system take over 12 hours of production time to rebuild a 90 gig software RAID; the whole time, performance for network users absolutely sucked!
The solution: good-quality hardware RAID. We now run a Compaq 5200 hardware RAID card with all Compaq drives. I can pull a HDD out right now, put a new one in, and have the RAID rebuilt without any network user noticing...
How else can people store HDTV or full quality DVD movies, games, full quality music cds,
think about it.
If you use Linux, please help development of Autopac
I can't remember exactly right now, but Celera's storage was something like 100TB, wasn't it? Of course when you are actually doing the sequencing and annotation of the whole damn thing, you need more space. (of course they weren't using nearly all of it, and it also included stuff to service their "subscription" clients, each one of which would of course get a significant chunk to store their stuff...)
Anyone have more recent (or more exact) info?
sic transit gloria mundi
No, you still can't build really big servers. But you can slap 8 160GB drives in a box and drive them all at full speed, or just as close to full speed as SCSI controllers manage with 8 drives on a controller (PCI bus speeds being what they are). I've been using 3Ware gear for a year and a half - they work. For any system that requires 8 drives or less, there really is no reason to pay SCSI prices.
While 3ware doesn't advertise it, you can have more than one Escalade in a case. I have seen people put as many as 3 fully loaded 6800s in one case. 18*160GB=BIG storage (each controller has 8 drives; lose one for parity and one for a hot spare).
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
I've concluded this is a giant conspiracy theory to convince people he has never said that.
I now want money. You all got money, where's my money?! I'll say Windows is better than Linux for money.
Windows is better than Linux.
Now where's my money? I need to go buy a Slackware CD.
I'm a big retard who forgot to log out of Slashdot on Mike's computer! LOOK AT ME.
Over the years we have put so much of our lives on to the PCs that we would be seriously lost without the archive.
Paul.
You are lost in a twisty maze of little standards, all different.
Paul
When you want to recover something a browser lets you traverse the directory tree and tag the files you want. Then Amanda tells you which tapes to mount to recover them. Cool!
Paul.
But if you lose your data you lose your business, and no insurance is going to cover that. Years of work goes up in smoke.
Paul.
I think the mentality was: "Cancer bad--make the comment that makes us think of cancer go away"
Mutation has everything to do with cancer.
Yeah, I know. Since the 640kB quote was said to be an urban legend, I thought I'd throw that in for fun. Wee!
Similarly, I read something about a 1GB/hour VHS backup system about 5 years ago. With packs of 5 standard 4-hour VHS tapes costing about £5, that works out to 25 pence a gigabyte, about half the price of CD-Rs.
However even the best video tapes will degrade very quickly compared with optical, computer tape systems, and even IDE hard drives.
As far as the 40/80 GB max on DLT, IBM and Compaq both offer larger backup solutions, LTO and Super DLT. Compaq has embraced Quantum's SDLT, which has a capacity of 110/220 GB and a transfer rate of 11 MB/sec (uncompressed). Search speed is roughly 4.5 meters/sec. IBM has embraced LTO, which uses 100/200GB tapes, has a transfer rate of 15 MB/second (uncompressed, and a >2x increase over 40/80 DLT at about 6 MB/second uncompressed), and has an on-tape chip which can hold an index of all the files on the tape for easier retrieval. The search speed on LTO is about 6 meters/sec.
Now, all of this is useless without being "generally available", so I did a little price-checking. Below are internal single-drive units (no autoloaders), and list price from manufacturers:
Compaq 40/80 DLT Drive (internal) - $3,499.00
Compaq 110/220 SDLT Drive (internal) - $5,590.00
IBM 100/200 LTO Drive (internal) - $3,999.00
Just wanted to point out that there are other options.
49 20 68 61 76 65 20 74 6F 6F 20 6D 75 63 68 20 66 72 65 65 20 74 69 6D 65 2E
Sam's Club in the area recently had a WD 100GB IDE hard drive on sale for $120 after rebate. 1TB at that rate is ~$1320, plus a few hundred for the motherboard, processor, memory and extra controller cards, and a TB server is within reach of an 18-year-old who saved his paper-route money.
The real question is: how long will it take to listen to all those mp3s? At some point, extra storage just isn't practical because you can't fill it fast enough.
16 Pentium 100 base units - £160
16 100baseT NICs - £160
16 60GB Maxtor 7200rpm IDE ATA100 HDDs - £1920
16 port hub - £150
Base machine:
- Mainboard - £60
- 1.1GHz Thunderbird - £70
- 1GB RAM - £180
- 20GB HDD - £50
Total base machine - £400
TOTAL COST - £2470
TOTAL STORAGE - 960GB
I'm a bit new to all this, but surely the above lot could work. If you made sure there was a good bit of RAM in each of the donor PCs and used software RAID, you could access the whole lot like one huge drive, couldn't you? It would allow the system to be redundant as well, as you could use 460GB of storage and have the other 460GB as an automatic backup.
Take care all
Yeah, it's cool they had the money to do this, but they made a number of STUPID choices on hardware.
1 - I can see using the A7B266 because of its 64-bit PCI slots... but what is the extra processor for? A 1GHz Athlon is already overkill for calculating RAID5 parity information, never mind two of them.
2 - If you sprang for the extra cash to avoid saturation, WHY THE HELL would you use a 10/100 NIC? No matter how fast the array is, you're still only going to be able to move 12~13MB/sec MAX. Either save the cash and get a cheaper mobo, or pony up for an all-gigabit backbone.
3 - 2GB of RAM???? See above. Granted, it's cheap... but even 512MB would be overkill.
4 - FastTrak 100 TX2s with Maxtor 160s. Did anyone tell these guys that ATA100 only supports up to 137GB per drive? These guys are wasting 184GB even before disk slack. Their "terabyte array" is more like 900GB. Either buy the 120s and save some cash, or pony up for ATA-133 controllers.
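For anyone wondering where 137GB comes from: as far as I know it is the 28-bit LBA addressing limit (2^28 sectors of 512 bytes) rather than anything specific to the ATA100 transfer mode. A rough Python sketch of the arithmetic, assuming 8 drives (inferred from 184GB / 23GB per drive, not stated in the post) and decimal gigabytes:

```python
# Rough arithmetic for the 28-bit LBA limit; names are illustrative only.
SECTOR_BYTES = 512   # classic ATA sector size
LBA_BITS = 28        # address width before 48-bit LBA


def lba28_limit_gb() -> float:
    """Largest addressable drive size in decimal GB with 28-bit LBA."""
    return (2 ** LBA_BITS) * SECTOR_BYTES / 1e9


def wasted_gb(drive_gb: float, drives: int) -> float:
    """Capacity lost across `drives` disks that exceed the LBA28 limit."""
    usable = min(drive_gb, lba28_limit_gb())
    return (drive_gb - usable) * drives
```

`lba28_limit_gb()` gives about 137.4GB, and `wasted_gb(160, 8)` gives about 180GB; the 184GB above comes from rounding the limit down to 137GB. With 120GB drives nothing is lost.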
Slashdot actually published this? It's not particularly special, or even clever. These guys just had a bunch of cash to throw at hardware and didn't even take the time to do the basic research. Shame... shame.
-Chris
--an unbreakable toy is useful for breaking other toys--
So, like, the total I get for his parts is 4920, but his total is 5720.
$800 bucks buys a lot of skittles and coke...
I can still remember the words of the Circuit City guy... "A 40MB hard drive is HUGE! You will never be able to fill that up..." And that was only like 6-7 years ago.
Me fail English? That's unpossible!
I'm no biologist, but as far as I understand it, to store the entire DNA data for one human being you would have to capture the current DNA in every single cell of the human body, since a lot of things damage our DNA from day to day: smoking, drinking, getting a sun tan. Granted, 99% or so of the DNA recorded may be identical, but the other 1% or so may be quite different. That doesn't sound like much, but 1% of trillions of cells is still a lot of data. Yes, your body does try to repair any damage; however, it doesn't always succeed. Just look for a mole somewhere on you: even a tiny one is still a lot of cells. Now throw in storing positional information for each cell, etc., and terabytes start to look like bytes.
;)
;)
NB: As I understand it, our body essentially has one-level error correction when it comes to repairing DNA. I think a cool use of nanotech, when we eventually get there, will simply be to improve on this. Imagine you get your DNA sampled from a number of different places (stem cells, etc.) so you know what the master copy should look like. Then you have little nanites, like ribosomes, go around your body and help out: if they find a cell with DNA that doesn't match the master copy, they repair it. That would help us live healthier, longer lives. As far as I'm aware, the only reason our bodies haven't evolved better error checking is simply that they don't need to: as far as your DNA is concerned, you're just a vessel for procreation, and you already live to about 70, which is plenty of time to pass on your genes. I think I'll stop now before I really get going.
You can always use more space. I'll bet the transport buffer for a real-life teleporter like on Star Trek wouldn't be much good with a few terabytes; it might teleport an eyelash or two.
Disclaimer: All of the above is purely as I understand it, if any of it's off by a mile or two then go easy on the flames ppl.
I guess the simple truth is that now that 100 gig drives are a couple hundred bucks, we now have the ability to store anything we reasonably could need (unless you define "Reasonable" as "I need to store DNA Sequences").
Doesn't "640k ought to be enough for anybody" suggest that Bill Gates once felt the same way about RAM?
Of course, visionary that he is [snicker!], there's no way he could have imagined desktop machines being used to edit video.
Likewise, who knows how big and bloated Clippit The Office Paperclip can get if we have 100 gigs of hard disk space to burn... maybe, one day, he'll actually bear consultation when you need information, instead of when you need something to laugh at.
I love calculus so much, I want to give it to everyone! Come, get some integration!
MmMMmmm... calculus. Hours spent in the dentist's chair, with him scraping hard crusties off my teeth... And you're just giving that stuff away?
Fire and Meat. Yummy.
I don't see it that way. I remember when 20GB drives were out and about. No one could have ever possibly filled those. :)
:p
More storage space simply means two things..
1. Bloated programs.
2. New and better media formats.
A pity we can't just get #2.
Why would you settle for DLT when there is LTO (Ultrium) technology?
100GB tapes (compression gives you up to 200GB)
You'd get a full backup in about 10 hours on 4 or 5 tapes!
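A quick sanity check of those numbers in Python. This is only a sketch: it assumes a roughly 500GB array (the size mentioned by another poster, not stated here), the 100GB native tape capacity above, and the 15MB/sec uncompressed LTO rate quoted elsewhere in the thread, and it ignores tape-change time.

```python
import math


def tapes_needed(data_gb: float, tape_gb: float) -> int:
    """Number of tapes for a full backup at native (uncompressed) capacity."""
    return math.ceil(data_gb / tape_gb)


def backup_hours(data_gb: float, rate_mb_per_s: float) -> float:
    """Streaming time in hours, using decimal units (1GB = 1000MB)."""
    return data_gb * 1000 / rate_mb_per_s / 3600
```

For 500GB that is `tapes_needed(500, 100)` = 5 tapes and `backup_hours(500, 15)` of roughly 9.3 hours, which lines up with the "about 10 hours on 4 or 5 tapes" estimate.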
GenBank, the US DNA sequence repository, isn't at a terabyte yet. Give it a year.
"There are more things in heaven and earth, Horatio, than are dreamt of in your philosophy."
I'm surprised no one has mentioned this, but Promise has become more and more Linux-unfriendly lately.
There are different minor revisions of the 100TX2 controllers; you can only tell by looking at the chip on the board. I think only the last digit is different. I could not get the latest ones working with Linux at all. I ended up buying these boards under the Maxtor brand name (same units, but slightly older), which had the older chipset.
On the latest boards, Promise appears to have intentionally made certain registers read-only, thwarting open source driver development.
With that kind of behaviour, I'm staying away from Promise controllers, period. (I also had a hard time with their Raid5 controllers.)
Back when they were Linux-friendly, their ATA100tx2 cards were nice. But with the latest incompatible chipsets and no help from the company, forget it.
I also had some frustration with Adaptec's 2400 controller. It is *still* only supported by Adaptec under RedHat 7.0. And it has no audible alarm for drive failure, which is most annoying. Finally, under FreeBSD 4.3, its performance was abysmal; there was definitely something wrong with the I2O driver working with this card. (I haven't tried 4.4 yet.)
For now, I'm just sticking with motherboard IDE controllers; far more tried and true.
-me
Love many, trust a few, do harm to none.
Imagine a Beowulf Cluster of THESE!!!
If you do store DNA sequencing information, make sure you only use lossless compression.
Or, for that matter, the issue for me is backup capacity again. With the advent of DVD-R (or whatever it's called today) I thought that "full backups" were going to be possible again. But now, with such vast quantities of data possible to have online and changing, backup issues again come to the fore.
Lossless compression helps, but now, instead of writing 50% of a 4-gig tape over the weekend, I have to write two or three full tapes.
As memory and disk space have become cheaper, bloatware uses more and more of them. I don't consider bloatware a good thing, but it cannot be fought any more than the monster shopping mall can be fought just because I happen to like mom and pop shops.
The difference between information and data, I guess. The next great invention I think will be the personal digital secretary, like the ones detailed by Daniel Keys Moran in his wonderful "books of continuing time", designed to sift through the impossible quantities of data yet still have the personal touch to say "Gee, that bit over there looks interesting. I think Bob would like that."
Bob-
The Ludwig von Mises Institute. The reasoning individual's economics
Actually, it's the land surface that's approx 100 x 10^12 sq metres (100T sq metres).
Using a rough radius of 6,378 km and assuming the earth to be a sphere, the surface area of the entire earth is approx 511.2 x 10^12 sq metres.
What to store ?
let's say color+height+simple usage byte.
color = 3 bytes
height = 2 bytes (16 bit signed int => MeanSeaLevel +/- 32,000 metres)
type = 1 byte
Storage = 6 x 511.2 TB = 3067 TB
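The figures above can be reproduced with a few lines of Python. This is just a sketch of the same back-of-the-envelope estimate: a perfect sphere, one 6-byte record per square metre, and decimal terabytes (10^12 bytes). The function names are mine.

```python
import math


def sphere_area_m2(radius_km: float) -> float:
    """Surface area of a sphere in square metres (4 * pi * r^2)."""
    r = radius_km * 1000.0
    return 4.0 * math.pi * r * r


def storage_tb(area_m2: float, bytes_per_m2: int) -> float:
    """Storage in decimal TB for one fixed-size record per square metre."""
    return area_m2 * bytes_per_m2 / 1e12


# color (3 bytes) + height (2 bytes) + type (1 byte)
RECORD_BYTES = 3 + 2 + 1
earth_area = sphere_area_m2(6378)
earth_tb = storage_tb(earth_area, RECORD_BYTES)
```

This gives an area of about 511.2 x 10^12 sq metres and roughly 3067 TB, matching the numbers above.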
( Aside:
Ocean floor color? Yes: most geophysical / geological imagery uses 'pseudo-color' created from measurable surface properties.
)
And that's _*just*_ the 'instantaneous' surface. As humans we would be most interested in the surface +/- 10 km about sea level, projected both forwards and backwards in time.
The current batch of low-orbit earth-scanning satellites have instruments that (each) deliver approx 1GB of data per hour, and that's round the clock for the lifetime of the package (est. 12 years).
It's an urban legend. For more info:
http://www.urbanlegends.com/celebrities/bill.gates/gates_memory.html
For about $20-30, you can get disk drive drawers that turn a 3.5" drive into a 5" removable drive. Nothing active; it's just a bunch of mounting hardware. (About $20 for the part that stays in your machine and $10/disk for the removable drawer parts.)
This makes it easy to use disk drives as backup media, which is good, because they're much faster than tape. It also makes it easy to upgrade your disk capacity when you want to do that.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Fibre Channel hardware tends to be a little too expensive for the $5k crowd. This guy is using commodity hardware, and that generally doesn't include fibre channel. Even if he bought the hardware, the driver support for something like a SAN just doesn't exist in Windows/Linux/BSD yet.
I read the internet for the articles.
We killed it... it worked fine this morning but now it's dead.
I have a large home network, and I already store all my CDs as MP3s and stream them to the clients, as well as 16,000+ books online, etc. Although the story was OK, I have been working on making mine out of the cheap parts I have picked up from the dot-coms that are going under. The local sidewalk sale is worth it.
If interested I have a page up about my network at http://www.xganon.com/~cl/network/
What do we do with this kind of storage?
How about recording audio files of your entire life!
The way to do it would be to carry around a small, lightweight device that would record each day's audio. The device would have 100BT ethernet and some custom software to connect it to a massive storage server. Some custom software would be written to transfer the files each night while the batteries are being recharged.
Acquiring Hardware
Pocket ePC-II System - $749
950g, 157mm (L) x 146mm (W) x 45mm (H)
PII 900Mhz, 128MB RAM, 10GB HD, 100BT Ethernet
4 Li-Ion Laptop batteries - $800
Mini Stereo Microphone - $65
Storage Required
128 kbit per second (16 kB per second) for stereo MP3 at 128kbps
86400 seconds per day
1382.4 megabytes per day
504.576 Gigabytes per year
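Here is the same storage arithmetic as a small Python sketch, assuming a constant 128 kbit/s stream recorded 24 hours a day, decimal units, and a 365-day year. The helper names are made up for illustration.

```python
SECONDS_PER_DAY = 86400


def bytes_per_day(bitrate_kbps: float) -> float:
    """Bytes recorded per day at a constant bitrate (kbit/s, 1 kbit = 1000 bits)."""
    return bitrate_kbps * 1000 / 8 * SECONDS_PER_DAY


def mb_per_day(bitrate_kbps: float) -> float:
    return bytes_per_day(bitrate_kbps) / 1e6


def gb_per_year(bitrate_kbps: float, days: int = 365) -> float:
    return bytes_per_day(bitrate_kbps) * days / 1e9
```

`mb_per_day(128)` is 1382.4 and `gb_per_year(128)` is 504.576, so a year of audio plus one full redundant copy lands just over 1TB.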
1TB would give you enough storage for 1 year, plus 1 complete redundant backup. Assume that every 18 months, the same amount of money will buy you a hard drive that is twice as big. Also assume that you throw away the old server every time you replace it with a new one, just so you don't end up with a huge pile of servers. I will start off with a 1.5TB server, which will last the same 18 months to make the math easier.
Storage Hardware
1.5 TB server
~ $8500
1st purchase at birth = 1.5TB, enough for 18 months
2nd purchase at 18 months = 3TB, enough for 3 years worth
3rd purchase at 3 years = 6TB, enough for 6 years
6 years = 12TB
12 years = 24TB
24 years = 48TB
48 years = 96TB, enough to last until you are 96 years old
So over your lifetime, you would have to make 7 purchases, for a total of $59,500. Add in the capture hardware at $1614, and you have a total lifetime cost of $61,114. As the price of storage goes down, this will be even more affordable.
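The purchase schedule can be sketched in Python under the same assumptions as above: about 1TB per year (audio plus one backup), a fixed $8500 per server, capacity doubling with every purchase, and the old server thrown away each time, so each new server has to hold everything recorded since birth. The function and parameter names are mine.

```python
def purchase_schedule(first_tb=1.5, tb_per_year=1.0, lifespan_years=96):
    """Return a list of (age_at_purchase_in_years, capacity_tb) tuples."""
    purchases = []
    age, capacity = 0.0, first_tb
    while age < lifespan_years:
        purchases.append((age, capacity))
        # The server fills up when the lifetime total reaches its capacity.
        age = capacity / tb_per_year
        # The next server bought for the same money holds twice as much.
        capacity *= 2
    return purchases


SERVER_COST = 8500
CAPTURE_COST = 749 + 800 + 65  # ePC-II + batteries + microphone

schedule = purchase_schedule()
total_cost = len(schedule) * SERVER_COST + CAPTURE_COST
```

This yields 7 purchases (at ages 0, 1.5, 3, 6, 12, 24 and 48) and a total of $61,114, the same as the figures worked out by hand.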
Soon, it will be feasible to do the same thing with high-quality video!
I've been backing up most of the servers that I administer onto a hard drive for at least a couple of years now. Typically, I simply scp a tar file to my backup server once a day. It's worked flawlessly, and my backup server can EASILY be offsite. This is my preferred backup strategy.
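For anyone who wants to try the same approach, here is a minimal Python sketch of "tar it, then scp it", meant to be run once a day from cron. It is not the poster's actual script: the paths, the destination string and the function names are placeholders, and it assumes the `scp` client is installed with key-based login already set up.

```python
import subprocess
import tarfile
import time
from pathlib import Path


def make_archive(src_dir, out_dir):
    """Create a dated .tar.gz of src_dir inside out_dir and return its path."""
    src = Path(src_dir)
    stamp = time.strftime("%Y%m%d")
    archive = Path(out_dir) / f"{src.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname=src.name)
    return archive


def scp_command(archive, dest):
    """Build the scp argument list; dest looks like 'user@host:/path/'."""
    return ["scp", str(archive), dest]


def backup(src_dir, out_dir, dest):
    """Archive src_dir and copy it to the (hypothetical) backup server."""
    archive = make_archive(src_dir, out_dir)
    subprocess.run(scp_command(archive, dest), check=True)
```

A call such as `backup("/home", "/var/tmp", "backup@example.org:/backups/")` would be the daily job; the host name there is only an example.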
.-.--
The only thing I haven't seen anyone post about is power cables. I recently built a 1 TB server. It ended up costing about $3500. The problem I ran into was that while I had plenty of space in the Antec SX1240 full tower case I bought, the power supply only has 6 power connectors. I'm currently in the process of figuring out a solution, one option being to buy a second power supply and just use the extra connections on it. That will teach me to work without a checklist. Another issue to worry about is IRQs. If you're going to put 4 IDE add-in cards in there, you'll probably get some overlapping (with Windows at least). The solution I opted for was buying one of the relatively new Abit boards that has 4 IDE controllers on it, plus one IDE add-in card.
You don't say 1.024k bytes, you say 1k bytes and expect the listener to know that about 1000 is exactly 1024 due to the context. If 1k bytes were always 1024 bytes, how would you interpret 14.112k bytes?
3/4" pipe is 1.050" Outside Diameter.
The 3/4" refers to an Inside Diameter of a pipe with a particular wall thickness (which may or may not still be made). Regardless of how thick the walls are, and consequently what the Inside Diameter really is, 3/4" pipe is 1.050".
IIRC there is something about a US bushel being a different volume depending on what is being measured.
And how much cheaper things get: 6 months ago a piece of kit like this would have cost you nearly, erm, $5000 [http://slashdot.org/article.pl?sid=01/07/19/1554216&mode=thread]. Whoops.
Here's a thought...
Split $DollarAmoutYouWantToSpend in half. Build two of these badass mofos. Give one to $Buddy/Work/'RentsWithBandwidth. Cron 'em to (r)sync over night.
I don't mind if I lose a day of pr0n.. Teehee.
Got a bit of info on the new 3ware product coming out in about April/May.
12 IDE ports on the card, basically just another cascade on board. It'll be 32/64-bit and be able to run at ATA133 with >128GB devices. Best part is the current drivers will(should) work with it!!!
At the rate HD prices are coming down and GB per device is going up... shouldn't be long before we see 10TB (with RAID5 + hot swap per 12 drives) for US $18K. That would be $1.80/GB for this kind of storage. And now that iSCSI is "working" with Linux... you could prolly use it as a SAN Storage server... Hmmm.... GB ethernet SAN.... hmmm....
Now, bring on an IBM 3494 Tape Library and 3590K Tape Drives and media... and we're all set. Of course, let's not forget TSM (or what used to be ADSM) and a machine to do backups only. (1 cabinet and 2 drives, full of media: $120K)
greg, REMEMBER ED CURRY!!!
I do. I use a script to tar important directories, email, programs I'm working on, etc., and burn them to a CD. Periodically I leave a copy at my brother's. It's cheap and easy, and knowing that I will not lose access to the source of projects I've done over the years for myself and clients is "priceless".