Slashdot Mirror


The Amazing $5k Terabyte Array

An anonymous reader writes: "Running out of space on your local disk? How about a Terabyte array for only a few thousand dollars. This article at KCGeek.com shows how to put together 1000 Gigs of hard drive space for the cost of a few desktop computers." I could rip my entire anime collection for instant access! Rip all my CDs and still have .9 Terabytes left! Maybe Mirror Usenet! I guess the simple truth is that now that 100 gig drives are a couple hundred bucks, we now have the ability to store anything we reasonably could need (unless you define "Reasonable" as "I need to store DNA Sequences").

160 of 448 comments

  1. I'm sure I'll figure a way to fill it... by DNAGuy · · Score: 4, Insightful

    It's only a matter of time 'til video becomes as commonplace as MP3s on our drives. 100 Gigs is what...20 movies??? I don't see my appetite for disk space slowing down any time soon.

    Hmmm...video; logfiles that don't roll over - ever; online network backup... I'm sure to figure out a way to fill that terabyte. :)

    --

    BRENT ROCKWOOD, EST'd 1975

    1. Re:I'm sure I'll figure a way to fill it... by mgv · · Score: 2

      It's essential that we move to this level of secondary storage, or there is a real danger that tertiary storage systems (such as tape, DVD) may actually be able to keep up!

      Seriously, the big problem here is not having the data online, but figuring how to recover it if you lose it.

      Not that RAID is a bad thing, but I have seen RAID systems go down - I lost a day's work (not archived by myself) when my web hosting company's raid system failed completely. (They were most apologetic and offered some compensation, but the data was very gone for all their customers - I believe they bought new RAID systems from another vendor immediately thereafter).

      My 2c worth.

      Michael

      --
      There is no cryptographic solution to the problem where the intended receiver and the attacker are the same entity.
    2. Re:I'm sure I'll figure a way to fill it... by BlackSol · · Score: 3, Insightful

      Redundancy is a better solution than disposable media backup. Often more expensive, but infinitely more reliable.

      Code versioning/document management on changing files to maintain history.

      Your web hosting provider had 1 RAID system; that's only 1 level of redundancy (I know, multiple disks, but on 1 system). If you want to truly ensure data you need redundant systems, such as networked backup to 2 additional machines that also utilize RAID.

      If the data is critical you need to examine points of failure. That's what clustering and load balancing offer: total redundancy.

      --
      $sig=$1 if($brain =~ /idea\s+(.*)/i);
    3. Re:I'm sure I'll figure a way to fill it... by Zeinfeld · · Score: 2
      Not that RAID is a bad thing, but I have seen RAID systems go down

      I once saw a shadowing controller fail in such a way that it managed to corrupt both of the RAID 5 arrays it was driving. Had to bring the system back up from the first level backup.

      Soon after that we switched to using EMC gear.

      --
      Looking for an Information Security student project suggestion?
      Try http://dotcrimeManifesto.com/
    4. Re:I'm sure I'll figure a way to fill it... by Telastyn · · Score: 2

      What's the law?:

      "Any hard drive, no matter the size, will be 95% full in 3 months."

    5. Re:I'm sure I'll figure a way to fill it... by grazzy · · Score: 2, Funny

      well, dvds belong in a shelf, not on your harddrive :-)

      shelf: 20$
      storage: 100 dvds

      so, basically you can store 100x7 GB on a shelf for 20 bucks. that's cheap!

    6. Re:I'm sure I'll figure a way to fill it... by 4of12 · · Score: 2

      Redundancy is a better solution than disposable media backup.

      This reminds me of a time when I was doing my PhD thesis.

      After working on the manuscript for the better part of a year, I began to get Beautiful Mind syndrome about losing it catastrophically due to fire, flood, etc.

      So I started to regularly ftp updated versions to a supercomputer site some 400 miles distant.

      Just in case....

      --
      "Provided by the management for your protection."
    7. Re:I'm sure I'll figure a way to fill it... by rpseguin · · Score: 2, Interesting

      > It's only a matter of time 'til video becomes as
      > commonplace as MP3s on our drives. 100 Gigs is

      > I don't see my appetite for
      > disk space slowing down any time soon.

      True enough. I disagree with Cmdr Taco's comment:

      "we now have the ability to store anything we reasonably could need"

      I used to say the same thing a while back, thinking I could never fill a disk. That was a 5M Sider drive for an Apple II...

      I just wish the stupid BIOS and drive manufacturers would get their act together on drive limits...
      Nobody will ever need more than 500M...
      Nobody will ever need more than 2G...
      Nobody will ever need more than 8G...
      Nobody will ever need more than 32G...

      How many times can you shoot yourself in the same foot with the same gun?

      > logfiles that don't roll over - ever; online

      That is a terrible architecture for storing log files... Makes them very hard to search, modify, ... You'd be better off creating a tool to iterate over a set of files for you.
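A minimal sketch of the "tool to iterate over a set of files" idea, assuming the logs are rotated into several files that share a common prefix and that older ones may be gzip-compressed. The function name `search_logs` and the `app.log.*` naming scheme are hypothetical, chosen only for illustration.

```python
import glob
import gzip
import re


def search_logs(pattern, path_glob):
    """Yield (filename, line_number, line) for every line matching pattern.

    Walks every file matching path_glob (for example, rotated logs) in
    sorted order and opens .gz files transparently.
    """
    regex = re.compile(pattern)
    for name in sorted(glob.glob(path_glob)):
        opener = gzip.open if name.endswith(".gz") else open
        with opener(name, "rt", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield name, number, line.rstrip("\n")
```

Something like `for hit in search_logs(r"error", "/var/log/app.log*"): print(hit)` would then scan the whole set as if it were one file, without any single log growing forever.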

      > network backup... I'm sure to figure out a way
      > to fill that terabyte. :)

      No problem there.
      A terabyte just isn't that much when you start to think of volumetric data, CFD, physics calculations, FEA, ...

      Personally, I'd really like to stop seeing all of this spinning media and start seeing solid state stuff with much higher densities...
      Frustrates me seeing people talk about 500 terabytes in a test tube. Forget that, just get the stuff working and tell me where to place my order for something I can use. :-)

    8. Re:I'm sure I'll figure a way to fill it... by susano_otter · · Score: 2

      Yeah, but I/O is painfully slow...

      --

      Any sufficiently well-organized community is indistinguishable from Government.

    9. Re:I'm sure I'll figure a way to fill it... by susano_otter · · Score: 2
      Convenience != laziness, I think... movies take up less space on your hard drive than they do on a shelf (what with individual packaging, easements for the shelf-unit, &c.)--and don't forget that a hard drive full of movies can be stored anywhere; a shelf has to occupy valuable wall space, include enough clearance to get to the movies in question, and be conveniently near the viewing device.

      Cataloging, indexing, and searching your copious movie collection is a lot less painful if you can eliminate the whole shelf thing as well. Jukebox issue aside, I'm sure you could come up with a similar list of reasons why having 1.5 weeks of music on your HD is vastly superior to having it on a shelf.

      Finally, a memo to the humor-impaired lobe of your brain: the I/O comment was a joke. See, I meant to highlight the absurdities of... oh, never mind. You didn't get the Matrix, you obviously wouldn't get this. But hey, at least you have a week and a half of random music you like to take your mind off it!

      --

      Any sufficiently well-organized community is indistinguishable from Government.

    10. Re:I'm sure I'll figure a way to fill it... by susano_otter · · Score: 2

      Indeed it does. Me, I'm quite satisfied with a cabinet full of movies and a PS2... Now all I need is a Lego Disc Robot.

      --

      Any sufficiently well-organized community is indistinguishable from Government.

  2. The Amazing $5k Terabyte Array by Anonymous Coward · · Score: 2, Interesting

    yeah, with 160 gig ATA drives out now,
    you can do it with 6 drives vs. 10 drives,
    and a lot of motherboards come with onboard
    RAID, and if you use software RAID via
    win2k or Volume manager type app for Linux
    it would rock.

    Cheap too, at $260 per drive per pricewatch.

    Peace out...

    1. Re:The Amazing $5k Terabyte Array by Lumpy · · Score: 2, Offtopic

      please tell me how you get 6 IDE drives on a pc that gives you any performance in a RAID function... U160 SCSI drives will give you at least a 70% speed increase and an 80% increase in reliability....

      If I had to store a terabyte of information I'd be an idiot to use consumer level storage (IDE).

      Ever wonder why real servers use SCSI?

      --
      Do not look at laser with remaining good eye.
    2. Re:The Amazing $5k Terabyte Array by RazzleFrog · · Score: 2, Interesting

      Just get an IDE RAID card for less than $50. That frees up primary and secondary for DVD,Burner,etc. That is assuming of course that you have a free PCI slot.

    3. Re:The Amazing $5k Terabyte Array by RazzleFrog · · Score: 3, Informative

      I believe that Promise makes the SuperTRAK Pro series of ATA RAID cards that support up to 6 drives and RAID 5. I haven't used them personally but they do exist.

      I agree that on a server or a professional workstation SCSI is the way to go for speed and reliability. But for the home consumer who wants to work with digital video the cost of a SCSI RAID set up is extremely prohibitive.

    4. Re:The Amazing $5k Terabyte Array by mprinkey · · Score: 2, Informative

      on a server or a professional workstation SCSI is the way to go

      I do wish to avoid yet another SCSI/IDE flamefest, but I would point out that this configuration is like most of its ilk--it is basically network attached storage. That means that no one will be reading or writing from the server system itself, but will be accessing the raid array through a network link via NFS and/or SMB. In my experience, performance of Linux Software RAID5 on Promise IDE controllers with 80-GB Maxtor 5400-RPM hard drives can exceed 50 MB/s write and 70 MB/s read. SMB/NFS even over Gbit ethernet will be hard pressed to saturate that.

      Having built many of these low-budget raid5 arrays, I cannot concur that SCSI and/or hardware RAID is necessary to see acceptable performance. <Horror stories about Hardware IDE RAID5 controllers deleted.>

      I do admonish would-be builders to include an extra hard drive in the raid array as a hot spare. For four drive arrays (3 data + 1 parity), it may be unnecessary. For larger systems (7 data + 1 parity), I think a hot spare is a worthwhile investment. Also, avoid 7200-RPM drives if possible and actively cool all of the drives in the array. One or two fans blowing on the array can make a big difference.

    5. Re:The Amazing $5k Terabyte Array by Junta · · Score: 2

      Well, the reliability is a point taken, one device fails on an IDE chain and the whole chain may collapse depending on the problem.

      The performance is questionable. IDE is behind SCSI, but not nearly as badly anymore. And how are you accessing this storage space typically? Through a network whose speed is likely not to exceed 100 MBit, so network is essentially the bottleneck for throughput, unless of course you have some data processing application or something like oracle, in which case your point may be valid, but still, SCSI is still overpriced for what it gives... If only it had prevailed more, then it would be cheaper and this debate would be moot...

      --
      XML is like violence. If it doesn't solve the problem, use more.
    6. Re:The Amazing $5k Terabyte Array by Lumpy · · Score: 2

      This is nice when taken as an oversimplified example, but that is not the case. I can access all 15 of my SCSI devices and tell them to do things separately and they will perform the job. 2 devices on the IDE chain? If one is doing a job the other has to sit and do nothing. The communication system built into SCSI gives the largest performance gains... This is why a SCSI-II hard drive and controller still feels snappier than an Ultra 33 IDE drive... yes, the IDE drive is theoretically faster than the SCSI, but as soon as I access the CD and drive at the same time the SCSI devices continue to fly while the IDE devices start falling down waiting for each other.

      They really really need to design an IDE-II specification that gives the SCSI performance traits to IDE.

      --
      Do not look at laser with remaining good eye.
    7. Re:The Amazing $5k Terabyte Array by Cramer · · Score: 2

      While I agree, SCSI drives are simply better drives, cheap is a very powerful motivator. IDE is about one tenth the cost of SCSI. So the IDE array will last a year -- how long do a lot of companies last these days? Over time, the SCSI system may, ultimately, be cheaper -- the cost of replacing failed drives, the downtime for rebuilding and restoring the array, lost productivity of a missing database, drugs for the admin headaches...

      I've built a 1.04TB array. It's an impressive hack of a system. Out of the 16 drives for the array, four (4) were defective right out of the box! And two of those replacements were suspect. After a month of handling a full news feed (120G+ per day) we've worked most of the kinks out of it (I don't recommend w2k for a drive array.)

      BTW: I used a pair of 3ware Escalade (6800) controllers. They take a lot of the suckiness out of IDE (tho' it's a cabling mess.)

    8. Re:The Amazing $5k Terabyte Array by Lumpy · · Score: 2

      Also, avoid 7200-RPM drives if possible and actively cool all of the drives in the array. One or two fans blowing on the array can make a big difference.

      This is important in any drive array. If you don't have them spaced apart and have cooling fans on them, no matter what they are, you are asking for failures and short life spans. I was amazed at the differences on my SCSI drives: separating them another 1/4 of an inch and adding a fan blowing at each bundle of 3 drives caused a temperature drop from 110DegF to 76DegF, or only 4 degrees above the server room's ambient temperature. (Hey, I'd rather have it set for 60 but the traffic and billing ladies complain..they keep their work area at 80!)

      The surprising part is that the ML530's have a spot to place a fan in the drive cages, yet no fans installed.

      --
      Do not look at laser with remaining good eye.
    9. Re:The Amazing $5k Terabyte Array by edmudama · · Score: 3, Interesting

      > They really really need to design a IDE-II
      > specification that gives the SCSI performance
      > traits to IDE.

      They already have it -- tagged command queuing has been in the ATA spec for years, since ATA-5 I think. Most vendors either have command queuing IDE drives, or are coming out with them soon.

      http://www.t13.org for more info on the various ATA specifications

      --eric

      --
      More data, damnit!
    10. Re:The Amazing $5k Terabyte Array by jandrese · · Score: 2

      "Most vendors" assuming most vendors are IBM. AFAIK, the only company with TCQ on ATA drives is IBM (and this is on the ill fated DeskStar line). Additionally, the TCQ on the ATA devices is not as sophisticated as it is on the SCSI (shallow queues are common), but it should be only a matter of time before this is resolved.

      --

      I read the internet for the articles.
    11. Re:The Amazing $5k Terabyte Array by edmudama · · Score: 2

      Yes, you're right, and the drive is actually slower issuing queued commands than standard read DMA commands.

      Standard ATA queue depth is 32, which I believe is the same as SCSI, though there are side projects in ATA land to increase this because of potential performance gains. (more things you can choose from, better odds of being able to choose something easy)
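The "more things you can choose from" point can be illustrated with a toy model. This is only a sketch: it assumes seek cost is proportional to the distance between block numbers and ignores rotational latency, and the greedy nearest-first policy is a stand-in, not what any particular drive firmware does. The function names and the sample queue are made up for the example.

```python
def total_seek(start, order):
    """Sum of head movement (in blocks) when servicing requests in order."""
    distance, position = 0, start
    for target in order:
        distance += abs(target - position)
        position = target
    return distance


def nearest_first(start, pending):
    """Reorder pending requests greedily: always service the closest one next."""
    remaining, position, order = list(pending), start, []
    while remaining:
        nearest = min(remaining, key=lambda block: abs(block - position))
        remaining.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


queue = [900, 10, 880, 30, 860]            # hypothetical queued block addresses
fifo_cost = total_seek(0, queue)           # serviced in arrival order: 4340
reordered = nearest_first(0, queue)        # [10, 30, 860, 880, 900]
queued_cost = total_seek(0, reordered)     # 900
```

With only one outstanding command the drive has no choice but the arrival order; a deeper queue gives the reordering step more candidates to pick from.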

      --
      More data, damnit!
  3. Actually by IAmATuringMachine! · · Score: 5, Interesting

    Actually a DNA sequence is only about 3GB for a human - you're anime DVDs might take more space, at least until you compress them. Then again, DNA should be fairly trivial to compress highly. Let Z = CA, Y = TG, .....

    --
    "Computer Science is no more about computers than astronomy is about telescopes."
    -E. W. Dijkstra
    1. Re:Actually by skroz · · Score: 2

      Why stop there? You could store four base pairs per byte with the most basic of compression schemes. You could probably compress it down much, much further.
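A minimal sketch of the four-bases-per-byte idea, assuming the sequence contains only the letters A, C, G and T (real sequence files also carry ambiguity codes such as N, which this does not handle). The code assignment and function names are arbitrary choices for illustration.

```python
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(sequence):
    """Pack a string of A/C/G/T into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(sequence), 4):
        chunk = sequence[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out)


def unpack(data, length):
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])
```

The original length has to be stored alongside the packed bytes, since the last byte may be padded.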

      --
      -- Minds are like parachutes... they work best when open.
    2. Re:Actually by Quixote · · Score: 4, Funny

      Why stop there? You could store four base pairs per byte with the most basic of compression schemes. You could probably compress it down much, much further.

      But be careful with that compression thing! If you compress the DNA too much, you could end up like Minime

    3. Re:Actually by dNil · · Score: 5, Insightful

      You are correct that the human genome is "only about" 3 giga basepairs of sequence, but to only store that would be rather egocentric. There are as of Dec 3 2001 some 14396883064 bp in the GenBank, and the amount of sequence information still grows roughly in an exponential manner.


      Now, this will not hit the TB line anytime soon. The trouble starts if you are involved in genome sequencing. Then you need to store the raw data for all that sequence. Each stretch of some 450 bp of sequence is reconstructed from about 5 - 10 different fairly high resolution gel images (in the ballpark of 150 KiB per image). Also, recall that even short stretches of the sequence can be accompanied by a lot of annotating information, such as names and functions of genes, regulatory elements or pointers to articles explaining the experimental evidence for such. This multiplies the storage requirement by quite a factor - nothing a neat little linux box with a huge RAID-array cannot handle though. That's how we handle the sequencing data from Trypanosoma cruzi, by the way.

    4. Re:Actually by IAmATuringMachine! · · Score: 2, Interesting

      Good insight - I suppose I was just considering the cheap thrill at showing that it can be trivially halved, but no doubt if one is looking at base pairs alone they could probably compress it by a factor of eight. But the other poster was correct in observing that there is a plethora of other meta-information that goes along with it, such as what the various base pairs code for. Then again, if we wanted to be all GATTACA, they would probably do the simple compressed file (seemingly of a third of a gig) and the hardware would decode it and calculate the meta-information for my insurance company.

      --
      "Computer Science is no more about computers than astronomy is about telescopes."
      -E. W. Dijkstra
    5. Re:Actually by IAmATuringMachine! · · Score: 2

      I made a grammar error (you're instead of your) - it must be that my compressed DNA didn't unzip properly.

      CATcoyboynealGTTA....

      --
      "Computer Science is no more about computers than astronomy is about telescopes."
      -E. W. Dijkstra
    6. Re:Actually by jpostel · · Score: 2

      Too bad there is no moderation for '+1 Corny'

      ;)
      .

      --
      Ummm, Jon, aren't you supposed to be dead...? - Otter(3800)
  4. "I need to store DNA Sequences"??? by Anonymous Coward · · Score: 2, Informative
    What is he talking about with the DNA sequences?

    human = 3 billion base pairs
    = 6 billion bits of data
    = 7.5e8 bytes
    = 7.3e5 kilobytes
    = 715 megabytes
    < 1 gigabyte

    Sure, lots of other life forms have been sequenced too, but most of these have much smaller genomes than humans.

    So how would you need a terabyte to store DNA sequences?

    1. Re: "I need to store DNA Sequences"??? by nick255 · · Score: 2, Insightful

      OK, I'm not an expert in this area, but I think when people do research into DNA sequences they get DNA sequences from a large sample of people so they can look for statistical links between certain gene sequences and various properties. Therefore they will need reasonablesamplesize*sizeofsequence, so if you have a sample of just 1000 people you could easily be getting into terabyte land. (Then they need a highly optimised version of diff to spot the differences in the sequences!)
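The two estimates in this subthread can be combined in a few lines. This is a back-of-the-envelope sketch that assumes 2 bits per base pair, no compression and no annotation or raw sequencing data; the variable names are just for illustration.

```python
base_pairs = 3_000_000_000          # one human genome, per the parent comment
bits = base_pairs * 2               # 2 bits distinguish A, C, G, T
genome_bytes = bits // 8            # 750,000,000 bytes
genome_mib = genome_bytes / 1024**2 # about 715 (binary) megabytes

sample_size = 1000                  # the "just 1000 people" case
sample_bytes = sample_size * genome_bytes
sample_tib = sample_bytes / 1024**4 # about 0.68 (binary) terabytes
```

Under these assumptions a 1000-person sample lands a little under a terabyte for the bare sequences alone, so anything stored beyond the bases themselves would push it over.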

  5. Need for memory/storage by morie · · Score: 5, Funny
    I guess the simple truth is that now that 100 gig drives are a couple hundred bucks, we now have the ability to store anything we reasonably could need (unless you define "Reasonable" as "I need to store DNA Sequences"). slashdot

    Nobody should ever have need for more than 640 kB of RAM Bill Gates

    Similarities anyone?

    --
    Sig (appended to the end of comments I post, 54 chars)
    1. Re:Need for memory/storage by Chelloveck · · Score: 3, Funny

      Personally, I won't be satisfied until I have enough storage to catalog the quantum state of every particle in the universe.

      --
      Chelloveck
      I give up on debugging. From now on, SIGSEGV is a feature.
    2. Re:Need for memory/storage by Jenming · · Score: 4, Funny

      I store my DNA sequence. I actually have lots of copies in case I lose some or it gets corrupted.

      --
      Morpheus, God of Dreams.
    3. Re:Need for memory/storage by Amazing+Quantum+Man · · Score: 2

      Are you certain about that?

      --
      Fascism starts when the efficiency of the government becomes more important than the rights of the people.
    4. Re:Need for memory/storage by SmittyTheBold · · Score: 2

      I keep many on-site, but I prefer my off-site partial backups. I don't have a real schedule yet, but ideally I'd make a back-up at least daily.

      The backup medium isn't as reliable/error-proof as one might hope, but it's all I have for now...

      --
      ± 29 dB
    5. Re:Need for memory/storage by omnirealm · · Score: 3, Informative

      From a Huntsville Times (Alabama) interview with Bill Gates:

      QUESTION: "I read in a newspaper that in 1981 you said '640K of memory should be enough for anybody.' What did you mean when you said this?"

      ANSWER: "I've said some stupid things and some wrong things, but not that. No one involved in computers would ever say that a certain amount of memory is enough for all time."

      --
      An unjust law is no law at all. - St. Augustine
    6. Re:Need for memory/storage by Alsee · · Score: 2

      Including the ones in your storage?

      Of course. The best part is that that data doesn't need error checking. It is always correct, even if it gets corrupted.

      -

      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
  6. Cost Per MB by JohnHegarty · · Score: 3, Informative

    1 Terabyte = 1024GB = 1048576 MB

    $5,000 / 1048576 is a price of $0.0047 a MB.
    Or another way: $4.88 for a GB.

    Now who remembers when hard disks were more than $10 a MB?

    1. Re:Cost Per MB by Tim+C · · Score: 2

      Not quite - most (if not all) hard drive manufacturers define megabyte as 1000*1000 bytes, gigabyte as 1000megabytes, etc.
      (See, for example, the note on http://maxtor.com/products/diamondmax/diamondmaxplus/QuickSpecs/42093.htm stating that "1GB = 1 billion bytes")

      Therefore, in *this* case, 1TB = 1000GB = 1000000MB, which puts the price up a little (although not much, I'll admit :-) )
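Both conventions from this subthread can be checked side by side. A small sketch, assuming the $5,000 headline price and exactly one terabyte of capacity under each definition; the helper name `price_per` is made up for the example.

```python
PRICE = 5000.0  # dollars, from the story headline


def price_per(unit_count):
    """Dollars per unit when the array holds unit_count units."""
    return PRICE / unit_count


binary_per_mb = price_per(1024 * 1024)   # 1 TB = 1,048,576 MB: ~$0.0048
binary_per_gb = price_per(1024)          # 1 TB = 1024 GB: ~$4.88
decimal_per_mb = price_per(1000 * 1000)  # 1 TB = 1,000,000 MB: $0.005
decimal_per_gb = price_per(1000)         # 1 TB = 1000 GB: $5.00
```

The gap between the two is about 2.4% per GB and about 4.9% per MB, which matches the "a little, although not much" remark.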

      Cheers,

      Tim

    2. Re:Cost Per MB by JumboMessiah · · Score: 2, Insightful

      Yup, except that to a hard drive manufacturer:

      1 Terabyte = 1000GB = 1000000 MB

      Their marketdroids have a bad habit of rounding the values down and evening them off. This allows them to post bigger numbers on the actual size of the drive since dividing by 1000000 instead of 1048576 yields a larger end result.

    3. Re:Cost Per MB by Rupert · · Score: 2

      Shouldn't that be tebibyte?

      --

      --
      E_NOSIG
    4. Re:Cost Per MB by athakur999 · · Score: 2

      I don't think it's marketdroids messing things up, it's the techies. SI defines "mega" to be 10^6. Somewhere along the line, techies decided "mega" meant 2^20.

      --
      "People that quote themselves in their signatures bother me" - athakur999
  7. Nothing special. by Night0wl · · Score: 2, Interesting

    A terabyte isn't anything special. But it's cool to see someone doing it. I was bored once one night. For a mere 36K you could, assuming you already own a Thunder K7 w/ the on-board SCSI plus needed components, put together yourself some really big storage. Using those 181GB Seagate SCSI drives.

    U160 and all of it churning at 10,000RPM. For a grand total of a few GB short of 5.5 Terabytes.

    But assuming you can afford thirty $1200 drives you should be able to spring for a nice U160 SCSI RAID Card with an external connector ;p

    I couldn't even find a case with enough room for 30 hd's.... and I don't want to even think about cooling.

    But I won't have to worry about that. I can't even afford a 9GB SCSI drive at this point.

    --
    Computational Madness in a round package.
    1. Re:Nothing special. by millwood · · Score: 2, Interesting
      HP virtual disk arrays

      Heard a rumor that they may be considering support for IDE in something like this.

      --

      "Hello, World", 17 errors, 31 warnings
    2. Re:Nothing special. by loraksus · · Score: 2

      Cooling? Bah, it's winter, put a fan and two box filters on either side of it and replace the furnace :)

      --
      1q2w3e4r5t6y7u8i9o0pqawsedrftgthyjukilo;p'azsxdcfv gbhnjmk,l.;/
  8. 3Ware Escalade IDE-RAID by rdl · · Score: 5, Interesting

    I've been using these for a long time (6200 dual-port in hardware-mirror, up to the 8-port cards for large disk configs), and they're very fast and reliable. Cheap, too.

    $500 for an 8-port 64-bit RAID controller, looking to the host like a single scsi device per logical volume, seems like the best deal available. Along with a motherboard with sufficient slots for gig-e and these cards (easy to get 4 64-bit slots...maybe you can get more with 3-4 buses), and a 4U rackmount case with 16 drive bays, and you can have 4U of rackmount storage for $5k, too.

    I've been using setups like this for clients, as well as for private file storage (divx, mp3, backups, etc.), and know of people using them for USENET news servers (one of the most demanding unix apps for reasonably priced hardware).

    It goes without saying you want a journaled file system or softupdates when you have disks this size, and ideally keep them mounted read-only, and divided into smaller partitions, whenever possible. e2fsck on a 300GB partition with hundreds of open files is painful.

  9. Great! Where's the backup solution? by danimal · · Score: 5, Insightful
    I would rather spend the money on good disk storage with an integrated or integral back-up solution. Why? Well, as cool as all that storage is, what happens when it goes *poof* and you can't get it back? You're screwed.

    Yes, this is a groovy/geeky/cool solution for under your desk, but at least spend the extra dollars for a SCSI card and tape backup unit. You could fit the whole thing on a few DLT's. You can also keep incremental backups to keep the tape swapping to a minimum.

  10. A much better article, also pointed to by /. by Thagg · · Score: 5, Funny

    Check out this article referenced by slashdot on July 20 2001.

    The nice thing about this article is that the people building it at SDSC really took extreme care in getting quality components that would work together to build a reliable, solid system, and still didn't spend more than $5K for a terabyte file server. In particular, the tradeoff of disk speed vs. power consumption was extremely insightful.

    I built one of these to their spec for my company, and I couldn't be happier. It's worked flawlessly since then. It's not clear if the Escalade boards are still available -- 3ware had said that they were discontinuing them, but they still appear to be for sale.

    thad

    --
    I love Mondays. On a Monday, anything is possible.
    1. Re:A much better article, also pointed to by /. by captaineo · · Score: 2

      Escalade originally planned to discontinue their IDE controllers, but due to public demand they decided to continue production...

    2. Re:A much better article, also pointed to by /. by felicity · · Score: 2, Informative
      It can actually be a bit cheaper:

      Promise FastTrak 100TX2 * 4 $500
      Maxtor DiamondMax 160GB Drive * 8 $3000
      Maxtor DiamondMax 20GB Drive $80

      You can get an Escalade 7850 for $550 or less, which is a single 64-bit card instead of the 4x Promise controllers. I don't know why there's a 20GB drive in there, maybe a boot drive? At $3k for 8 160GB drives, that's $375 each. Looking quickly at pricewatch, you can get the same Maxtor 160GB drives (5400RPM -- yuck!) for around $260 each. 8*160*(1000/1024) = 1250GB (actual GB) = 1.22 TB for a total of 550+8*260 = 2630 instead of 3580. Plus you have 3 PCI slots more than you had before.
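The totals above are easy to re-derive. A sketch using only the prices quoted in this comment; it reports raw capacity (before any RAID 5 parity overhead) and applies the full decimal-to-binary conversion, which gives a slightly lower figure than the single 1000/1024 factor used above.

```python
DRIVES = 8
DRIVE_GB = 160                        # decimal gigabytes, as drive makers count

promise_build = 500 + 3000 + 80       # 4 controllers + 8 drives + the 20GB drive
escalade_build = 550 + DRIVES * 260   # one Escalade 7850 + 8 drives at $260 each
savings = promise_build - escalade_build

capacity_bytes = DRIVES * DRIVE_GB * 10**9
capacity_gib = capacity_bytes / 2**30  # about 1192 binary gigabytes
capacity_tib = capacity_bytes / 2**40  # about 1.16 binary terabytes
```

Either way the array clears a decimal terabyte with room to spare, and the cheaper build saves $950.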

    3. Re:A much better article, also pointed to by /. by jandrese · · Score: 2
      Be careful with those Promise controllers. Promise only supports 1 controller in a system. With the (not Fasttrak) 100TX2s, I can max the system out with only two controllers and 8 drives.

      BTW 5400 RPM has several side benefits for a design like this.
      1. You're likely saturating the PCI bus anyway, so anything faster is likely wasted
      2. 5400RPM drives draw less power (unless you're comparing them to 4800 RPM drives) than most other drives, alleviating strain on your power supply.
      3. 5400RPM drives generate less heat, and are easier to keep cool.
      4. Because they are running at lower speeds, lower performance drives tend to last longer than their equivalent faster drives (the 7200 or 10000RPM equivalent Maxtors), although this is highly dependent on the particular drive.

      Remember when you're planning on exceeding the design specifications of your system to account for all of the side effects, or you are likely to end up with fried power supplies and overheating prematurely dying drives.
      --

      I read the internet for the articles.
  11. Re:RAID by Dman33 · · Score: 2

    Not to be rude, but if you RTFA you would find out in the third paragraph.

    But to answer your question, it looks like an IDE Software RAID5.

  12. It's just IDE Raid 5 folks.... Move along. by RobL3 · · Score: 2, Informative

    I hate to rain on everyone's parade (I really do). But this is just a typical IDE raid 5 setup with bigger disks. Not exactly slashdot worthy IMHO. If you're thinking about doing something like this, Raid Level 5 is not a bad choice if you don't need redundancy. For more raid info check out:
    http://www.acnc.com/04_01_00.html

    1. Re:It's just IDE Raid 5 folks.... Move along. by sluggie · · Score: 2

      maybe it's your definition of redundancy, but if one drive fails in a RAID5 array nothing breaks. Isn't that some kind of redundancy?

    2. Re:It's just IDE Raid 5 folks.... Move along. by Junta · · Score: 3, Informative

      What do you mean, "if you don't need redundancy"? The only RAID level that doesn't offer redundancy is RAID-0. RAID-5 can tolerate single disk failures, and if you do multiple levels of RAID-5 you can tolerate more failures (depending on how you configure it). The common configuration of RAID-5 with available hot-spares is quite sufficient in all but the most critical configurations, especially if it is a system that is closely monitored. Sure, you can build RAID-1 arrays of N drives where you can tolerate up to N-1 drive failures without problems, but for one, space is used a lot less efficiently, and for another, write performance decreases for every extra level of redundancy you add. That is overkill for most situations: the chance that multiple drives will fail simultaneously (or within a few hours of each other) is remote compared to the probability of a single drive failure.
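How RAID-5 survives a single disk failure can be shown with plain XOR parity. This is a toy sketch on one stripe of in-memory byte strings, assuming three data blocks plus one parity block; real implementations rotate parity across drives and work on much larger blocks. The helper name `xor_blocks` and the sample data are made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"disk", b"one!", b"two!"]   # three data blocks in one stripe
parity = xor_blocks(data)            # stored on a fourth drive

# Simulate losing the second drive and rebuilding it from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, combining everything that survived yields the missing block; losing two blocks from the same stripe leaves nothing to solve for, which is why a second failure before the rebuild finishes is fatal.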

      --
      XML is like violence. If it doesn't solve the problem, use more.
    3. Re:It's just IDE Raid 5 folks.... Move along. by Junta · · Score: 2

      Not to nitpick, but the parity computations only impact writes, not reads, on reads RAID-5 is essentially a RAID-0. A RAID-1 of RAID-0 does provide stellar read performance, even compared with RAID-5, and, admittedly, the write performance would also be better (same amount of data being written, but no computation or additional reads required, whereas with RAID 5 you have to both read and calculate before the write can even begin). Still, on my 100-base network connection, the system could do a lot and I would never notice...
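
      The "read and calculate before the write" step can be sketched as follows. This is an illustrative Python model of a single-block RAID-5 update under the usual XOR-parity assumption, not any particular controller's implementation; the block values are invented.

```python
def xor(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def parity_after_small_write(old_data, old_parity, new_data):
    """New parity after overwriting one data block.

    The old data and old parity must be read first (two reads),
    then the new data and new parity are written (two writes).
    """
    return xor(xor(old_parity, old_data), new_data)

# Hypothetical stripe with three data blocks.
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\x0f\xf0"
parity = xor(xor(d0, d1), d2)

# Overwrite d1 and update parity without touching d0 or d2.
new_d1 = b"\xaa\xbb"
updated = parity_after_small_write(d1, parity, new_d1)
print(updated == xor(xor(d0, new_d1), d2))
```

      A mirror of stripes just writes the new block twice, with no reads and no XOR, which is the write advantage conceded above.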

      --
      XML is like violence. If it doesn't solve the problem, use more.
    4. Re:It's just IDE Raid 5 folks.... Move along. by foobar104 · · Score: 2

      Not to nitpick, but the parity computations only impact writes, not reads, on reads RAID-5 is essentially a RAID-0.

      Depends on the RAID controller. Some (although, damn it, I can't remember their names) do parity on the read and on the write for an extra level of error checking. The idea is to catch bad data before it gets to the OS.

  13. How to use the disk space by MosesJones · · Score: 4, Funny


    1) "Compress" at a higher rate than the CD uses (I've seen this)

    2) Use POV Ray to render Lord of the Rings for the cinema

    3) Keep every src and every .o from every build you do

    4) Set the Linux swap space to be "500Gb" because you've upgraded the Kernel to the new VM stuff and it looks cool

    5) Install Windows XP+ in two years time, with Office XP+.

    Imagine that "Minimum Reqs: 1TB of available disk space"

    It will happen

    --
    An Eye for an Eye will make the whole world blind - Gandhi
    1. Re:How to use the disk space by MosesJones · · Score: 2


      1) It isn't compression, it's stupidity.

      2) Just because POV is slow doesn't stop a 3 hour movie at Cinema res using multiple terabytes of disk space. Anyway, Beowulf cluster POV for speed :-)

      --
      An Eye for an Eye will make the whole world blind - Gandhi
    2. Re:How to use the disk space by WNight · · Score: 2

      How much do you get paid to troll? Must be a good bit, or you must be really bored.

      I assume you've never installed Linux because then you'd know that of the 4-8 disks you get usually only two are asked for during install and even then the full install including apps and games comes to under a GB.

      Contrast that to Win2k. Sure, the OS is on one disk (and only takes 350MB or so) but Visual Studio is two disks, the MSDN libraries are on three disks (at least, probably more by now) and Office 2k is another disk (and takes about as much space as the OS.)

      When I've got a computer fully installed for use at work it takes about 4.5GB.

      This may be overkill for 99% of people, but then the extra six or seven disks of Linux are too. (And they contain similar stuff, extra utils, office suites, docs, source, etc.)

  14. Redundancy? by evilviper · · Score: 2, Interesting

    I'm sure some poor fool will do something like this, fill it up with data, then have ONE hard drive go bad, making everything practically useless.

    What we need isn't larger hard drive storage (not that it's a bad thing); we need more speed, and a cheap, gigantic & ultrafast tape backup system to back up all the data. Some PC designs that use better cooling methods would be very nice as well.

    --
    Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    1. Re:Redundancy? by sluggie · · Score: 2

      Read the article. It says "RAID5". Do you know what RAID5 is?

    2. Re:Redundancy? by GigsVT · · Score: 3, Insightful

      In case you didn't notice, it's RAID5. One hard disk could go bad with no issues other than slowdown.

      They could also do what we did with our IDE TB. We used three RAID5s in hardware, each with hot swap. In theory, if they failed just right, we could lose up to 6 drives without losing any data.

      The three RAID5s are hardware RAID0ed together. The worst case scenario is a simultaneous failure of two drives on the same array. But we saved so much money using IDE that we just built two complete systems for less than the price of SCSI. So really, we would have to hit the worst case scenario twice at nearly the same time to have a total loss. It gets less and less likely.
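
      A small Python sketch of that failure model, assuming three RAID5 sets striped together and ignoring hot-spare rebuilds (so it only judges failures that are outstanding at the same moment). The function name is hypothetical.

```python
def stripe_of_raid5_survives(failed_per_array):
    """True if a RAID0 stripe over RAID5 sets still has all its data.

    failed_per_array lists how many member drives are currently dead
    (not yet rebuilt) in each RAID5 set. Each set tolerates one dead
    drive; because the sets are striped, losing any one set loses it all.
    """
    return all(failed <= 1 for failed in failed_per_array)

print(stripe_of_raid5_survives([1, 1, 1]))  # one dead drive in each set
print(stripe_of_raid5_survives([2, 0, 0]))  # two dead drives in the same set
```

      How the count of six is reached is not spelled out above; one reading is one failure per set, a rebuild onto the hot spare, and then one more failure per set.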

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
  15. Done before by IceFox · · Score: 2

    In fact, I remember reading somewhere about a year ago on the linux terminal page about how they put a TB server together for right around 4K. I can't find the link, but if someone does, please post. But grabbing the third largest drive (100GB) out there will save you a bundle, and you still only need 10.

    --
    Do you changes clothes while making the "chee-chee-cha-cha-choh" transformation sound?
  16. Re:Great! Where's the backup solution? by drsoran · · Score: 2, Insightful

    That reminds me, I don't know where the hell the tape manufacturers think they're marketing to, but with 80 GB hard drives common now, it's rare to find a tape backup solution that is affordable for a consumer that can handle that much. By affordable I mean drives around $250 and tapes under $10/piece for at least 50GB of storage. I've seen some of the proprietary drives but the tapes cost almost as much as the drive! 5 or 6 years ago the backup drives available to consumers could handle backing up the entire average hard drive of the time onto a $15 tape (Travan), but now people are probably just doing without backups which is a disaster waiting to happen.

  17. 64 bit address space by QuantumG · · Score: 2

    pfft, these days people are demanding a terabyte of RAM.

    --
    How we know is more important than what we know.
  18. Re:Great! Where's the backup solution? by Greyfox · · Score: 2

    How about another terabyte array and rdiff? While Joe Average User probably isn't going to be able to afford to do that, he's probably not going to want to build the first one either. If you're a small to medium size company, it'd probably be worth considering. I think by the time you start talking this price tag, you'd be considering some of the mainframe storage companies for DASD and backup though. IBM's 2105 "Shark" machine will go larger than 11TB now, IIRC, and I'm sure the other "big iron" shops have similar solutions.

    --

    I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

  19. 2TB for $8300 by GigsVT · · Score: 5, Interesting

    Inspired by Slashdot's earlier story that was nearly identical, and with the help of Peter Ashford from ACCS, we built two servers, both with capacities well over a TB, for around $8000 each. They have the capacity to expand to 3TB if need be.

    Story here

    As far as performance:
    (from my memory)
    EXT3: About 16MB/Sec block write, 45MB/sec block read
    ReiserFS: About 20MB/sec block write, 130MB/Sec block read (that's no typo).
    XFS: About 30MB/sec block write, 85MB/sec block read.

    It seems that file system plays a large role in performance. The arrays are three RAID5 in hardware using Linux software RAID0 on top of the RAID5 arrays to tie them together.

    IDE RAID controllers are 3ware Escalade 7810. Write performance can be greatly increased by using 7850 cards that have more cache.

    We stuck with XFS. Reiserfs had a bigfile bug: creating files over 2GB would basically lock up the computer. XFS in general seemed much more mature; reiserfs seems more like someone's college thesis project that they never cleaned up to be production grade.

    We experimented with different RAID0 stripe sizes. The hardware RAID5 stripe size is fixed at 64k, and there are 7 active disks in each array and one hot spare. Stripe size tweaking seemed to mostly trade off read for write speed within a certain range of values, with performance tapering off at either extreme (down around 8k stripes, or over 1024k stripes).

    We eventually went with 1024k stripes. That is what the benchmarks above reflect. The variance in file system performance could very well be due to interactions with stripe size, but there seemed to be common themes (reiser always read fastest no matter what stripe, XFS was always better at writes)
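
    For anyone wondering what the RAID0 stripe size actually controls, here is a simplified Python sketch of how a byte offset on the striped volume maps onto the three underlying RAID5 arrays with 1024k stripes. It assumes a plain round-robin layout over equal-size members; the real Linux software RAID0 layout may differ in detail, and the names are made up.

```python
STRIPE_BYTES = 1024 * 1024   # 1024k stripe, as chosen above
ARRAYS = 3                   # three hardware RAID5 sets

def locate(offset, stripe=STRIPE_BYTES, arrays=ARRAYS):
    """Map a volume byte offset to (array index, byte offset inside that array)."""
    chunk, within = divmod(offset, stripe)   # which stripe-sized chunk, and where in it
    row, array = divmod(chunk, arrays)       # chunks are dealt round-robin to the arrays
    return array, row * stripe + within

print(locate(0))                      # first chunk lands on array 0
print(locate(STRIPE_BYTES))           # second chunk lands on array 1
print(locate(3 * STRIPE_BYTES + 5))   # fourth chunk wraps back to array 0
```

    With a big stripe, a small request stays on one array; a large sequential read spans all three, which is where the trade-off between read and write speed comes from.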

    I have been in so many arguments with SCSI zealots on here over this RAID... I wish people would understand what price/performance ratio means. IDE isn't a superior technology, but every now and then, it is the right tool for the job, when price is a goal too.

    --
    I've had enough abrasive sigs. Kittens are cute and fuzzy.
  20. Uhm, redundancy in posting? by VWswing · · Score: 3, Interesting

    Is this any more special than the last time
    slashdot announced an amazing terabyte array? Here

    Seriously though.. People's numbers are pretty far off. This can be done for about $3000.. Pricewatch
    has 160 gig drives for $259.. 10 of these would give you over 1 terabyte in usable space in raid 1.. Or if you just cared about write performance, 6 of them for $1554 would give you a terabyte of usable storage.. another $600 to throw together a cheap pc and cheap ide raid cards.. you get it for under $2500.. big deal.

    Lately I'm realizing how awful IDE really is.. I finally got around to throwing 2 36 gig ultra 160 drives on my box with an adaptec scsi card, running ext3 on top of a raid mirror.. more space than I need (I just keep all my mp3s on an IDE raid.. since my dragon motherboard has ide raid built in).. Since I've gone to scsi life has been happy. I can do things while compiling, while vacuuming my db, etc..

    Funny how mac used scsi before the rest of us, huh?

    --
    "And how can this be? For he is the ..."
    1. Re:Uhm, redundancy in posting? by Junta · · Score: 2

      And now Mac uses IDE... I use IDE too, and it is pretty clear-cut to me that while SCSI drives offer better performance, IDE drives offer better bang for the buck. Recent IDE drives are nearly indistinguishable (to me) from SCSI drives in terms of desktop application performance. Now, when you are building a RAID array with, say, a high-load database on top of it, then yes, SCSI will be worth it. On the other hand, if it is a single user workstation, I don't really see much of a reason to go to SCSI. In fact, I'm currently building a network file server with RAID that will have no more than 3 or 4 concurrent users over a 100-MBit connection, and I see no reason to use SCSI RAID over software RAID on IDE, since SCSI increases the cost dramatically and offers little perceivable benefit...

      And as far as this story is concerned, the array I plan to build happens to cost about 1100 dollars and provides 480 gigs of storage (160 more is redundancy, so if redundancy wasn't an issue, it would be 640 gigs, and as we all know, "640 Gigs ought to be enough for anyone"), so a $5k terabyte seems a bit steep when you think about it.
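
      A quick Python check of those numbers. It assumes the planned array is four 160GB drives in RAID-5 (which is what 480 usable plus 160 of redundancy implies) and takes the headline's round $5k for 1000 gigs at face value.

```python
def raid5_usable_gb(drives, drive_gb):
    """Usable space of a RAID-5 set: one drive's worth goes to parity."""
    if drives < 3:
        raise ValueError("RAID-5 needs at least three drives")
    return (drives - 1) * drive_gb

usable = raid5_usable_gb(4, 160)        # assumed layout: 4 x 160GB
planned_cost_per_gb = 1100 / usable     # the $1100 build described above
headline_cost_per_gb = 5000 / 1000      # "$5k terabyte" from the story title

print(usable, round(planned_cost_per_gb, 2), headline_cost_per_gb)
```

      On these assumptions the smaller array comes out at well under half the cost per gig.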

      --
      XML is like violence. If it doesn't solve the problem, use more.
  21. YKYHBRSFTLW by Graabein · · Score: 2, Funny
    You know you have been reading /. for too long when...

    The first thing that runs through your mind when you see the above headline is: "Wow, imagine a Beowulf cluster..."

    Argh.

    --
    And remember kids: Never trust a computer you can actually lift.
  22. 4 Controllers for $500? by sluggie · · Score: 2

    Why not snap in a Promise SX6000 for like $250?

    This neat piece DOES hardware RAID5, so you don't need a fast cpu&mobo, less RAM, and since it can only manage up to 6 drives you can even have 2 as pseudo hot spare...

    The only drawback is that it can store "only" 800GB, which is still nice at this even cheaper price...

  23. Why use expensive online storage? by JoeShmoe · · Score: 3, Interesting

    Aren't these types of systems more for archiving massive amounts of data than actively working on it? I mean, how much data can a computer actively process anyway? Wouldn't a 100GB drive meet just about any processing demands (genome tracking, video editing, etc)?

    Why not use slower but MUCH cheaper offline storage? I really like the design goal of

    http://www.dvdchanger.com/

    You can easily get 1TB of storage with such a device for less than $1000. True, only one person can access it at a time but that is only because PowerFile wants to charge more for so-called "networked version".

    In theory, if someone could figure out how to build one of these things, you could throw in two or three CD/DVD drives for accessing and a 20GB hard drive to buffer images. Boom. Now you have the perfect storage backbone for a house-wide media center. I just wish Linksys or someone would throw a linux thinserver on top of the PowerFile hardware and get me something cheap and network-ready.

    - JoeShmoe

    --
    -- I wonder which will go down in history as the bigger failure: the War on Drugs or the War on Filesharing
    1. Re:Why use expensive online storage? by tsangc · · Score: 2, Interesting
      Wouldn't a 100GB drive meet just about any processing demands (genome tracking, video editing, etc)?

      No. Using a very low video data rate (ie, DV25), you're looking at 3.5MByte/sec. That's only eight hours of video. No one captures only what they're going to use, since that's the whole point of editing--you take all your material, capture some of it, and cut that selection down to a final product. So if you're making a 1 hour production, you might very well have 10-20 hours of footage if not more. Of course, you'll selectively digitize with a batch capture deck, but still...

      And of course, editors might have multiple projects running simultaneously. And most secondary media devices are too slow to restore material to your primary disks.

      DV25 is only suitable for consumers, event videography (weddings etc) and industrial work. Go to DV50 and that datastream is doubled. DV50 (ie, DVCPRO50, DigitalS) is the minimum level for broadcast work.

      Now, raise that for HD editing requirements. Sony's HDCAM, a highly compressed solution, runs at 140MBit/sec, uncompressed HDTV is above 996MBit/sec (and that's downsampled and cropped).

      So no, 100GByte is really only scratching the surface of what video editing requires.
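
      The arithmetic behind those figures, as a short Python sketch. It uses the data rates quoted above, treats 1 GB as 1000 MB, and assumes a constant rate with no audio or filesystem overhead; the function name is invented.

```python
def hours_of_video(capacity_gb, megabytes_per_second):
    """Hours of footage that fit in capacity_gb at a constant data rate."""
    seconds = capacity_gb * 1000 / megabytes_per_second
    return seconds / 3600

RATES_MB_PER_S = {
    "DV25": 3.5,                   # quoted above
    "DV50": 7.0,                   # "doubled"
    "HDCAM": 140 / 8,              # 140 MBit/sec
    "uncompressed HDTV": 996 / 8,  # 996 MBit/sec
}

for name, rate in RATES_MB_PER_S.items():
    print(f"{name}: {hours_of_video(100, rate):.1f} hours on a 100 GB drive")
```

      DV25 comes out at just under eight hours, and uncompressed HD at well under one.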

      Calum

  24. Re:Great! Where's the backup solution? by Lumpy · · Score: 2

    If you're going to spend $4K on a DLT drive, spend $8K on a DLT tape library that holds 10 DLT's plus 1 cleaning tape and forget about it. Sure it's only 700Gig of backup but you can always compress.. Otherwise upgrade to a 20 DLT tape library box and call it done.

    --
    Do not look at laser with remaining good eye.
  25. Re:Nothing special - bullsh*t by chill · · Score: 2

    1. SCSI can handle between 15-30 devices on one good controller, *not* including support for multi-disk changers via LUNs. Most IDE can handle 2-4.

    2. SCSI drives don't turn into slugs when you access more than one at a time. IDE does. Want to see it REALLY screw up, access an ATAPI CD-ROM slave the same time as a HD Master on the same controller.

    --
    Learning HOW to think is more important than learning WHAT to think.
  26. Why not firewire? by weave · · Score: 3, Insightful
    Maxtor now has a 160 gig external firewire drive. You can chain 62 of these puppies. Screw terabyte, think petabyte.

    I figure this is the easiest way to add as you grow without having to break open the case and try to figure out how to add another damn drive in there. For backup, just have two systems with identical capacities and rsync between the two nightly.

    RAID is nice, but for home use, it's not as nice as a nightly mirror. Why? I've seen RAID controllers fail and take out an entire RAID set. RAID also doesn't deal with the "Holy shit, I just accidentally typed `rm * ~` instead of `rm *~`" problem.
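
    For anyone who hasn't been bitten by that one yet, here is a rough Python model of why the stray space hurts. It is a simplification: it only mimics how the shell splits the command into words and matches each word against a made-up directory listing, and it ignores that a real shell would expand a bare `~` to your home directory.

```python
import fnmatch
import shlex

def files_removed(command, directory_listing):
    """Names a simplified `rm` would delete for the given command line."""
    patterns = shlex.split(command)[1:]   # drop the leading "rm"
    doomed = set()
    for pattern in patterns:
        for name in directory_listing:
            # Like the shell, a wildcard does not match dotfiles unless the pattern starts with "."
            hidden_ok = pattern.startswith(".") or not name.startswith(".")
            if hidden_ok and fnmatch.fnmatchcase(name, pattern):
                doomed.add(name)
    return sorted(doomed)

# Hypothetical directory: two files and their editor backup copies.
listing = ["notes.txt", "notes.txt~", "thesis.tex", "thesis.tex~"]

print(files_removed("rm *~", listing))    # only the backup copies
print(files_removed("rm * ~", listing))   # "*" is now its own word: everything
```

    A RAID set will faithfully record the second result on every disk; a mirror made the night before will not.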

    1. Re:Why not firewire? by Ixohoxi · · Score: 2, Informative
      160 GB * 62 = 9920 GB = approx 9.9 TB
      9.9 TB = approx 0.01 PetaByte

      Don't hold your breath thinking about petabytes.

      Also, RAID isn't for people who make stupid mistakes. Sorry about your 'rm' debacle.

      --
      What's a second? An hour? A day?
      It has much more to do with
      the Earth's rotation than with cesium.
    2. Re:Why not firewire? by weave · · Score: 2
      Don't hold your breath thinking about petabytes.

      Thanks for correcting my stupid math error. I have no idea why that warrants a Flamebait rating. :-(

      Also, RAID isn't for people who make stupid mistakes. Sorry about your 'rm' debacle.

      I know. Bingo. That's why one still has to have a decent backup system in any environment. Users (even administrators) make stupid mistakes.

    3. Re:Why not firewire? by weave · · Score: 2
      2: With their setup, you would probably have greater aggregate bandwidth. FireWire: 400Mb/s (bits), and we are talking about potentially reading and writing at about 60MB/s (a bit over 400Mb/s, eh?), but to really use it as a NAS, you would have to have either gigabit enet, or 5 or so 100Mb enet ports to that server, and a nice managed switch to hook it all to.

      This idea is for a home, not corporate environment. You'd only have a handful of clients, and probably never more than one reading from the mess at once. Also, it's mainly for a digital data store for movies and mp3s and pr0n. That data is only accessed mainly one at a time. Can it despool fast enough to drive your vid player?

      As for backup, I was talking about rsync, not software mirroring. Adding data to this jungle would not be done at high volumes. Move a DVD or two during the day and download your 3GB newsgroup limit and you're talking about 10 gigs to go to the primary unit and then 10 gigs over to the backup unit overnight or something.

  27. Sorry, I just couldn't resist... by jonr · · Score: 2, Funny

    1 Promise 6 channel PCI ide raid controller, 99$US.
    12 Maxtor 160gb ata133, 270$ each
    1920gb of Pr0n and other goodies, priceless!

  28. Re:Great! Where's the backup solution? by Paul+Johnson · · Score: 5, Insightful
    Absolutely. And to those who say "Just build another one" / "RAID doesn't need backup", I have only one thing to say:

    FIRE!

    Any serious data store needs to include a backup system which allows for copies off-site. Fire is the obvious risk of course, but floods, vandalism and lightning strikes are all possibilities.

    AFAIK the only generally available tape backup for something this big is DLT, which IIRC can now do around 40GB per tape before compression. With the 2:1 compression usually quoted that's 80GB per tape, or around 13-14 tapes for a full backup. So you really need about 30 tapes for a double cycle, and maybe more if lots of the data is non-compressible (like movies). But this stuff ain't cheap. DLT drives start at around £1000 and the tapes cost £55 each. So that's around £2500 = $4200 to back this beastie up.

    Having said that, the possibility of using hot-swappable IDE drives as backup devices is intriguing. Just point your backup program at /dev/hdx3 or whatever. One big advantage is that if your tape drive gets cooked in the server-room fire you don't have the risk of tapes that can only be read on the drive that wrote them. A Seagate 5400RPM 60GB drive costs £110, which is only a third more per megabyte than a bare DLT tape. Two cycles-worth of backup (34 drives) would be £3,700. And you can probably do better by shopping around. For servers with only a few hundred GB on line this might well be more cost-effective than buying a DLT drive.
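
    The sums above can be reproduced with a few lines of Python. This is a rough sketch that assumes exactly 1000GB of data and two full cycles, rounds each cycle up to whole tapes or drives, and uses only the prices quoted in this post; the helper name is made up.

```python
import math

def units_for_backup(data_gb, gb_per_unit, cycles=2):
    """Media count for `cycles` full backups, each rounded up to whole units."""
    return math.ceil(data_gb / gb_per_unit) * cycles

DATA_GB = 1000

# DLT: 80GB per tape with 2:1 compression, £55 per tape, £1000 for the drive.
tapes = units_for_backup(DATA_GB, 80)
dlt_total = 1000 + tapes * 55

# IDE: 60GB per drive (uncompressed), £110 per drive, no separate mechanism to buy.
drives = units_for_backup(DATA_GB, 60)
ide_total = drives * 110

# Cost per GB of a drive relative to a bare (uncompressed, 40GB) tape.
price_ratio = (110 / 60) / (55 / 40)

print(tapes, dlt_total, drives, ide_total, round(price_ratio, 2))
```

    The exact tape count depends on how much over a terabyte the array really is, which is why the post rounds up to about 30.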

    We use Amanda to do backups here. It's a useful program, but it can't back up a partition bigger than a tape. So you need to think carefully about your partition strategy. (Side note: you can use tar rather than dump to break up over-large partitions, but it's still a pain).

    Suddenly that terabyte starts looking a bit more expensive.

    Paul.

    --
    You are lost in a twisty maze of little standards, all different.
  29. Re:Great! Where's the backup solution? by ViceClown · · Score: 2

    Amen to that. I was looking for a backup solution for my 60 gig server a few weeks ago. Know what the most cost effective solution turned out to be??? Another damn harddrive!

    --
    Have a Happy.
  30. Oops! by Paul+Johnson · · Score: 4, Interesting
    Sorry, I just noticed a thinko in the discussion of IDE drive costs. The DLT costing assumed 2:1 compression. The disk cost didn't. Assuming compression we can squash 120GB onto a 60GB drive, requiring only 9 drives for a full backup, and 20 drives overall (a couple of spares is always a good idea). That's £2200 for IDE backup, which is actually cheaper than the DLT solution.

    Does anyone out there actually use IDE drives like this? It seems a pretty obvious thing to do.

    Paul.

    --
    You are lost in a twisty maze of little standards, all different.
    1. Re:Oops! by cuyler · · Score: 3, Informative

      If you're serious about backing up that much data, you could also use a 9840 drive, which holds 20GB uncompressed and (they say) 80GB compressed; in my experience, however, you can get 140GB onto a tape. It'll also write faster (when backing up a terabyte, having the backup take 32 hours is not a good idea). The 9840B drives write at up to 50GB/hour but usually run closer to 30-35GB/hour, while DLT drives usually write at about 5GB/hour.

      I haven't tested it out, but StorageTek has a drive called the 9940 whose tapes hold 60GB uncompressed (likely 200+GB compressed) and which writes faster (10MB/sec ~= 55GB/hour). However, the drive itself will set you back $33.5k, with the tapes being a couple hundred apiece.

      In this case, it'd probably be better just to have a second 1tb raid - then again tapes are much more stable.
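
      To put the write speeds in perspective, a tiny Python sketch using the GB/hour figures quoted above. It assumes a single drive streaming 1000GB at a steady rate, with no time for tape changes.

```python
def backup_hours(data_gb, gb_per_hour):
    """Hours needed to stream data_gb to tape at a steady rate."""
    return data_gb / gb_per_hour

for label, rate in [("9840B, best case", 50), ("9840B, typical", 32.5), ("DLT", 5)]:
    print(f"{label}: {backup_hours(1000, rate):.0f} hours for a terabyte")
```

      At the DLT rate a full backup of the array takes more than a week, which is the real argument for the faster drives or a second array.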

      -Cuyler

  31. Re:Inquiring minds by Tazzy531 · · Score: 2

    The requirements specify 4 PCI RAID controllers. Each of these could potentially handle 4 hard drives. I'm assuming that he's only putting 2 on each so that it doesn't come across the problem of accessing 2 drives on the same channel. In addition to this, there are 2 more on the motherboard, that I guess he isn't using. Secondly, these cards are bootable. So any one of them can be set to boot from and you can boot from any drive. But I don't think he is doing that because he has an additional 20 gig drive that I'm assuming is going on the motherboard. That is where the OS is going to be installed.

    Go here for the datasheet

    --


    _______________________________
    "I'm not Conceited...I'm just a realist..."
  32. Tape is better for backup... by morzel · · Score: 3, Informative
    ... for the simple reason that the mechanism (eg: DLT-drive) and datacarrier (eg: Tape) are separated. IDE disks have both in one sealed package, which makes it terribly difficult to get to your data if your stepper motor borks.
    With tapes, you just get a new drive.

    --
    Okay... I'll do the stupid things first, then you shy people follow.
    [Zappa]
    1. Re:Tape is better for backup... by realdpk · · Score: 2

      I've heard horror stories about tape heads being differently aligned, rendering any tapes written with older drives useless. But then again, I've also seen IDE drives go bad in an array such as one of these (this one is 16x100GB, adequately cooled).

      Given that, I prefer using a two-layered backup approach - disk-based as the front-end, for high speed backups and restores, and then back the disk-based solution on to tape. Btw, when doing this, I don't think compression will count at all since the backups are already compressed. Is that right?

    2. Re:Tape is better for backup... by jpostel · · Score: 2

      On the compression question:

      Some (all new) DLT drives use optional hardware compression. This is important because regardless of how much compression is achieved on disk or through software, the DLT drive might compress it more.

      The compression on hard drives is done in software since it would be done by the backup program or the OS.

      --
      Ummm, Jon, aren't you supposed to be dead...? - Otter(3800)
  33. DNA Sequence by zmokhtar · · Score: 2, Informative

    FYI, the DNA sequence isn't that big. The National Human Genome Research Institute has their 90% complete draft burned on a single CD.

    --
    Why aren't we told when editors moderate our posts?
  34. Re:RAID by Sepper · · Score: 3, Informative

    Actually, if you did read the article, you would find that the proposed system is built on ATA100 drives with software RAID5, which means that the last of the eight 160GB drives would be used for parity, and that leaves *ONLY* (7*160GB)/1024 = 1.09375TB! Now, I know that hardware RAID5 is expensive, but just think for a second: you would have a hot-swappable, secure-as-long-as-only-1-hard-drive-fails, personal, massive-and-fast storage system... A dream system :)

    --
    I live in Soviet Canuckistan you insensitive clod!
  35. Re:Great! Where's the backup solution? by jd142 · · Score: 2

    True. But one thing I haven't seen mentioned yet is the fact that most backups aren't full backups. You do a full backup maybe once a month or once a year. Every other backup is a diff only. So while the initial backup may take several tapes, the nightly backups shouldn't. At least on the type of system where the data is basically the same from day to day, which was the point of the article.

    Plus, as described in the article, where the point was to have a single hard-drive-based store for DVDs and CDs, if there was a drive failure, you could just take the original media and do the rip again. Annoying, yes, but doable. You haven't lost data unless the fire burned down your house and melted the CDs at the same time it took out your storage. That's why companies buy fire safes and use off-site storage.

  36. thousand hours of video? by peter303 · · Score: 3, Interesting

    Video is the bulkiest thing people would save. How much would people want to save for re-viewing? First you have the time-shifting stuff like TiVo/Replay, perhaps a few tens of hours at most. Then there would be your favorite movies and TV series. As video-phone improves you might be saving some hours of friends' and relatives' video conversations. With infinite storage, the constraint becomes the need and the time to view all that stuff. And you'll probably be wanting to spend your time looking at new stuff. So I'd guess most people's real needs would be hundreds to a thousand hours. At 1-2 GB per hour, you're talking about a terabyte or two.

    I don't include the argument that you'd have trouble finding old stuff. Computer software is more clever at organizing things, far better than material storage. A good recent example of this is Apple's "iPhoto", which is much more convenient for organizing thousands of photos than physical albums.

    1. Re:thousand hours of video? by gorilla · · Score: 2

      Depends a lot on what your 'favourite series' is. There are several which have been on for 30 years or more, and amassed thousands of hours of footage or more.

  37. Home File Server Appliance by rlp · · Score: 2

    This would be great for a home file server. Many new homes are being built pre-wired with CAT5 (alas not my old house). Just add a big file server in the basement. With proper wiring, it can act as an answering machine / PBX, personal video recorder, music (MP3) repository, mail server, file server, etc. With RAID, you have less worries about a drive crash wiping you out (though you'll need a disaster recovery plan - flooded basements would be real bad). I've always wanted to do this! Main stumbling block is getting CAT5 wiring from the second floor (where my computers reside) to the basement.

    --
    [Insert pithy quote here]
    1. Re:Home File Server Appliance by zitsky · · Score: 2, Interesting

      I had a similar problem when I bought a house last year. I had a converted garage that I wired for ethernet, and even ran ethernet into the basement. However, I didn't want to install ethernet jacks in the house, as it's about 100 years old, and I didn't want the hassle.

      I settled on using 802.11b wireless to communicate between the house and the office. I know all about the security problems (my address is....) but maybe the newer 802.11g or 802.11a might work for you.

      I have some workbenches in the basement that are about 4-5 feet off the floor. I'm going to install a file server and leave it on one of these benches.

      It's cold and damp down there in the winter. I don't know how well the equipment will take to the humidity. I guess I'll find out!

  38. Use "MPEG": people differ by 0.1% by peter303 · · Score: 2, Interesting

    At most, people's genomes differ by 0.1% from each other, and by much less than that if you are relatives. Therefore you'd record only the differences, sort of like several of the MPEG algorithms do.
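
    A toy Python sketch of that idea: keep one reference sequence and, per person, only the positions where they differ. It assumes equal-length sequences and handles substitutions only (no insertions or deletions); the sequences and function names are invented for illustration.

```python
def diff_against_reference(reference, sample):
    """Return {position: base} for every position where sample differs."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles equal-length sequences")
    return {i: s for i, (r, s) in enumerate(zip(reference, sample)) if r != s}

def apply_diff(reference, diff):
    """Rebuild a sample sequence from the reference plus its recorded differences."""
    bases = list(reference)
    for position, base in diff.items():
        bases[position] = base
    return "".join(bases)

reference = "GATTACAGATTACA"
person = "GATTACAGCTTACA"      # differs from the reference at one position

delta = diff_against_reference(reference, person)
print(delta)
print(apply_diff(reference, delta) == person)
```

    Storage per person then scales with the number of differences rather than with the length of the genome.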

  39. Ouch! $160GB disks! by jandrese · · Score: 5, Insightful

    Ironically, I just built something very similar to this a few weeks ago (it runs great BTW), but I spent <$1500US on all the components. The biggest thing you have to watch out for is the hard drives. I went for the ones with the best bang/buck ratio at the time (Maxtor 80GB 5400RPM drives). This let me build a system with well over 1/2 a Terabyte of usable space at a fraction of the cost. Additionally, the slower drives require less power and less cooling, making them easier to fit in a standard full tower case with a merely beefy (as opposed to server-class) power supply. I think the processor requirements he stated were a little overboard as well. I've found that disk access tends to be limited by the PCI bus (it doesn't help that I used an older motherboard with 33 MHz 32-bit PCI), especially on writes where you can spread data across the write cache on the drives. Be careful when you build an array like this: ATA *hates* having to access both a master and a slave drive at the same time. Be sure to avoid having two disks in the same plex on the same controller. This was natural for me, fortunately, since I was building two plexes, a "backup" and a "media" plex.

    A final word of warning: Promise ATA100 TX2 controllers may look like a natural choice for a server like this, but they only support UDMA on up to 8 drives at once, and Promise's tech support only supports a maximum of 1 (one!) of their cards in any system.

    --

    I read the internet for the articles.
  40. Mirror Usenet? by Matey-O · · Score: 5, Funny
    Maybe Mirror Usenet!
    Well, exclude the binaries and I can mirror USENET on my Palm III!
    --
    "Draco dormiens nunquam titillandus."
    1. Re:Mirror Usenet? by Cramer · · Score: 4, Informative

      A full USENET news feed (everything one can find) will exceed 120GB per day. It'll almost fill a DS3. (And we were receiving a "crappy" test feed from UUNet.) So, minus @alt.binaries.*, one could mirror USENET for a few years. With the binaries, it'll hold you for about a week, 2 at the most.

    2. Re:Mirror Usenet? by Matey-O · · Score: 2

      Well, yes. That was me using a little Artistic License. a) A Palm III has _2_kb RAM, and b) it subtly implied that 'aside from pr0n and the Simpsons, there isn't much happening on USENET.'

      --
      "Draco dormiens nunquam titillandus."
    3. Re:Mirror Usenet? by Matey-O · · Score: 2

      Sorry. The Newton had 2 kb of ram....or was it 16kb? I'm getting too old to remember machines that had too little memory. :)

      --
      "Draco dormiens nunquam titillandus."
  41. 3ware is Painful! by purduephotog · · Score: 2

    Actually, I assembled a 600 gig storage device using the aforementioned 3ware controller.

    First, there were hardware bugs and they recalled the controller

    Second, 3ware dropped the product line, but vendors were still telling me it was available.

    Third, they brought it back, and I had to get a drop ship

    I lost about 3 months on design phase due to this little tidbit.

    Now don't get me wrong, it's working now and seems reliable... but... there's always this nagging suspicion that something is going to go wrong and I'll lose all that data.

  42. But it can't fit inside a lego brick!!!!!!!!! by heroine · · Score: 2

    Get rid of it.

  43. RAID ain't "secure" by -brazil- · · Score: 2, Informative
    First, you mean "safe", not "secure". Second, extend that to safe-as-long-as-only-one-hd-fails-and-you-never-ever-make-a-mistake. RAID only gives you high availability, but it is by no means a substitute for backups: when important data is deleted by a virus, or accidentally because of a user mistake, even a RAID 1 will just dutifully mirror the deletion.


    Always remember: data that is not backed up might as well not be there in the first place!

    --

    The illegal we do immediately. The unconstitutional takes a little longer.
    --Henry Kissinger

  44. Drive Failure Notification Problem by jayrtfm · · Score: 2, Interesting

    With RAID5 a single drive can fail without causing data loss.

    How do you know WHEN a drive has failed?

    With the low end IDE RAID cards your notification comes when the 2nd drive fails......

    3Ware's website describes an SNMP monitoring utility for Windows, but didn't specifically mention Linux support. Ditto for Adaptec.

    If the RAID is done in software, is there a Linux program to monitor and notify when a single drive goes down?
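
    One low-tech possibility for the software case, offered as a sketch rather than a tested tool: Linux md reports array state in /proc/mdstat, where a status field like [UU_] marks a missing member with an underscore. The SAMPLE text below is made up for illustration, the exact layout can differ between kernel versions, and `degraded_arrays` is a hypothetical helper name.

```python
import re

# Illustrative /proc/mdstat contents (invented for this example).
SAMPLE = """\
Personalities : [raid5]
md0 : active raid5 hdg1[2] hde1[1] hda1[0]
      160836480 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
md1 : active raid5 hdk1[2] hdi1[1] hdc1[0]
      160836480 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return names of md arrays whose status field shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
    return degraded


print(degraded_arrays(SAMPLE))
```

    Run from cron against the real file (open("/proc/mdstat").read()) and mail yourself whenever the list is non-empty, and you at least hear about the first failure before the second one.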

  45. Re:RAID by Ioldanach · · Score: 2, Insightful

    So, then, I'm confused... He's trying to use software raid, but he has 4 Promise FastTrak 100TX2 raid controllers. WTF? First off, each of those cards supports 4 drives on 2 channels... Why does he need 4 cards when he only has 8 drives? He only needs 2 cards. Second, why is he using expensive raid hardware (that doesn't even support RAID 5) when he's using software raid!?

    All he needs are two of Maxtor's cards, which you can buy packaged with the drive for an extra $13. Not only that, but his prices on hard drives are way too high. 8 drives (2 with Maxtor's IDE cards) are $2122, per pricescan.com. Since he lists $500 for the IDE cards and $3000 for the HDDs, that's a savings of $1378.

    Then, he quotes $500 for 2 GB of RAM. At $70 per .5GB stick, that's $280. $500 for a case??? Try $365.

    That said, the $5720 price he quotes is high by $1733. You could build one of these for just under $4000.

    Ok, I admit, I didn't include shipping.
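
    For anyone who wants to check the arithmetic, here is the same tally as a small sketch. All prices are the ones quoted in this comment; the dictionary keys are just labels, and four sticks of RAM is an assumption implied by 2 GB at .5GB per stick.

```python
QUOTED_TOTAL = 5720  # the article's price, as cited above

# (article's price) - (price found elsewhere), per component group
savings = {
    "controllers + drives": (500 + 3000) - 2122,
    "RAM (4 x .5GB sticks)": 500 - 4 * 70,
    "case": 500 - 365,
}

total_savings = sum(savings.values())
revised_total = QUOTED_TOTAL - total_savings

for item, amount in savings.items():
    print(f"{item}: ${amount}")
print(f"total savings: ${total_savings}, revised build: ${revised_total}")
```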

  46. 3ware and external raid devices by FreeUser · · Score: 3, Informative

    please tell me how you get 6 IDE drives on a pc that gives you any performance in a RAID function...

    I don't know how he does it, but I have personal experience in doing it two different ways:

    1) 3ware IDE RAID controller, has 1 IDE controller per drive on the card (i.e. 8 ide controllers), which the firmware maps to a RAID Device. Depending on the RAID configuration the drives appear as one large SCSI drive to the system.
    Performance is on par with SCSI.

    2) External IDE-SCSI Raid chassis. Again, 1 IDE controller per hot-swap drive, appearing to the system as one or more big SCSI drives, controlled by a standard SCSI controller. Speed and reliability have surpassed that of a $60,000 SCSI solution sold by Sun I happen to have lying around.

    U160 SCSI drives will give you at least a 70% speed increase and a 80% increase in reliability....

    If I had to store a terabyte of information I'd be an idiot to use consumer level storage (IDE).


    Nonsense, see above. This is simply SCSI bigotry (I know, I was once a SCSI bigot too). What you say is only true if you are using low end cards, with more than one device on each IDE bus, which is untrue for mid- and high-level IDE-SCSI solutions such as 3ware and various external chassis systems. We run our entire enterprise on one, and have done so for well over a year, with much better reliability and performance than an older, very expensive SCSI solution provided.

    But yes, if people are plugging drives into el cheapo IDE "raid" cards like Promise and the like, or worse, into their onboard IDE controllers (most of which are inexpensive knockoffs anyway) then performance will be very suboptimal, and reliability problems (one device taking down the entire IDE bus, etc.) abound.

    --
    The Future of Human Evolution: Autonomy
  47. Re:Need for memory/storage [OT] by thing12 · · Score: 2, Informative

    It seems to be just an urban legend.

  48. Cheaper and more redundant if you use 2 .5 tb serv by HiyaPower · · Score: 2

    You can stuff 8 60 gb disks into an Antec server case. With a pair of 1600 XP processors, the total cost is 2 Promise cards = $50, 8 drives = $720, 2 XP processors = $220, mobo = $220, memory = $200, case = $150; total is about $1500 for .5 tb and $3000 for the full tb. Further, you have a bit more i/o bandwidth with 6 ide controllers and 2 pci busses than with the single. Also, when one of them craps out, the other is still going in all probability. Going to 80 gb drives gives you about the same cost per gb of drive space and lets you put .6 tb into a case. When you are paying for floor space and cooling, the 160 gb drives make sense, but when you are running these in your basement, going for two boxes makes it a cheaper and more robust solution.

  49. Even cheaper by Restil · · Score: 2

    With 120 gig drives, your total cost for a 1 TB array would be about $2500. With 4 IDE ports and a large enough case, you could get all that into one box, then network the beastie.

    Now I just need to find $2500. I know I won't have a problem filling it.

    -Restil

    --
    Play with my webcams and lights here
  50. Re:Great! Where's the backup solution? by Nugget · · Score: 2

    For those that choose to go the "fire proof box" route, please be careful that you buy a unit that's certified to protect media. A fireproof box that will protect papers from catching fire isn't necessarily sufficient to keep tapes and disks from being destroyed by the heat. Make sure you buy one that's appropriate for your intended contents.

  51. gawd by kin_korn_karn · · Score: 4, Funny

    using a tb array for anime is like having one of your turds bronzed.

  52. Re:RAID by edmudama · · Score: 5, Insightful

    > He's trying to use software raid, but he has 4
    > Promise FastTrak 100TX2 raid controllers. WTF?
    > First off, each of those cards supports 4
    > drives on 2 channels... Why does he need 4
    > cards when he only has 8 drives? He only needs
    > 2 cards.

    I'm a firmware engineer for Maxtor... if you're going for performance, you want 1 drive on each bus, and you don't want to use the motherboard connectors. With 2 drives on each bus, you are limiting the average transfer rate out of cache to 50% of the max transfer rate. On a modern drive with their 60-65MB/sec channel rates, you cannot stream sequentially off of 2 drives without saturating an ATA-100 cable. Even running ATA-133 won't help starting a year from now.
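
    To put rough numbers on the shared-cable point, here is a back-of-the-envelope sketch. The 60-65MB/sec drive rates are the figures above; treating "ATA-100" and "ATA-133" as hard 100 and 133 MB/sec ceilings and ignoring protocol overhead are simplifying assumptions, and `bus_load` is just an illustrative name.

```python
def bus_load(drive_rates_mb_s, bus_limit_mb_s):
    """Fraction of the cable's nominal bandwidth the drives would demand."""
    return sum(drive_rates_mb_s) / bus_limit_mb_s


for limit in (100, 133):
    for rate in (60, 65):
        load = bus_load([rate, rate], limit)
        print(f"2 x {rate} MB/s on ATA-{limit}: {load:.0%} of the cable")
```

    Anything over 100% means the two drives cannot both stream at full rate; on ATA-133 the pair fits, but with very little headroom left.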

    Additionally, every bios I have looked at sucks in terms of performance. In most cases they have small DMA FIFOs which stutter the pipe during high speed transfers -- they literally hang the DMA lines while they empty their fifo into memory, then come back and grab another 8 words or something sad. They also tend to be very poor managers of the IRQ line. This causes delays at times when your hard drive could be giving you more data, but the host hasn't gotten around to asking for it yet.

    All the 3rd party cards have like 2Kbyte FIFOs which prevents any overrun from occurring, which alone is quite helpful in high bandwidth applications.

    The cards we include with our drives are in the lower end of Promise's spectrum... you can spend more and get more performance if you want to, which is what I suspect the author of the original article did.

    --eric

    --
    More data, damnit!
  53. There's a reason they called it "Terraserver" by jefp · · Score: 3, Interesting

    I've wanted a terabyte of storage since the mid-1970s, when I realized that there were approximately a trillion square meters on the Earth's surface. Store one byte of grayscale image for each square meter and that's a terabyte of data right there.

    Of course these days I'd want 3TB so I could store color images.

  54. Re:Great! Where's the backup solution? by jandrese · · Score: 2
    You've just stumbled across one of the main concepts behind the Storage Area Network. The biggest problem you have is bandwidth. Your average local disk bus (ATA100, or LVD SCSI3) blows away Fast Ethernet, and with RAID3 or RAID5 you need to access multiple machines to do a single write (write the actual data and write the parity data).
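
    The "write the data and write the parity" point can be sketched with plain XOR parity, which is the idea behind RAID5. This is a toy illustration on four-byte blocks, not how any particular RAID driver lays out stripes; the block contents and the `xor_parity` name are made up.

```python
from functools import reduce


def xor_parity(blocks):
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks living on three different disks (or machines).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(data)

# Lose one block: XOR of the survivors and the parity brings it back.
rebuilt = xor_parity([data[0], data[2], parity])

# Small write: changing one block means updating that block AND the parity,
# which is why a single logical write touches more than one device.
new_block = b"bbbb"
new_parity = xor_parity([parity, data[1], new_block])

print(rebuilt, new_parity == xor_parity([data[0], new_block, data[2]]))
```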

    The other problems with your scheme are:
    1. Cost. That's a lot of machines to buy for a single storage array
    2. Admin time. Upgrading the OS for a bug fix is a much bigger pain when you have 8 machines instead of 1
    3. Space. You're building a rack for these machines and they're eating up at least 8U of space, probably more if you want to keep the cost down to Earth
    4. Software. All SAN solutions I know about are proprietary. Nobody builds RAID code that runs over a network.

    It's not a bad idea, but certainly not something that can be done for $5k. I'd think there must be a breakpoint somewhere where it makes sense to build stuff in multiple machines (instead of cramming tons of disks into a single machine), but I think it's not at 1 disk/machine.

    How much uptime you need is purely dependent on you. Since my array is for personal use, I don't mind a bit of downtime when a component fails (since I'm working on the problem myself anyway, it's not like I'd get much use out of it when it was partially down anyway!). If you really really need multi-9 uptime, $5k IDE storage solutions really aren't the way to go.
    --

    I read the internet for the articles.
  55. Re:Inquiring minds by edmudama · · Score: 3, Informative

    > Last time I looked at IDE in any technical
    > depth, I only saw four addresses "reserved" for
    > IDE controller use. I guess you can have any
    > address, but the BIOS couldn't boot off any
    > address, it has to know where to look for the
    > controller. Predetermined list of 4 seems to
    > ring a bell.

    There are 4 addresses, but you can only boot off the first 2 in most operating systems. There are ways to get more than 4 up and running to expand to lots of drives, but not sure what OSs it works with.

    > Secondly, IDE seems to REALLY hit the brakes
    > when you do two independent operations on two
    > drives on the same channel (say, a read on
    > drive 1 and write on drive 2).

    The issue is that most ATA implementations don't support command queueing, therefore there is no bus release. Each command runs to completion before the bus is released, while the other drive sits idle. Upcoming drives will be implementing queueing and won't have this performance limitation.

    > If my 4 controller addresses educated guess is
    > right, and performance does crawl, you'd
    > probably want to have 4 drives on 4
    > controllers, one each.

    The secondary port isn't inherently slower than the primary port. However, each port uses a controller address. (0x178 or something for the first, can't remember offhand)

    Best performance is achieved with one drive per cable.

    > If all the above is correct, this guy is plain
    > wrong. He's published, I'm not, I'm willing to
    > admit defeat - where am I wrong? Do the raid
    > controllers emulate being scsi hosts, run off
    > OS drivers (=likely windows ones), etc?

    Everything except ATA hard drives is emulated as SCSI hosts. ATAPI (the CDROM protocol) is simply packet SCSI over an ATA cable. The RAID controllers also just use the built-in SCSI layer in the OS.

    eric

    http://www.t13.org for the real ATA specs if you're curious

    --
    More data, damnit!
  56. Moore's Law and bioinformatics by alispguru · · Score: 2

    Remember, most of the breathless prose about the huge, enormous, gigantic, [favorite-bigness-adjective] amount of information in DNA was written years ago, by biologists. Moore's law has been in effect for some time since then, and the human genome hasn't gotten any bigger in the meantime.

    --

    To a Lisp hacker, XML is S-expressions in drag.
    1. Re:Moore's Law and bioinformatics by Marcus+Brody · · Score: 3, Funny

      Moore's law has been in effect for some time since then, and the human genome hasn't gotten any bigger in the meantime.

      In fact, the EMBL database (all known DNA + protein sequences) nearly tripled in size within the 11 months of Nov. 1999 - Aug. 2000 [Stoesser, 2001]. Shake your Moore's law at that figure, matey.
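
      To compare that growth with Moore's law on the same footing, one can convert "tripled in 11 months" into a doubling time. A minimal sketch, assuming steady exponential growth over the period; the 18-month doubling used for comparison is the commonly quoted Moore's law figure, not something from the paper cited above.

```python
import math


def doubling_time_months(growth_factor, months):
    """Doubling time implied by growing `growth_factor`-fold over `months`."""
    return months * math.log(2) / math.log(growth_factor)


embl_doubling = doubling_time_months(3, 11)
print(f"EMBL: doubling roughly every {embl_doubling:.1f} months")
print(f"That is about {18 / embl_doubling:.1f}x faster than an 18-month doubling")
```

      Since the database "nearly" tripled, the real doubling time would be slightly longer than what this prints.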

    2. Re:Moore's Law and bioinformatics by alispguru · · Score: 2

      Very impressive!

      But, will it continue to grow at that rate? Which will peter out first, computer power or bio data accumulation? Moore's Law has worked for 35 years so far, and Moore himself thinks we've got at least another 15 years from now.

      My money is still on the computers.

      --

      To a Lisp hacker, XML is S-expressions in drag.
    3. Re:Moore's Law and bioinformatics by Marcus+Brody · · Score: 2

      I do believe that the DNA/Protein databases have been growing exponentially for around 20 years, since the Swiss Institute of Bioinformatics started the SWISS-PROT protein database.

      I must admit that the figure I gave was bloated due to the last push to get the human genome, and has probably settled down a little since then. But anyhow, biology/CS have been pretty head-to-head on rate of acceleration for some time, and I can't see any decent reason why either field will slow down for some time yet.

  57. DNA sequences aren't that big by pclminion · · Score: 2
    I guess the simple truth is that now that 100 gig drives are a couple hundred bucks, we now have the ability to store anything we reasonably could need (unless you define "Reasonable" as "I need to store DNA Sequences").

    Unless Taco is storing DNA sequences from aliens, I don't know what he's talking about. I downloaded the human genome project last year and if I remember correctly it was definitely under a gigabyte.

  58. The price... by ellem · · Score: 3, Funny

    1 Terabyte solution - $2500

    All the pr0n you could ever watch - $1,000,000

    The look on your Mom's face when she clicks on AsianDogAssRape10.mpg - Priceless

    --
    This .sig is fake but accurate.
  59. Re:Great! Where's the backup solution? by Amazing+Quantum+Man · · Score: 2

    [I still need a versioning filesystem, like VMS though.]

    I hate to say it, but SCO (yes, SCO) had a versioning filesystem in OSR5. HTFS (High Throughput File System) had versioning support.

    --
    Fascism starts when the efficiency of the government becomes more important than the rights of the people.
  60. Re:Great! Where's the backup solution? by Amazing+Quantum+Man · · Score: 2

    You've just stumbled across one of the main concepts behind the Storage Area Network [snia.org]. The biggest problem you have is bandwidth.

    Dude, that's why most SANs are made out of Fibre Channel. FC is a 1Gb/s transport that has a SCSI protocol on top (FCP-SCSI). 2Gb/s Fibre Channel is available, and work is currently under way on 10Gb/s. In addition, FC is full duplex.

    --
    Fascism starts when the efficiency of the government becomes more important than the rights of the people.
  61. I see a flaw here... by Anonymous Coward · · Score: 2, Interesting

    We (my company) designed a very similar system using a Tyan Tiger200 with dual GHz Cel's etc. The problem is that the drives he lists (the 160GB Maxtors) aren't addressable by the RAID controller he is using (the Promise TX). The Promise card will only address up to 127GB per drive. You have to use an ATA-133 spec controller to get the full capacity out of those drives. We did an array using the TX and WD 120GB 7200RPM drives (with 8MB cache - mmmmmmmmm.....) that flat smokes anything that you can put together with the Maxtor drives. Oh well....

  62. Usenet ? by _Spirit · · Score: 2, Insightful

    I think Usenet is underestimated here. I remember reading on the site of one of the larger ISPs, specialized in good usenet access (ie. 30000+ groups & week+ retention even on binaries groups) that they have significantly more than 1 TB of storage space (don't remember how much, but several TB). So mirroring Usenet might be a tight fit.

    --

    beauty is only a light switch away

  63. performance issues by john_uy · · Score: 2, Insightful

    i believe that there is more problem in performance rather than capacity.

    a typical configuration that cheap will use an ide hdd (and to make it cheaper software raid).

    the main problem (for us in this case) is the performance. how do you increase the data transfer? for the past few years, the storage space has increased tremendously but the transfer rate of the drives are out of proportion with the space.

    ide is usually placed in a 33mhz/32bit bus which will give a burst transfer of 133mbyte/sec. that is the max whatever you do. but if you will place a nic card, they will share the bandwidth unless it is placed in a different bus.
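
    The 133mbyte/sec figure falls straight out of clock rate times bus width. A quick sketch, using the nominal 33.33 MHz PCI clock; the 66 MHz/64-bit line and the even eight-way split are there only for comparison and are assumptions, not measurements.

```python
def pci_peak_mb_per_s(clock_mhz, width_bits):
    """Theoretical burst rate: one transfer of `width_bits` per clock."""
    return clock_mhz * width_bits / 8


classic = pci_peak_mb_per_s(33.33, 32)    # the bus described above
wide_fast = pci_peak_mb_per_s(66.66, 64)  # 64-bit/66 MHz PCI, for comparison

print(f"32-bit/33 MHz PCI: {classic:.0f} MB/s peak")
print(f"64-bit/66 MHz PCI: {wide_fast:.0f} MB/s peak")
print(f"8 drives sharing the 32-bit bus: {classic / 8:.1f} MB/s each at best")
```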

    for the interface itself, scsi can handle more i/o operations/sec and fc even more. technologies today can implement raid5 at almost no performance hit.

    so given 1tb of data, definitely many people will be accessing it (unless you really plan to use it for your insane storage space). so if people will be able to store much, they can access it at a much slower rate.

    so you won't see the scsi and fc being obsolete even though the serial ata gets through. it will remain in the low end segment of the storage market.

    and besides, if you want to backup your data, the best way is to store it to tape and that will cost big (since mirroring the info in another server will not give you the reliability compared to tape)

    --
    Live your life each day as if it was your last.
  64. Wrong Way to go by joeblowme · · Score: 2, Insightful

    This guy totally went the wrong way for expandability and speed. You can get the Promise SuperTrakSX 6000 for $480 and that has hardware raid 5 and supports 6 drives. I'd throw one of those in with 6 drives to start and take my 800Gig and be happy. That would save me at least a $1000 up front. I wouldn't need 2 of the harddrives, the second processor or so much ram. Plus it would be faster and much more reliable. Then later on I could add another one for about $2500 and have 1.6 TB of space to store my huge collection of pornography... err rather mp3's, software and G-rated dvd movies.

    --

    If your not cheating your not trying. If your not trying your not winning and if your not winning why play?
  65. The Antec case is not good for ide raid by gd · · Score: 3, Informative

    I used to build a similar kind of raid system (half a TB) using the Antec case. Their case is nice, but not for the IDE raid. The problem is that the IDE cables need to be within certain length in order to get DMA 5. The case is designed for scsi, which has a longer cable length limit. To hook up all the IDE drive in that case is really a pain in the butt.

    For IDE raid, this case is good except it's a bit expensive:

    http://www.rackmountnet.com/rackmountchassis/rackmountchassis_4ud.htm

    It can hold up to 16 drives with hot swappable trays. There should be no cable length problem.

    On a side note, I used to plug 5 Promise Ultra100TX2 cards into one computer. All cards are recognized but only 8 drives are recognized correctly (I plugged in 12 drives altogether). I remember seeing somewhere (either in the Linux kernel source or the FreeBSD sys source) that Promise has a limit of 12 drives per system, with 8 of them in DMA mode and the remaining 4 in PIO mode with some tweak (burst?). So for a big RAID like that, an IDE RAID card (either 3ware's or HighPoint's) is recommended. Using a hardware RAID IDE card also has the benefit of being able to hot swap the drives with the case mentioned above.

    --
    gd
  66. Far better solution in my book... by unicorn · · Score: 2

    Would be to replace the 4 controllers and the monster case. Use a more "standard" chassis. Slap a regular SCSI card in it. And then for the drives themselves, use an UltraTrak100 TX8 to hold the drives.

    It just seems like a far cleaner solution. Not to mention FAR more expandable. And works out to be about the same price.

    --
    "Politicians are interested in people. Not that this is always a virtue. Fleas are interested in dogs." P.J. O'Rourke
  67. Do it for half with Pricewatch by mangoless · · Score: 3, Interesting

    Storage solution: 1TB RAID5 storage array (Prices are from Pricewatch)

    Item: Quantity x Price = Subtotal
    Intel Celeron 700 MHz w/ Socket 370 MB, UDMA 100, AGP VIDEO 8~64MB shared only, Sound, 56K AMR Modem, 10/100 Network in MidTower case w/Powersupply: 1 x $135.00 = $135.00
    Power Magic PCI IDE U/ATA100 RAID Controller w/Cable: 4 x $22.00 = $88.00
    Maxtor 4G160J8 5400/133: 8 x $259.00 = $2,072.00
    60.0GB EIDE Ultra DMA 5400: 1 x $85.00 = $85.00
    Total: $2,380.00

    - Mangoless

    --
    [a mango-free monkey]
  68. Better performance.. by tcc · · Score: 4, Interesting

    Get a 3ware Escalade card. In March they'll support 48-bit LBA in the new firmware, so you'll be able to hook up those 160GB monsters in raid-0 (or raid-5) with a tenfold increase in performance, without taking up all the PCI slots.

    The TX2 is a nice little card, but you can only use 2 drives per board for getting the "full speed" (if you use master/slave, 4 drives will give you the raid speed of 2 in stripe), and then you'd have to stripe your raid-0 drives in software. Instead of wasting PCI slots and using an underperforming card, you pay a couple of bucks more and you get the real thing with full speed and hardware raid5.

    There are a lot of raid benchmarks at storagereview.com as well. IDE raid is so damn cheap.

    --
    --- Metamoderating abusive downgraders since my 300th post.
  69. Poorly researched and unimplemented vapour by Anonymous Coward · · Score: 2, Informative
    What the hell is this article? It's obvious that they never built this thing. For one, the Maxtor 160GB drives use the newer ATA133 standard, which increases the bits used for addressing from 28 to 48 to overcome the 137GB limit which hinders the existing ATA100 standard. If they want to use the full 160GB of these drives, they should've spec'd the Promise Ultra133 TX2 controller.
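
    For reference, the 137GB ceiling is simply what 28-bit sector addressing gives with 512-byte sectors. A small sketch of that arithmetic; the function name is illustrative, and the 512-byte sector size is the usual assumption for drives of this era.

```python
SECTOR_BYTES = 512  # conventional sector size, assumed here


def max_capacity_bytes(lba_bits):
    """Largest drive addressable with `lba_bits` of logical block address."""
    return (2 ** lba_bits) * SECTOR_BYTES


old_limit = max_capacity_bytes(28)  # 137,438,953,472 bytes
new_limit = max_capacity_bytes(48)

print(f"28-bit LBA: {old_limit / 1e9:.1f} GB ({old_limit / 2**30:.0f} GiB)")
print(f"48-bit LBA: {new_limit / 1e15:.0f} PB")
print(f"160GB drive fits under 28-bit LBA: {160e9 <= old_limit}")
```

    The same number reads as 137GB in decimal units or 128 binary gigabytes, which is why both figures get quoted for the same limit.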

    They also spec'd the motherboard as an "A7B266-D". I'm guessing this is the A7M266-D, as there is no A7B266-D (no one else is even considering manufacturing an SMP Athlon chipset besides the forthcoming Micron Scimitar)

    It seems to me like this is a rather poorly thought out spec. Why are they using 4 FastTrak100 TX2s when they could use 2 FastTrak100 TX4s? Which of course brings up another point, why are they even using FastTraks? Under Linux the FastTrak driver is quite immature, and last time I used it only worked with 2.2 kernels, which hinders the ability to use filesystems like XFS. Also, the FastTrak cards are essentially software RAID as they offload the work of calculating the stripe locations onto the host CPU. There's no point in using md to combine multiple FastTrak arrays.

    Many people were mentioning the 3Ware Escalade. It is a relatively good card, but for a home storage array Linux md + XFS might be a better choice. (Also note that the advantages of 64-bit PCI couldn't be had with the A7M266-D as it doesn't include any 64-bit PCI slots. Perhaps the Tyan Tiger would be a better choice for a 3Ware solution) My recommendation would be 3 Promise Ultra133 TX2 controllers. The read and write performance on an Escalade 7410/7810 is appalling. With the embedded processor on the 7450/7850 (R5Fusion Technology, as 3Ware calls it) the performance exceeds that of software RAID, but at the much more expensive price, of course. I think the goal here is bulk storage and not performance, and the ATA133 controllers are by far the cheapest solution.

    For more information on IDE RAID under Linux, check out this site. Its information is a bit dated at this point, but I used it for my home storage server and haven't regretted it. With 5 7200RPM drives on Promise Ultra100 controllers and Linux md RAID-5 w/ XFS, my bonnie++ scores are 90/30 MB/s for sequential read and write, respectively. I couldn't be happier. This site also has benchmarks showing the superior performance of software RAID over a hardware solution with a 3Ware card.

    And there were a few other things people seemed confused about. No one in their right mind would put more than one drive per channel for the purposes of a performance RAID. That's just foolish. As for the limitation of being unable to access both the primary and secondary IDE channels simultaneously, this limitation was removed years ago with the introduction of EIDE.

    In as far as everything else goes, I'm a SCSI bigot. I have SCSI drives in my workstations and I couldn't be happier. However, IDE RAID is a very economical solution for a home user, often with performance on par with that of more expensive SCSI RAID solutions.

    To conclude, this article seems very poorly researched and documented. Had they actually attempted to build this beast and failed, perhaps I would've been more amused. However, as stands it's an overpriced specification which uses incompatible parts, and little research has been done on the optimum parts for the configuration.

  70. The human genome in a 800mb .zip file. by autopr0n · · Score: 2

    Actually, when the Human Genome first got online, I downloaded the thing as an 800mb zip file. Because I could. It was only a few gigs uncompressed. Unless you needed to store the whole genome for a couple people (rather than, say, diffs) current tech works fine. Hrm, a little odd knowing that the whole Human Genome is only about four or five times the size of a Divx movie.

    --
    autopr0n is like, down and stuff.
  71. not really by autopr0n · · Score: 2

    The bandwidth is pretty good, but it's the latency that'll kill you.

    --
    autopr0n is like, down and stuff.
  72. raidweb by NetMasta10bt · · Score: 3, Informative

    Ok. This is just inane. Why build this when someone has already done it better for cheaper?

    http://www.raidweb.com

    We purchase their 8 disk IDE RAID arrays. They are hot swap, support RAID 0, 0+1, 1, 3, 5, and hot spare, have dual failover power supplies, come with 64MB cache, which can be upgraded. Configurable via the EZ front LCD display, or via serial console. They support ATA-100, and ATA-133 coming shortly. Software upgradable, and it runs Linux.

    The array (sans disks) runs us $3200. They even have versions that have dual fiber ports out the back.

    WARNING - DO NOT purchase these with IBM GXP75 (75GB) disks like we did... we have about 80 of them that failed.

  73. Re:Great! Where's the backup solution? by greenfly · · Score: 2

    I could be mistaken, but I didn't know that one could hot swap IDE devices. I thought they didn't really take kindly to you pulling them out of a running system. That means that you end up having to power down your system each time you want to take a backup home.

  74. more then that. by autopr0n · · Score: 2

    Therefore 57MB required per human

    You still need indexing information. You need to spec where those differences occur.

    --
    autopr0n is like, down and stuff.
  75. Uh, no by autopr0n · · Score: 2

    One base always matches up with the same one. Cytosine with Guanine (CG), Adenine with the 'T' one (AT) and the reverse (GC, TA). So you only need to record half of the pair.
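
    Put as a sketch: since each base determines its partner, one strand is enough, and with four possible bases each position needs only 2 bits. The figure of roughly 3 billion base pairs is an approximate number assumed here, not something stated in this thread, and the function names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Rebuild the partner strand from the recorded one."""
    return "".join(COMPLEMENT[base] for base in strand)


def packed_size_bytes(n_bases):
    """Bytes needed at 2 bits per base, rounded up to a whole byte."""
    return (n_bases * 2 + 7) // 8


print(complement("GATTACA"))
genome_bytes = packed_size_bytes(3_000_000_000)  # assumed genome length
print(f"{genome_bytes / 1e6:.0f} MB for one strand at 2 bits per base")
```

    That comes out in the same ballpark as the 800mb download mentioned elsewhere in the thread, before any indexing or annotation.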

    --
    autopr0n is like, down and stuff.
  76. Another design by heretic · · Score: 2, Informative

    I just built a similar setup -- 500GB for less than $2,900. However, I made some different design choices.

    First of all, I wasn't too impressed with the Promise controller, so the choice for me was between the 3Ware 7850 and the Adaptec 2400A. The Adaptec had the best overall performance, but the 3Ware is close and can support 8 devices. For the hard drives, I wanted to come reasonably close to SCSI performance, so I chose the WD1000JB drive with the on-board 8MB buffer. I used a Tyan Tiger K7 with 64-bit PCI for the motherboard with dual Athlon XP (not MP) 1700+ CPU's plus 1GB ECC registered PC2100 DDR RAM. Put them all in a nice aluminum rackmount case.

    I'll probably replace the motherboard with the newer Tyan with 66MHz PCI bus in the near future and use the current one in a workstation. I'll also drop in more RAM if/when prices drop.

    It's been pretty sweet so far with LVM + XFS. My backup solution is a 33GB tape drive, so I spend most of every Sat. backing up the array. Time and money permitting, I'll build a second one and look for a DLT tape library on ebay.

  77. software RAID by DavidJA · · Score: 2

    and if you use software RAID via win2k

    PLEASE do not ever use software RAID on a production file server! Esp. Win2k's implementation of software RAID!

    We used to run software RAID on a file server (serving only 10 Macs, mind you!), both with 4 x 9 gig SCSI drives (a while ago) and with 4 x 30 gig IDE drives.

    Everything runs OK until you need to replace one of the drives; then the performance whilst rebuilding absolutely sucks!

    I've seen the system take over 12 hours of production time to rebuild a 90 gig software RAID; the whole time, performance for network users absolutely sucked!

    The solution: good quality hardware RAID. We now run a Compaq 5200 hardware RAID card and all Compaq drives. I can pull a HDD out right now, put a new one in, and have the RAID rebuilt without any network user noticing....

  78. 1.4TB at my company by glwtta · · Score: 2
    and we are using a mere fraction for our DNA info storing needs (which I am sure has been said a hundred times already). DNA sequences themselves are tiny (comparatively) it's the annotations that take up the space, but even that is under a TB for most needs.

    I can't remember exactly right now, but Celera's storage was something like 100TB, wasn't it? Of course when you are actually doing the sequencing and annotation of the whole damn thing, you need more space. (of course they weren't using nearly all of it, and it also included stuff to service their "subscription" clients, each one of which would of course get a significant chunk to store their stuff...)

    anyone have more recent (or more exact) info?

    --
    sic transit gloria mundi
  79. Offsite backups to my home by Paul+Johnson · · Score: 2
    We have 2 PCs in this house and I back up both to a TR-4 tape drive. Then I take both tapes to work and bring back the previous week's backup.

    Over the years we have put so much of our lives on to the PCs that we would be seriously lost without the archive.

    Paul.

    --
    You are lost in a twisty maze of little standards, all different.
  80. Re:Great! Where's the backup solution? by Paul+Johnson · · Score: 2
    Err, me, as I mentioned in another post in this thread. My wife and I realised some years ago that we had an awful lot of our lives on disk. Since I have tape backups, taking a tape set to work to keep in my desk seems a trivial precaution.

    Paul

    --
    You are lost in a twisty maze of little standards, all different.
  81. Re:Great! Where's the backup solution? by Paul+Johnson · · Score: 2
    Check out Amanda. You give it a backup cycle of D days and a tape cycle of T tapes where T is usually 2D+3 or so. Drives to be backed up from around the network are listed in a config file. Over a period of D days the system will (if it can) schedule a level 0 (full) backup of everything and also do a level 1 (diff against last level 0) backup of every drive every day. Sometimes it has to drop to level 2 (diff against last level 1) to make it all fit.

    When you want to recover something a browser lets you traverse the directory tree and tag the files you want. Then Amanda tells you which tapes to mount to recover them. Cool!
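
    For anyone who wants to play with the numbers, here is a rough sketch of the T = 2D+3 rule of thumb plus a naive schedule that gives every drive one level 0 per cycle and a level 1 on the other days. The function names are made up for illustration, and the round-robin assignment is only an assumption; Amanda's real planner balances by dump size and may drop to level 2, which this does not model.

```python
def tapes_for_cycle(dump_cycle_days):
    """Rule of thumb from the comment above: T = 2D + 3."""
    return 2 * dump_cycle_days + 3


def naive_schedule(disks, dump_cycle_days):
    """Return {day: {disk: level}} for one backup cycle.

    Hypothetical round-robin: disk i gets its level 0 (full) dump on
    day i % D and a level 1 (diff against last level 0) on other days.
    """
    schedule = {}
    for day in range(dump_cycle_days):
        schedule[day] = {
            disk: 0 if i % dump_cycle_days == day else 1
            for i, disk in enumerate(disks)
        }
    return schedule


if __name__ == "__main__":
    print(tapes_for_cycle(7))
    for day, levels in naive_schedule(["pc1:/home", "pc2:/home"], 7).items():
        print(day, levels)
```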

    Paul.

    --
    You are lost in a twisty maze of little standards, all different.
  82. Fire costs by Paul+Johnson · · Score: 2
    It takes maybe a month to locate new premises, buy new equipment, wire everything up and get people moved in. Maybe less if you are a small organisation or have someone talented doing it. The costs of doing this are paid by your fire insurance (which you have, of course). This is still not free because you are not running your business at the time so your cash flow situation may get tight.

    But if you lose your data you lose your business, and no insurance is going to cover that. Years of work goes up in smoke.

    Paul.

    --
    You are lost in a twisty maze of little standards, all different.
  83. Re:Great! Where's the backup solution? by isorox · · Score: 2

    Similarly, I read something about a 1GB/hour VHS backup system about 5 years ago. With packs of 5 standard 4 hour VHS tapes costing about £5, that works out to 25 pence a gigabyte - about half the price of CD-Rs.

    However, even the best video tapes will degrade very quickly compared with optical media, computer tape systems, and even IDE hard drives.
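
    The per-gigabyte figure is easy to check. A quick sketch, using only the numbers quoted above (the function name is just for illustration):

```python
def pence_per_gb(pack_price_pence, tapes_per_pack, hours_per_tape, gb_per_hour):
    """Cost per gigabyte for a pack of tapes at a given recording density."""
    capacity_gb = tapes_per_pack * hours_per_tape * gb_per_hour
    return pack_price_pence / capacity_gb


if __name__ == "__main__":
    # 5 tapes x 4 hours x 1 GB/hour = 20 GB for 500 pence
    print(pence_per_gb(500, 5, 4, 1))
```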

  84. Re:Great! Where's the backup solution? by Chewie · · Score: 2

    As far as the 40/80 GB max on DLT, IBM and Compaq both offer larger backup solutions, LTO and Super DLT. Compaq has embraced Quantum's SDLT, which has a capacity of 110/220 GB and a transfer rate of 11 MB/sec (uncompressed). Search speed is roughly 4.5 meters/sec. IBM has embraced LTO, which uses 100/200GB tapes, has a transfer rate of 15 MB/second (uncompressed, and a >2x increase over 40/80 DLT at about 6 MB/second uncompressed), and has an on-tape chip which can hold an index of all the files on the tape for easier retrieval. The search speed on LTO is about 6 meters/sec.

    Now, all of this is useless without being "generally available", so I did a little price-checking. Below are internal single-drive units (no autoloaders), and list price from manufacturers:

    Compaq 40/80 DLT Drive (internal) - $3,499.00
    Compaq 110/220 SDLT Drive (internal) - $5,590.00
    IBM 100/200 LTO Drive (internal) - $3,999.00
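
    To put those specs next to the terabyte array in the article, here is a back-of-the-envelope sketch of how many tapes and how many hours a full uncompressed backup of 1 TB would take on each drive. It assumes 1 TB = 1000 GB, 1 GB = 1000 MB, a sustained native transfer rate, and no time for tape changes, so treat the output as a lower bound.

```python
# (name, native capacity in GB, native transfer rate in MB/s), as quoted above
DRIVES = [
    ("Compaq 40/80 DLT", 40, 6),
    ("Compaq 110/220 SDLT", 110, 11),
    ("IBM 100/200 LTO", 100, 15),
]


def tapes_needed(total_gb, tape_gb):
    """Number of tapes for total_gb of data, rounding up."""
    return -(-total_gb // tape_gb)


def hours_to_write(total_gb, rate_mb_per_s):
    """Hours to stream total_gb at a sustained rate (1 GB = 1000 MB assumed)."""
    return total_gb * 1000 / rate_mb_per_s / 3600


if __name__ == "__main__":
    for name, cap, rate in DRIVES:
        print(f"{name}: {tapes_needed(1000, cap)} tapes, "
              f"{hours_to_write(1000, rate):.1f} hours")
```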

    Just wanted to point out that there are other options.

    --
    49 20 68 61 76 65 20 74 6F 6F 20 6D 75 63 68 20 66 72 65 65 20 74 69 6D 65 2E
  85. What's the big deal? by cfulmer · · Score: 2, Interesting

    Sam's Club in the area recently had a WD 100GB IDE hard drive on sale for $120 after rebate. 1TB at that rate is ~$1320, plus a few hundred for the motherboard, processor, memory and extra controller cards, and a TB server is within reach of an 18-year-old who saved his paper-route money.
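
    A quick sketch of the drive arithmetic. The ~$1320 figure matches 11 drives, which is what you get if you count 1TB as 1024GB; that reading is an assumption, and with 1TB = 1000GB it would be 10 drives and $1200.

```python
def drives_needed(target_gb, drive_gb):
    """Whole drives required to reach target_gb, rounding up."""
    return -(-target_gb // drive_gb)


def drive_cost(target_gb, drive_gb, price_per_drive):
    """Total cost of the drives alone, no controllers or chassis."""
    return drives_needed(target_gb, drive_gb) * price_per_drive


if __name__ == "__main__":
    print(drive_cost(1024, 100, 120))  # 1 TB counted as 1024 GB
    print(drive_cost(1000, 100, 120))  # 1 TB counted as 1000 GB
```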

    The real question is: how long will it take to listen to all those mp3s? At some point, extra storage just isn't practical because you can't fill it fast enough.

    1. Re:What's the big deal? by Rader · · Score: 2

      Well, that depends. I have 9,800 albums in MP3 format, taking up about 570 GB. If I could fit them all on the HD, I'd be able to trade and double it in a couple months.

      Thus, for me, having enough hard drive space causes me to trade faster, and thus to need more hard drive space.

  86. Re:Inquiring minds by ryanwright · · Score: 2

    Do the raid controllers emulate being scsi hosts, run off OS drivers (=likely windows ones), etc?

    Yes. As far as the OS is concerned, each raid controller is one big giant SCSI disk. There are no master/slave disks; each controller has 8 ports for 8 drives. Again, only one drive per channel, no master/slave.

    The IDE RAID controller is the thing that makes this work. It takes care of all the issues you mentioned (drive limitations, booting, speed issues, etc). But since you can only have 8 drives on a single controller, you put in multiple controllers. With 3 controllers, you can get 24 drives. At 120GB a pop, that's 2880GB. You'll lose some of that to RAID but you're still looking at close to 2TB. Then you do a software raid 0 on the 3 drives (as far as the OS is concerned, you have three huge scsi disks) and you can create one giant partition with very acceptable performance.
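
    Here is a rough sketch of the capacity math. The raw figure comes straight from the numbers above; the usable figure assumes each controller runs RAID 5 (one drive's worth of parity per controller), which is my assumption since the RAID level isn't stated, and the software RAID 0 on top simply adds the three logical disks together.

```python
def raw_gb(controllers, drives_per_controller, drive_gb):
    """Total capacity before any RAID overhead."""
    return controllers * drives_per_controller * drive_gb


def raid5_usable_gb(drives, drive_gb):
    """Usable space of one RAID 5 set: one drive's worth goes to parity."""
    return (drives - 1) * drive_gb


def striped_usable_gb(controllers, drives_per_controller, drive_gb):
    """RAID 0 across equal-sized RAID 5 sets (assumed layout)."""
    return controllers * raid5_usable_gb(drives_per_controller, drive_gb)


if __name__ == "__main__":
    print(raw_gb(3, 8, 120))
    print(striped_usable_gb(3, 8, 120))
```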

    --
    -Ryan, with the unoriginal sig
  87. Calculus, Clippit and 640k of RAM by BigBlockMopar · · Score: 2

    I guess the simple truth is that now that 100 gig drives are a couple hundred bucks, we now have the ability to store anything we reasonably could need (unless you define "Reasonable" as "I need to store DNA Sequences").

    Doesn't "640k ought to be enough for anybody" suggest that Bill Gates once felt the same way about RAM?

    Of course, visionary that he is [snicker!], there's no way he could have imagined desktop machines being used to edit video.

    Likewise, who knows how big and bloated Clippit The Office Paperclip can get if we have 100 gigs of hard disk space to burn... maybe, one day, he'll actually bear consultation when you need information, instead of when you need something to laugh at.

    I love calculus so much, I want to give it to everyone! Come, get some integration!

    MmMMmmm... calculus. Hours spent in the dentist's chair, with him scraping hard crusties off my teeth... And you're just giving that stuff away?

    --
    Fire and Meat. Yummy.
  88. Promise is Linux Unfriendly by PhotoGuy · · Score: 2

    I'm surprised no one has mentioned this, but Promise has become more and more Linux-unfriendly lately.

    There are different minor revisions of the 100Tx2 controllers; you can only tell by looking at the chip on board, and I think only the last digit is different. I could not get the latest ones working with Linux at all. I ended up buying these boards under the Maxtor brand name (same units, but slightly older), which had the older chip set.

    On the latest boards, it seems Promise has intentionally made certain registers read only, thwarting open source driver development.

    With that kind of behaviour, I'm staying away from Promise controllers, period. (I also had a hard time with their Raid5 controllers.)

    Back when they were Linux-friendly, their ATA100tx2 cards were nice. But with the latest incompatible chipsets and no help from the company, forget it.

    I also had some frustration with Adaptec's 2400 controller. It is *still* only supported by Adaptec under RedHat 7.0. And it has no audible alarm for drive failure, most annoying. Finally, under FreeBSD 4.3, its performance was abysmal; there was definitely something wrong with the I2O driver working with this card. (I haven't tried 4.4 yet.)

    For now, I'm just sticking with motherboard IDE controllers; far more tried and true.

    -me

    --
    Love many, trust a few, do harm to none.
  89. Just be very careful. by Bob_Robertson · · Score: 2

    If you do store DNA sequencing information, make sure you only use lossless compression.

    Or, for that matter, the issue for me is backup capacity again. With the advent of DVD-R (or whatever it's called today) I thought that "full backups" were going to be possible again. But now, with such vast quantities of data possible to have online and changing, backup issues again come to the fore.

    Lossless compression helps, but now, instead of writing 50% of a 4-Gig tape over the weekend, I have to write two or three full tapes.

    As memory and disk space have become cheaper, bloat-ware uses more and more of it. I don't consider bloat-ware a good thing, but it cannot be fought any more than the monster shopping mall can be fought just because I happen to like mom and pop shops.

    The difference between information and data, I guess. The next great invention I think will be the personal digital secretary, like the ones detailed by Daniel Keys Moran in his wonderful "books of continuing time", designed to sift through the impossible quantities of data yet still have the personal touch to say "Gee, that bit over there looks interesting. I think Bob would like that."

    Bob-

    --
    The Ludwig von Mises Institute. The reasoning individuals economics
  90. Removable Disk Drive Drawers for Backup. by billstewart · · Score: 2

    For about $20-30, you can get disk drive drawers that turn a 3.5" drive into a 5" removable drive. Nothing active; it's just a bunch of mounting hardware. (About $20 for the part that stays in your machine and $10/disk for the removable drawer parts.)
    This makes it easy to use disk drives as backup media, which is good, because they're much faster than tape. It also makes it easy to upgrade your disk capacity when you want to do that.

    --

    Bill Stewart
    New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
  91. Re:Great! Where's the backup solution? by jandrese · · Score: 2

    Fibre Channel hardware tends to be a little too expensive for the $5k crowd. This guy is using commodity hardware, and that generally doesn't include fibre channel. Even if he bought the hardware, the driver support for something like a SAN just doesn't exist in Windows/Linux/BSD yet.

    --

    I read the internet for the articles.
  92. Nominal size by Tony-A · · Score: 2

    You don't say 1.024k bytes, you say 1k bytes and expect the listener to know that about 1000 is exactly 1024 due to the context. If 1k bytes were always 1024 bytes, how would you interpret 14.112k bytes?
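
    To make the 14.112k example concrete, here is a small sketch that reads the same figure both ways. It uses Decimal so the check for a whole number of bytes isn't thrown off by floating point; the function names are just for illustration.

```python
from decimal import Decimal


def as_bytes(kilo_value, bytes_per_k):
    """Interpret a 'k bytes' figure under a given meaning of k."""
    return Decimal(kilo_value) * bytes_per_k


def is_whole(value):
    """True if the value is a whole number of bytes."""
    return value == value.to_integral_value()


if __name__ == "__main__":
    for k in (1000, 1024):
        n = as_bytes("14.112", k)
        print(f"k = {k}: {n} bytes, whole number: {is_whole(n)}")
```

    Only the k = 1000 reading gives a whole number of bytes, which is the point: the context tells you which k was meant.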

    3/4" pipe is 1.050" Outside Diameter.
    The 3/4" refers to an Inside Diameter of a pipe with a particular wall thickness (which may or may not still be made). Regardless of how thick the walls are, and consequently what the Inside Diameter really is, 3/4" pipe is 1.050".

    IIRC there is something about a US bushel being a different volume depending on what is being measured.