Ask Slashdot: IDE Software RAID?
Edward Schlunder asks:
"After setting up
Software RAID on a SCSI system at work,
I want to do the same at home for fun.
Call me crazy, but I'm just completely
geeked up about this after seeing it working.
The Software RAID documentation says that
each hard disk should be on a separate IDE
cable and that RAID5 requires at least 3
hard drives. I want to use my two existing
IDE hard drives and get the large, fast,
and cheap IBM IDE ATA/66
Deskstar 22GXP hard drive to make up
the third..." There's one small problem
though. Hit the link for more.
"My motherboard only has two IDE ports. So, my question is, what IDE controller card can I get that satisfies the following:
- Supports Linux (obviously!)
- High speed, preferably ATA/66 and PCI
- Lets you use multiple controllers in one system (that is, it can co-exist with the onboard IDE controller on my SuperMicro P6DBE motherboard)
Please refrain from suggesting that I should just use SCSI -- the goal here isn't absolute greatest speed and reliability, but a cheap way to teach myself more about RAID5 and provide a test system to blow things up on without causing users unnecessary grief ;-)"
That limitation was fixed when EIDE came out. Now you can talk to both drives on one controller at once, though apparently it still doesn't do that as well as SCSI does. So if he just wants to play, he should be able to do it with just two controllers.
Otherwise the Promise PCI DMA/66 card sounds nice.
-dantheperson
For controllers, Western Digital has URLs at this main link. Click on the "Solutions" link to go to suggested solutions for common problems with UDMA/66. For a prebuilt IDE RAID system, try here.
As long as you have good IDE controllers (no huge bottlenecks), try FreeBSD's RAID/LVM system "Vinum." It would require trying an OS other than the media baby of today, but that's definitely worth it anyway.
If you _REALLY_ want to see great performance, try FreeBSD using Vinum and setting SoftUpdates on on the Vinum volume.
(Now just watch this be moderated down for being a troll, because I suggested something different...)
Brian Fundakowski Feldman
The definition I usually see is: Redundant Array of Independent Disks. The Inexpensive has been changed/dropped by a lot of people because RAID in general is everything but inexpensive (just look at hardware RAID racks--mucho $$$).
I read the internet for the articles.
As many people have already noted, using the Promise Ultra33 is an excellent way to approach Software RAID with IDE drives. There is some very useful information at Erik Hendriks' website at NASA Goddard:
http://www.beowulf.org/bds/disks.html
They found that most of the "dual" channel IDE ports built into motherboards are not truly independent because of a shared buffer in the controller. This is a "feature" of the IDE controllers used and effectively limits the collective performance of the two IDE channels to roughly that of a single channel. The IDE channels on the Promise board are truly independent though. As their benchmarks show, placing one drive on one of the motherboard controllers and one on each of the Promise controllers yielded nearly three times the disk bandwidth of a single channel. Of course, this is for data striping, not RAID5, but the principle is the same.
For those interested in building a RAID5 server, this configuration makes a lot of sense. Use two disks on each channel of the Promise and two disks on the motherboard controller...five data and one parity and roughly 3x the bandwidth of a single drive.
No. In fact, with 2D/1C (2 disks on 1 ctrlr), you will still get better performance than 1D/1C in most cases. Under general use, you're doing small-size reads distributed across the disk, so the real bottleneck is head-seek. Even with big contiguous-block reads, you'll still notice an improvement.
First off, RH-kernels are far from stock linux kernels. Do an 'rpm -qpl [file].src.rpm' on one of their kernel SRPMs and you'll see a bunch of (non-dist) patches. Among them is the raid patch.
Support for the new Ultra/66 hasn't hit the 2.2.x tree yet (I think). Check 2.3.5+ for new Ultra/(33,66) support.
(I've never tried it.) I suspect that you might see (marginally) better read speeds, but you might even see degradation on write or mixed read/write performance (since every write yanks two out of three heads across the platters).
This won't work for raid5, not unless you want most of the large disk unprotected. Consider instead striping (for example) hda3+hdb3==md0, and then making a raid0 or raid1 volume md0+hdc3==md1.
Better yet, get four disks of the same size...
For hardware raid controllers, yes, go with identical disks. This is not needed for any kind of s/w raid I've dealt with (linux, disksuite, veritas-vm, xlv). For linux s/w-raid, you should be safe making a raid5-vol by mixing two ide-partitions, a scsi-disk, a loopback off of a file and a few NBD's (so long as they are the same size).
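The size constraint running through these replies comes down to simple arithmetic. Here is a rough sketch, assuming each RAID5 member contributes only as much space as the smallest member, and using hypothetical sizes (two 6 GB partitions plus the 22 GB disk; the original poster never gives the sizes of his existing drives). The function names are made up for illustration.

```python
def raid5_usable(sizes_gb):
    """Usable RAID5 space: one member's worth goes to parity, and every
    member is treated as if it were the size of the smallest one."""
    if len(sizes_gb) < 3:
        raise ValueError("RAID5 needs at least three members")
    return (len(sizes_gb) - 1) * min(sizes_gb)


def raid5_unused(sizes_gb):
    """Space left over on the larger members, outside the array."""
    smallest = min(sizes_gb)
    return sum(size - smallest for size in sizes_gb)


def stripe_then_mirror(stripe_sizes_gb, mirror_size_gb):
    """The nested layout suggested above: md0 = raid0 of the small
    partitions, md1 = raid1 of md0 and the big partition."""
    md0 = sum(stripe_sizes_gb)       # raid0 adds the members together
    return min(md0, mirror_size_gb)  # raid1 is limited by its smaller half


small, big = 6, 22  # hypothetical sizes in GB, not from the thread

print(raid5_usable([small, small, big]))        # 12
print(raid5_unused([small, small, big]))        # 16
print(stripe_then_mirror([small, small], big))  # 12
```

With these made-up numbers, both layouts give 12 GB of protected space; the difference lies in what happens to the rest of the big disk and in how many disks each write touches.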
Scary, risky and very unwise.
So I'm not the only one...
/* MAGIC THEATRE
ENTRANCE NOT FOR EVERYBODY
MADMEN ONLY */
RAID-5 uses a number of disks (minimum three) to store data. The bits are spread across the disks, with one last disk acting as a 'parity' disk (in real situations, parity information is spread across the disks). When data is written to the disk, the parity bit is calculated and written to the final disk. When data is read (in normal circumstances), the parity information is ignored and the normal data read off.
When the parity disk fails, nothing special happens except that the parity information is not stored. When a data disk fails instead, performance dies, as reads have to be 'reverse-engineered' from the parity information and the surviving data. Once the disk is replaced, its contents are rebuilt from the parity data.
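The 'reverse-engineering' described above is just XOR. A minimal sketch, assuming a dedicated parity block and three tiny equal-length data blocks (real RAID5 rotates parity across the disks and works in chunks; the function names here are made up for illustration):

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Rebuild the one missing block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three data "disks" and the parity computed when the data was written.
data = [b"disk0 data", b"disk1 data", b"disk2 data"]
parity = xor_blocks(data)

# Pretend disk 1 died: XOR everything that is left to get it back.
rebuilt = reconstruct([data[0], data[2]], parity)
assert rebuilt == data[1]
```

This also shows why a degraded array is slow: every read of the missing disk turns into a read of all the other disks.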
Mirroring is simply writing the data to two places and reading from a random disk; if one disk dies, data is simply read from the second disk. Since there is no calculation involved (the data is simply written to two places), reads and writes are much faster. However, this is more expensive in terms of hardware required.
As you've pointed out, this is a bad idea! The reason for using RAID-5 is reliability, and dumping 11 partitions of your RAID volume on one disk is asking for trouble! You will gain a little reliability if your disks tend to get bad sectors, but that's about it. Since RAID-5 will slow down your disk writes (and, to a lesser extent, reads) you only ever use it for reliability in the face of disk failures. In addition, that 22GB disk is going to slow down the rest of the system; writes and reads will require data from the entire length of the disk, which is not good.
On a home system, it's not really worth the headaches and performance hit for it. Just take regular backups and you'll be ok. If you really want extra reliability, use mirroring; it's a lot faster.
--
> IDE has no notion of "disconnect" like SCSI
Ah, but it does. In fact the IBM disks mentioned actually implement the ATA-4 disconnect/reconnect and tagged command queueing (depth 32). Kudos to IBM.
This would be a great addition to the Linux IDE driver.
To clarify the point, software RAID under Linux (any mode) does not absolutely require that each hard disk be on a separate controller. I have had plenty of success using Software RAID on drives on the same controller. I haven't seen system performance bog down too much with this configuration either. On the newer bus-mastering Ultra33/Ultra66 controllers, CPU time for IDE access isn't really as big a problem as it used to be. So, if you're just talking three drives for a test machine, I don't know that the extra expense for a slick PCI IDE controller is going to be all that justified. Try it with your onboard controllers and then upgrade if you decide you need it.
Another question is this: Is there any support in Linux for IDE Hardware RAID controllers like the Promise FastTrack, FastSwap Pro, or SuperTrak? Obviously, Hardware IDE RAID solutions are much less expensive than traditional SCSI RAID controllers and drives and can offer comparable performance on smaller workstations or smaller workgroup servers.
~GoRK
Be careful with this one or that fryguy might be visiting you sooner than you think ;)
The safest way to handle an IDE hotswap is to unplug the power first and let the drive totally spin down, and then unplug the data cable. When powering a hotswapped drive back up do it the other way - data cable then power cable. Never mess with that data cable if the drive is running. Unplugging the data cable first can cause bad things to happen, or so I have been told. We had a hardware course at the college I attended. The guy who was teaching it really knew his hardware and that was the way he recommended doing it. He explained why but it has been too long... something about toasting the controller.
A neat trick based on this is hotswapping for data recovery. If you lose a hard drive to a bad ondisk controller and you have another identical hard drive, boot from the good drive, then follow the above steps to swap them - chances are that you can get your data back this way unless the controller on your bad drive is really fried.
Just my $0.02
Hell is being intelligent in a world full of idiots.
Places I've worked at commonly throw old Adaptec ISA controllers in the junk bins never to be seen again until someone rips them off. You might want to check Ebay - an ISA SCSI card + older CD-ROM shouldn't set you back that much. There are also newer 'budget' PCI SCSI cards with no BIOS, which I think is OK if you are booting from IDE.
--
Business. Numbers. Money. People. Computer World.
A typical motherboard today has two onboard (E)IDE controllers, called the primary and the secondary. The primary controller is usually assigned IRQ14, and the secondary IRQ15.
Each controller can control two (E)IDE devices, but can only actually read or write to ONE of those devices at a given moment. The first device is called the 'master' and the second device the 'slave'.
However, the controllers are independent of each other, meaning that you can access a drive on the primary controller while simultaneously accessing a drive on the secondary controller.
For example, one way to essentially double your disk swapping performance is to put half your swapspace on a drive attached to the primary controller, and the other half on a drive attached to a secondary controller. (Note that when 'swapon'ing the swapfiles or swap partitions, they need to be assigned the same priority. 'man swapon' for details.)
Hope this helps.
mdm
And you're helping whichever OS you use... Of course, you're an "Anonymous Coward", so you are probably just one of the 'net's great trolls... Jeez, if you REALLY want a troll, how 'bout this one?
linux sucks, Linus sucks, ESR sucks, Windows rules forever!!!!
Hmmm, I wonder if I can set the record for lowest score on
[rant on] But seriously, the Linux community is in danger of falling into the same trap that works against the Macintosh. Fans of the system are becoming so rabid in their fanaticism that they take offense at any slight against it, even if it's true. Take the [ominous music here] infamous Mindcraft survey; other, independent sources (Ziff Davis, maybe not the greatest of sources, but still independent) have confirmed that under the testing conditions supplied, Linux really is slower than Windows NT Server. But do most Linux users sit down and say "Hmmm, well, that's a surprise. Now how do we go about fixing Linux so it is faster?" No, the vast majority of the posts on Slashdot were ones to the effect of "Mindcraft is evil, they must be burned at the stake for heresy!"
Remember, in your religious pursuit, don't go so far as to refuse to accept facts just because they go against your beliefs. Personally, I think that the worst at this is none other than good ol' Eric S. Raymond. Yup. He is the Rev. Falwell of the Open Source movement. Fine, fine, he did plenty of good things, but he should stick to coding, as he does not make a good spokesperson.
[rant off] Remember, we should not only be open source, but open minded.
Another non-functioning site was "uncertainty.microsoft.com."
The purpose of that site was not known.
Promise (and most likely, other companies,) makes a port expander card. This is an older card, but still does the job. EIDE compatible, no UDMA on this one. This will give you 4 IDE channels (most modern motherboards have 2 built in, this gives you an ADDITIONAL 2 channels.) It can be found here. Note: This is an ISA card.
Promise also makes their Ultra33 expander card. This card supports UDMA33, and once again, adds an additional 2 channels. It can be found here. Note: This is a PCI card.
For those who really want speed, once again, Promise comes through with their Ultra66 expander card. This card supports UDMA66, and, like their previous cards, adds 2 channels, leaving your original 2 free for other devices (or more hard drives). It can be found here. Note: This is a PCI card.
By giving your machine 4 IDE channels, you will have the option of connecting up to 8 IDE devices, including hard drives, cd-rom drives, and the like. You should (if I'm thinking correctly..) be able to read/write from 4 of these devices simultaneously (one device from each channel). This is probably what the HOWTO or whatever is talking about (needing 3 controllers/channels/whatever). Accessing 2 devices on the same channel will be somewhat slower.
Don Head
UNIX/Linux Administrator
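The channel arithmetic in the post above can be sketched as a toy model. It assumes, as several posts here state, that a channel services only one of its two devices at a time and that channels are otherwise independent (which ignores the shared-buffer caveat raised earlier about onboard controllers). The layouts and function names are hypothetical.

```python
from collections import defaultdict


def drives_by_channel(layout):
    """Group drive names by the IDE channel they hang off."""
    channels = defaultdict(list)
    for drive, channel in layout.items():
        channels[channel].append(drive)
    return dict(channels)


def max_concurrent(layout):
    """At most one active transfer per channel in this simple model."""
    return len(drives_by_channel(layout))


def contended(layout):
    """Channels where a master and a slave have to take turns."""
    return {ch: d for ch, d in drives_by_channel(layout).items() if len(d) > 1}


# Three RAID members squeezed onto the two onboard channels.
onboard_only = {"hdb": "ide0", "hdc": "ide1", "hdd": "ide1"}
# The same three members spread out with an add-in card (ide2).
with_card = {"hda": "ide0", "hdc": "ide1", "hde": "ide2"}

print(max_concurrent(onboard_only), contended(onboard_only))  # 2 {'ide1': ['hdc', 'hdd']}
print(max_concurrent(with_card), contended(with_card))        # 3 {}
```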
I'm running a software IDE raid on my box, Pentium 233 MMX, and it runs fine, if a little slow. Using the default 2 channels built into the motherboard, I have PriMaster as the boot drive and the PriSlave+SecMaster+SecSlave as the Raid5. I'm sure it would be faster with separate controllers / cables for each drive, but I was looking for redundancy, not speed.
I'm curious about IDE software RAID as well. Here
are a couple bits of info that may be relevant. I
haven't tried any of this yet.
First, http://www.linuxhq.com/doc23/ide.txt
This claims 2.1/2.2 kernels have:
> - support for up to *four* IDE interfaces on one or more IRQs
> - support for any mix of up to *eight* IDE drives
And further in the document there is info on
configuring such a system, which claims that you
can run as many as 6 interfaces (3 controllers?):
> This is the multiple IDE interface driver, as evolved from hd.c.
> It supports up to six IDE interfaces, on one or more IRQs (usually 14 & 15).
> There can be up to two drives per interface, as per the ATA-2 spec.
>
> Primary: ide0, port 0x1f0; major=3; hda is minor=0; hdb is minor=64
> Secondary: ide1, port 0x170; major=22; hdc is minor=0; hdd is minor=64
> Tertiary: ide2, port 0x1e8; major=33; hde is minor=0; hdf is minor=64
> Quaternary: ide3, port 0x168; major=34; hdg is minor=0; hdh is minor=64
> fifth.. ide4, usually PCI, probed
> sixth.. ide5, usually PCI, probed
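The naming pattern in that quoted table is regular enough to write down. A small sketch covering only the four interfaces whose major numbers appear in the quote (the function name is made up; ide4 and ide5 are left out because the quote gives no numbers for them):

```python
# Major numbers for ide0..ide3, copied from the quoted ide.txt table.
IDE_MAJORS = [3, 22, 33, 34]


def ide_device(interface, unit):
    """Return (name, major, minor) for a whole-disk IDE device.

    interface: 0-3 for ide0..ide3; unit: 0 for master, 1 for slave.
    """
    if interface not in range(len(IDE_MAJORS)) or unit not in (0, 1):
        raise ValueError("only ide0..ide3, master (0) or slave (1)")
    name = "hd" + chr(ord("a") + 2 * interface + unit)
    return name, IDE_MAJORS[interface], 64 * unit


print(ide_device(0, 0))  # ('hda', 3, 0)
print(ide_device(2, 1))  # ('hdf', 33, 64)
```

So a drive on the first channel of an add-in card that lands on ide2 would show up as hde (master) or hdf (slave).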
For UDMA/66, the only controller I know of is the
Promise one. From the Ultra-DMA Mini-Howto:
> 5.2 Promise Ultra66
>
> This is essentially the same as the Ultra33 with
> support for the new UDMA mode 4 66 MB/sec transfer
> speed. Unfortunately it is not yet supported by
> 2.2.x
>
> There is a patch for 2.0.x and 2.2.x kernels
> available at
> http://www.dyer.vanderbilt.edu/server/udma/, and
> support is included in the 2.3.x development
> kernel series at least as of 2.3.3.
>
> However to get far enough to patch or upgrade the
> kernel you'll have to pull the same dirty tricks
> as for the Ultra33 as in the section above.
You may also want to check out the linux raid
mailing list
http://linuxwww.db.erau.edu/mail_archives/.
Good luck! Please post your results to the mailing
list and/or comp.os.linux.hardware.
Joel Auslander
ausland@digital-integrity.com