Hard Drives Made for RAID Use
An anonymous reader writes "Hard drive giant Western Digital recently released a very interesting product, hard drives designed to work in a RAID. The Caviar RE SATA 320 GB is an enterprise level drive without native command queueing and uses an SATA interface. In works better in RAID than other drives because of features like its time-limited error recovery and 32-bit CRC error checking, so it is an option when previously only SCSI drives would be considered."
"In works better in RAID..."
You should change "In" to "It"
Thank you very much.
You can't sell your oh so cool hardware review site for millions of dollars and retire at age 12 just because Cowboy Neal posts your article on slashdot!
p.s. Pay attention in English class.
Summary of article:
The Good (+)
- Very good performance
- Looks cool (for a hard drive)
- Optimized for RAID use
The Bad (-)
- High initial investment
http://www.wdc.com/en/products/Products.asp?DriveID=92
I bought one to replace what I thought was a bad drive in a RAID configuration about a year ago.
Proper TechReport's review here.
Go read. Now!
It's not an error by NewEgg. Follow the link to the manufacturer's site, and you'll see the same specification:
http://www.wdc.com/en/products/Products.asp?DriveID=114
In short, without NCQ, SATA drives are going to be slower than SCSI. The other two features probably just offset/mitigate the speed differences, but I would probably hold out for something that has NCQ (or just go SCSI) if I were building a RAID today.
Make sure everyone's vote counts: Verified Voting
On the NewEgg link they list the MTBF as 1 million hours. Google tells me that is about 114 years. How can it have such a high MTBF?
MTBF is defined as [short time period] * [number of drives tested] / [number of drives which failed within that time period]. An MTBF of 114 years doesn't mean that half of the drives will survive for 114 years without a failure; it means that if you run 114 drives for a year, you should expect to have 1 failure.
A more intuitive way of conveying the same information is to say that the drives have an expected failure rate of no more than 1E-6 per hour.
Tarsnap: Online backups for the truly paranoid
Easy: You, like most people, don't know what MTBF means. MTBF is only meaningful in the context of the expected lifespan of the device. This is probably somewhere in the neighborhood of 5 years, or about 43,800 hours. Essentially, what the manufacturer is saying is "Based on some data, we estimate that if you run x number of these drives, the average time between failures will be 1,000,000/x hours, up until the expected lifespan of the drive, at which point all bets are off."
For computer hardware this is always some sort of extrapolated estimate, since they have of course not actually been testing the drive for its expected lifespan, or it would be obsolete by the time they released it.
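To put numbers on the MTBF arithmetic above, here is a rough sketch. It assumes a constant failure rate of 1/MTBF per drive-hour during the service life (the usual simplification, not something the manufacturer states), and the function names are made up for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap days ignored


def mtbf_years(mtbf_hours):
    """Convert an MTBF figure from hours to years."""
    return mtbf_hours / HOURS_PER_YEAR


def expected_failures(mtbf_hours, drives, hours):
    """Expected number of failures when `drives` drives run for `hours` each.

    Assumes a constant failure rate of 1/MTBF per drive-hour.
    """
    return drives * hours / mtbf_hours


def fleet_hours_between_failures(mtbf_hours, drives):
    """Average time between failures somewhere in a fleet of `drives` drives."""
    return mtbf_hours / drives


def failure_probability(mtbf_hours, hours):
    """Chance that a single drive fails within `hours` (exponential model)."""
    return 1.0 - math.exp(-hours / mtbf_hours)


mtbf = 1_000_000  # the figure quoted on the spec sheet

print(round(mtbf_years(mtbf), 1))                # 114.2 years
print(expected_failures(mtbf, 114, HOURS_PER_YEAR))   # about 1 failure per year
print(fleet_hours_between_failures(mtbf, 100))   # 10000.0 hours for 100 drives
print(failure_probability(mtbf, 5 * HOURS_PER_YEAR))  # a few percent over 5 years
```

Under these assumptions a single drive has only a few percent chance of failing during a 5-year service life, which is a very different statement from "it will last 114 years".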
Why?
NCQ allows a hard drive to reorder commands/accesses to suit its current head position. Depending on your app you might not see a lot of benefit from it, e.g. when you do serial access all the time, but the lack of it will certainly cause degradation when multiple apps are active. Also, using one big hard drive instead of multiple smaller ones is putting all your eggs in one basket. Mechanical problems are more frequent than magnetic ones for a hard drive.
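A toy illustration of the reordering idea, not how any real firmware does it: the queue is a list of hypothetical track numbers, the "cost" is just head travel, and the greedy nearest-first policy is an assumption standing in for whatever the drive actually uses (real drives also account for rotational position).

```python
def head_travel(start, order):
    """Total distance the head moves when servicing requests in the given order."""
    travel, pos = 0, start
    for target in order:
        travel += abs(target - pos)
        pos = target
    return travel


def nearest_first(start, pending):
    """Greedy reordering: always service the pending request closest to the head."""
    remaining = list(pending)
    order, pos = [], start
    while remaining:
        nxt = min(remaining, key=lambda track: abs(track - pos))
        remaining.remove(nxt)
        order.append(nxt)
        pos = nxt
    return order


start = 50

# Several apps hitting different parts of the disk, in arrival order.
mixed = [90, 10, 80, 20, 70]
print(head_travel(start, mixed))                        # 300 without reordering
print(head_travel(start, nearest_first(start, mixed)))  # 120 with reordering

# Purely serial access: reordering changes nothing.
serial = [51, 52, 53, 54]
print(head_travel(start, serial))                        # 4
print(head_travel(start, nearest_first(start, serial)))  # 4
```

The serial case shows why a single streaming app sees little benefit, while the mixed queue is where the lack of reordering hurts.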
This sig doesnt exist.
RTFA - they used a different type of encoding on these drives in order to implement the 'time-limited error recovery.' The problem is that the encoding is done on three-vector bi-furate substrate instead of the two-vector bi-furate substrate used in the Raptors, and the 3V stuff can't handle speeds of the 10k RPM (the lateral acceleration at 10k RPM is significantly more than at 7,200 RPM, and the 3V stuff is taller than the 2V stuff - hence the problem.)
Glonoinha the MebiByte Slayer
It also depends on what you want to be doing with it. I've played with both hardware and software RAID5 at home and at work. Software RAID offers excellent bandwidth, and seems to use very little CPU time. This is why I think a P3 should work. However, the seek time is terrible. Perhaps it has something to do with the RAID intelligence being located so much farther away from the drives than it would be with a dedicated RAID card. I've tried running an SQL server on soft IDE RAID on a dual Xeon 3.2, and it had the snot kicked out of it by a dual P3 700 with an ancient MegaRAID-driven SCSI array.
As for running it as a home directory for Win/Mac/Linux, between Samba and NFS you should be just fine. You may even be able to go the fancy route and set up a few logical volumes as iSCSI targets and run your own SAN.
Never eat more than you can lift -- Miss Piggy
Well, Serial Attached SCSI has started to ship
;)
http://www.adaptec.com/sas/index.html?source=home
The pro level is already moving to it, but I suspect it will be OK for home use too, given the enterprise features it offers.
I checked a bit you know
8 drives on four controllers.
You could get around that if you were to use an Adaptec Serial ATA RAID 2810SA with 8 ports or a more expensive Adaptec Serial ATA RAID 21610SA with 16 ports.
You might look at the price and say it's too expensive, but the speed and available configurations should make up for it. Besides, I got mine for around $425, which is less than their suggested price. Also, both of these cards can use the wasted space from mismatched drive sizes, as well as run multiple RAID volumes on each drive. What I like the most is the hot-swap and hot-spare support, where you can just leave a blank drive in, and if another drive fails it automatically recovers with the spare and you can replace the bad drive without rebooting. Another thing I like about the card is that it is a full controller and not one of these host-based things. Your computer will just see it as a hard drive (or drives) without any special drivers. You can even access them from DOS, most Linux kernels, as well as Windows 95 and 3.11 (note that the drives had to be small for 3.1 and 95 to see them correctly).
BTW, I don't work for Adaptec or sell their stuff. I'm just impressed with a product that finally took away a lot of the frustration associated with cheaper IDE and SATA add-on cards. I'm sure there are better solutions available; this is just one that I have found. Most of the cheaper (under $100) IDE, SATA, or RAID controllers I have found rely on the host system to do their work. This is why you need a special driver in Windows or Linux to use them correctly. The extra communication here could be what is saturating your PCI bus (or helping to saturate it).
One of the major reasons for the high price of most hardware RAID5 solutions is the hot-swap backplane. If you are OK with a solution where you would have to shut down the server in order to replace a bad drive (which would be OK for most home use, I would imagine), you can find some *very* cheap hardware RAID controllers ($50, for both ATA and SATA) that will do the job just fine...
Buffalo TeraStation
Supports RAID 5.
I emailed to ask whether external USB hard drives could be added and swapped into a RAID 5 array, and whether it can be done "on the fly"...
but all I got was this lousy message:
"Please call (800) 456-9799 x. 2013 between 8:30 and 5:30 CT and our presales guys will be able to assist you."
I'm one of those weird people that would rather communicate in writing. Oh well - no sale.
Spoon not. Fork, or fork not. There is no spoon.
I would think if these drives are really designed for RAID (like other drives have been in the past), then they would have support for synchronized spindles.
The idea behind synchronized spindles is that in order to read data from a disk, you have to wait for the platter to come around part of a revolution for your data to become available, just like picking up your suitcase on the luggage carousel at the airport. How long you need to wait is a matter of luck, because the disk can be assumed to be in a random position when you decide you want your data. When you have RAID without synchronized spindles and you want data that's bigger than the stripe width (or when you're writing and need to update the parity), you have to wait for multiple disks, and they will tend to be spread out so that you tend to wait longer than if you were just waiting for one. With synchronized spindles, as soon as the whole group hits the right position, you've got what you're looking for, and you're done.
So, the point is, not having synchronized spindles tends to increase average access time, so having synchronized spindles is a desirable feature for a drive designed specifically for RAID.
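Here is a back-of-the-envelope model of that argument. It assumes each unsynchronized disk is at an independent, uniformly random rotational position, ignores seek time entirely, and uses 7,200 RPM purely as an example figure. Under those assumptions one disk (or a synchronized group) waits half a revolution on average, while N unsynchronized disks wait for the slowest one, which averages N/(N+1) of a revolution.

```python
import random


def revolution_ms(rpm):
    """Time for one full platter revolution, in milliseconds."""
    return 60_000 / rpm


def expected_wait_revs(disks, synchronized):
    """Average rotational wait, in revolutions, before all disks have the data.

    Synchronized spindles behave like a single disk (mean 1/2 revolution).
    Unsynchronized disks wait for the slowest of `disks` independent
    uniform delays, whose mean is disks / (disks + 1).
    """
    return 0.5 if synchronized else disks / (disks + 1)


def simulate_wait_revs(disks, synchronized, trials=100_000, seed=0):
    """Monte Carlo check of expected_wait_revs under the same assumptions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        if synchronized:
            total += rng.random()
        else:
            total += max(rng.random() for _ in range(disks))
    return total / trials


rev = revolution_ms(7200)
print(round(expected_wait_revs(4, True) * rev, 2))   # 4.17 ms, synchronized
print(round(expected_wait_revs(4, False) * rev, 2))  # 6.67 ms, unsynchronized
print(simulate_wait_revs(4, False))                  # should land close to 0.8
```

So for a four-disk stripe the model predicts roughly 2.5 ms of extra average latency per multi-disk access without synchronization, and the penalty grows toward a full revolution as disks are added.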
Again, RAID is not an excuse to not back up your server. If you don't back up your server, you will lose data, no matter how many drives you have in a RAID-5 array!
Not to niggle, but your assessment is incorrect. RAID-0 is equally as redundant as RAID-1. There are two disks performing a job that either one of them could do alone. The second disk is redundant. RAID-0 uses the redundancy for performance instead of continuous backup.
OK, so it was just to niggle.
Eloi are stupid, throw morlocks at them!
Since I wanted some facts, Wikipedia ordered two systems for database service, both dual Opterons with 4GB of RAM and six drives. One with 10,000 RPM SCSI drives and one with 10,000 RPM SATA drives. The SATA system, without NCQ, was generally faster and ended up with a higher proportion of the site load assigned to it. The SCSI system was sometimes faster in mixtures which included lots of writes with lots of reads and that made it lag a bit less in replication of bulk update operations, so newer systems have been SCSI. If more drive bays had been available, adding another couple of SATA drives would probably have made the SATA set faster for that case as well and still cheaper.
:)
If lower access times are needed, SCSI drives beat SATA drives just because you can only get 15,000 RPM with a SCSI interface. May also make sense to have 15,000 RPM drives if you're already spending a lot of money on 16GB of RAM.
The question about this drive which interests me is whether drive write caching can be easily turned off and will stay off, so you don't lose database data when the database thinks the data has been flushed to the surface but it hasn't really been flushed. If you can't do that, it's unsuitable for a lot of database work - certainly unsuitable for use with RAID controllers with battery backed up write caches, where you have the battery to make sure you don't lose cached data if the power goes off. Anyone who thinks colo power and UPS will protect against loss of power hasn't suffered enough yet...
So, I have seven database servers, all with identical copies of the data. Do I really care if I lose all the data on one of them because one drive in a RAID 0 set fails? The completely redundant systems do the job better than any RAID setup can.
You consider RAID 0 when you don't care about losing the data if there's a drive failure and want the benefits of striping and the extra space available for a given number of drive bays, compared to other RAID levels. RAID 5 can get you some of the space but it's slower for database work.
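For the space side of that trade-off, a quick sketch using the standard capacity rules for each level. The six bays match the servers described above; the 320 GB drive size is borrowed from the drive in the story and is only an example, as is the function name.

```python
def usable_capacity_gb(level, drives, size_gb):
    """Usable space for an array of identical drives at a given RAID level."""
    if level == 0:
        # Pure striping: every drive holds data, no redundancy.
        return drives * size_gb
    if level == 5:
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        # One drive's worth of space goes to parity.
        return (drives - 1) * size_gb
    if level == 10:
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        # Every drive is mirrored, so half the space is usable.
        return (drives // 2) * size_gb
    raise ValueError(f"unsupported RAID level: {level}")


bays, size = 6, 320
for level in (0, 5, 10):
    print(f"RAID {level}: {usable_capacity_gb(level, bays, size)} GB")
# RAID 0: 1920 GB
# RAID 5: 1600 GB
# RAID 10: 960 GB
```

RAID 5 gets within one drive of the RAID 0 capacity, but every small write costs extra parity I/O, which is the slowdown for database work mentioned above.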