Slashdot Mirror


Where are the High-Capacity SCSI Drives?

An anonymous reader asks: "Storage technology has really exploded in recent years, giving us ATA drives up to and exceeding 200-250 GB per drive. Why is it that SCSI drive technology has remained stagnant? I can't find a SCSI drive exceeding about a 146 GB capacity. Instead, businesses (and some individuals) wanting greater storage capacities are required to buy more drives which takes up more space, generates more heat, provides more points of failure, uses more electricity, etc. Why is this so?"

28 of 138 comments

  1. My Guess by MBCook · · Score: 4, Insightful
    My guess is a simple one. Who buys SCSI stuff? It's expensive, so it's mostly businesses and others who need high reliability (which is one of the major reasons SCSI is more expensive). Now while normal people can "afford" to lose 250, 300, or more GB of data, for a business that could be worth billions of dollars.

    The solution to this reliability problem is the RAID. There are two RAID levels that are ideal (there are more, but this is a simple explanation). There is 1, which is just a mirror; and 5, which is striping with parity.

    With RAID 1, if you have 500 GB of data, you would need 2 500 GB drives. You lose 50% of the capacity you buy. The other option is RAID 5, where you lose (1/number of disks). So you could store 500 GB of data on 6 100 GB disks. This way you've only lost 100 GB of storage to redundancy as opposed to 500 GB.
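
    To make the arithmetic above concrete, here is a minimal sketch. The function name `usable_capacity_gb` is made up for this illustration, and it assumes equal-sized disks and ignores hot spares and filesystem overhead.

```python
def usable_capacity_gb(level, disks, disk_gb):
    """Usable capacity for the two RAID levels discussed above (illustrative only)."""
    if level == 1:
        # Mirror: every disk holds the same data, so only one disk's worth is usable.
        if disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return disk_gb
    if level == 5:
        # Striping with parity: one disk's worth of space goes to parity.
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_gb
    raise ValueError("only RAID 1 and RAID 5 are modelled here")


raid1 = usable_capacity_gb(1, disks=2, disk_gb=500)
raid5 = usable_capacity_gb(5, disks=6, disk_gb=100)
print(raid1, 2 * 500 - raid1)  # 500 usable, 500 lost to redundancy
print(raid5, 6 * 100 - raid5)  # 500 usable, 100 lost to redundancy
```

    Both layouts hold the same 500 GB; the difference is how much raw capacity you pay for to get there.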

    So when businesses want to store large amounts of data, it's more economical to use many smaller drives than a few large drives. Even if you don't need the redundancy (for example the disk is just being used for temporary storage while working on large digital picture or video files) it's still better to use many small disks. While a single 500 GB drive will only go so fast (let's just say 60 MB/s sustained), by using a RAID you can multiply that. So by using 5 100 GB drives, you might be able to sustain 300 MB/s (assuming the bus can keep up, etc). Even if you only scale at 50% (that would be 150 MB/s) that's still 2 to 3 times faster than a single drive. That performance can save you money.
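
    A rough sketch of the striping estimate above. The helper `striped_throughput_mb_s` and its `efficiency` parameter are hypothetical names; the 60 MB/s figure and the 50% scaling case come from the comment, and a real array will also be limited by the bus and controller.

```python
def striped_throughput_mb_s(drives, per_drive_mb_s, efficiency=1.0):
    """Naive estimate: each drive contributes its sustained rate, scaled by an efficiency factor."""
    return drives * per_drive_mb_s * efficiency


single = 60  # MB/s sustained, the figure assumed in the comment
ideal = striped_throughput_mb_s(5, single)          # perfect scaling
half = striped_throughput_mb_s(5, single, 0.5)      # only 50% scaling

print(ideal)          # 300.0
print(half)           # 150.0
print(half / single)  # 2.5
```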

    So, if you can afford it, you can get much better performance or economics from using multiple smaller drives than from one large one.

    That's my theory/understanding. Begin tearing it apart!

    --
    Comment forecast: Bits of genius surrounded by a sea of mediocrity.
    1. Re:My Guess by Robbat2 · · Score: 4, Informative

      RAID is a wonderful concept, but work needs to be done on points of failure other than the drives.

      Most decent external RAID units today have dual hot-swappable power supplies and fans. However, there is still only a single backplane and RAID controller board (IBM PowerPC chips are very popular for this) involved. I've had both a backplane and a controller fail on me in the span of 2 years, in both cases taking all the data with them. These units were 6x200GB IDE drives, 1TB usable, 1 parity drive, and we had several cold spares available to hot-swap in on a failure.

      Sure, I agree that statistically your drives, fans and power supplies are much more likely to fail than the backplane or controller, but it can still happen.

      Never forget the importance of having backups, and make sure you can recover from them as part of implementing your backup solution. (1 month rotation of Ultrium tapes here).

      There is a solution to the above, but it's very costly, and that's RAID over distributed storage (iSCSI and the like).

      --
      ICQ# : 30269588
      "I used to be an idealist, but I got mugged by reality."
    2. Re:My Guess by Alphanos · · Score: 2, Informative

      I think it is more likely that high rotation speed doesn't combine easily with high capacity. If you are spinning the disks more than twice as quickly as standard ATA drives (15k vs. 7200 rpm), then having the same data storage density isn't going to work without new technological developments. In other words, when the disk reading head moves at twice the speed, the bits need to be roughly twice as large. This is why the first CD drives didn't read at 52x: they needed time to develop the technology that allows the reading of that data density at that high of a speed.

      I'm not familiar with which mathematical formula is involved, but from this perspective, 150 gb scsi drives operating at twice the speed seems reasonable compared to 300 gb ata drives. I suspect a similar reason is responsible for the low capacity of the 10k WD Raptors (serial ata drives) which have capacities of only 36 or 72 gb!
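
      Taking the premise above at face value, the formula would just be an inverse proportion. This is only a sketch of that premise, not a model of real drives: it assumes bits can pass under the head at some fixed maximum rate, so linear density (and with it capacity, if track density and platter count stay the same) scales with the ratio of rotation speeds. `scaled_capacity_gb` is a made-up name.

```python
def scaled_capacity_gb(base_capacity_gb, base_rpm, target_rpm):
    """Capacity at target_rpm if the bit rate under the head is held constant.

    Assumption for illustration: everything except linear density is unchanged.
    """
    return base_capacity_gb * base_rpm / target_rpm


print(scaled_capacity_gb(300, 7200, 15000))  # 144.0
print(scaled_capacity_gb(300, 7200, 10000))  # 216.0
```

      Under that assumption a 300 GB 7200 rpm design comes out at 144 GB when spun at 15k, which is at least in the neighbourhood of the SCSI capacities mentioned in the question. It does not account for platter size or platter count.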

      --
      Alphanos
    3. Re:My Guess by lylonius · · Score: 2, Interesting

      You make good arguments, but reliability and storage capacities are only two of the issues involved.

      The largest benefit is performance. Gamers invest so much in their system bus, cpu, and memory, but disk i/o is 5 orders of magnitude slower. if performance is key, a small investment in SCSI improves disk intensive apps considerably.

      1. IDE requires CPU cycles. SCSI buses have embedded ICs that handle queuing of data and such, freeing the CPU to perform other tasks.

      2. IDE channels are shared. Most IDE ribbons allow for two devices, but only one device can talk on the channel at a time, much like CSMA/CD, whereas SCSI allows you to daisy-chain 7 or more devices to simultaneously talk on the same channel.

      3. IDE is not bidirectional. similar to (2), this causes read/writes to wait.

      one reason that you didn't mention, that falls under reliability is SCA. this interface combines signalling, power input and data i/o, which enables hot-swappable SCSI drives, critical to any non-appliance or diskless system that requires high availability.

    4. Re:My Guess by innosent · · Score: 2, Informative

      Higher-end controller cards (read: NOT IDE) can share a bus with another controller, allowing two systems (with a controller in each) to share access to a single (external) array, with dual power supplies, and technologies like SSA allow even the cabling to be redundant.

      Of course, by the time you spent the money on this type of setup, you could probably have purchased another complete machine, with another array in it, and used software to handle redundancy and updates to the array. We did this with our SQL server setup. We have two machines with redundant power supplies and RAID-10 arrays, set up in a load-balancing cluster with one as an updating subscriber to the other (updates are sent between the machines real-time, losing about 10% on write performance, but doubling read performance). If you share the array between machines, you will save the 10% or so on write performance, but won't gain the read performance, so it all depends on your primary usage for the storage system. Of course, you can't completely avoid downtime (something WILL happen eventually), but by having a separate system, you can reduce the chances of having downtime, and reduce the length (since only one has to be up).

      --
      --That's the point of being root, you can do anything you want, even if it's stupid.
    5. Re:My Guess by innosent · · Score: 3, Insightful

      Gamers? How many gamers really NEED large (>147GB) disks? SCSI drives are not produced for gamers, they are produced for business workstations and servers. I agree with you about SCSI being better, but the reasons you gave don't apply to all IDE controllers (number 1), and certainly not to all SATA controllers/disks (all reasons). A GOOD (i.e. usually not onboard, probably something from 3ware, etc.) SATA controller has a processor, command-queueing, separate, bi-directional channels for each device, and SATA connectors are designed for hot swapping (better than SCA actually, even to the point of connections being made in sequence due to staggered pins). I've got a 12-disk SATA RAID-5 array at work, and don't have any of the problems you listed, because I hand-picked the hardware to avoid those (and other) limitations. If you really want your games to run as fast as possible, then it's going to cost a few thousand dollars anyways, and if you really need that much space, maybe it'd be a good idea to buy a decent controller.

      --
      --That's the point of being root, you can do anything you want, even if it's stupid.
    6. Re:My Guess by megabeck42 · · Score: 3, Informative

      1. No Longer True.

      This has, in large part, disappeared with the advent of UDMA. It was true that IDE was very cycle expensive a decade ago, when IDE really just meant Integrated Drive Electronics. The IDE "interface" was just a set of tri-state latches and the CPU would be responsible for pushing and reading every single byte. If you ever look at the pinout for an IDE cable, it's no surprise that it very closely resembles the ISA bus. Another historical note, ATA means AT-Attachment because the first set of IDE drives that were really popular were designed to attach to the IBM PC AT (the successor of sorts to the IBM PC XT) bus.

      Now, processors queue dma requests in and out of the drive and the "interface" really has grown up to be more of a "controller." They're not as complex as the SCSI adapters, of course, but then again, SCSI is a much more complex signaling system.

      2. No Longer True.

      What you're trying to describe is called "bus disconnect." I'm not sure which side of the bus was responsible; however, the idea is that while a drive was processing a command, the bus was locked until the command finished.

      Note, the first version of SCSI did not have Disconnect either. However, with many more devices sharing the bus, bus contention was so much more severe, especially with slow devices like tape drives and cdroms, that it became a necessity rather than just a feature.

      SCSI supports disconnection as well as Tagged Command Queueing. TCQ allows the host to issue multiple outstanding commands to the device. The device is allowed to complete these commands out of order. Many drives will reorder the requests to take advantage of the head movement.

      Recent revisions of IDE include support for TCQ.

      I will add, however, that it is still worthwhile to have only one device per channel. Compare this to putting more than two 15K drives on a U160 channel.

      3. Not even remotely true. SCSI is a parallel bus, much like IDE, ISA, or half a dozen others. It's only possible for one device to drive the bus at one time. This is clearly evident since a few of the lines in the SCSI cable are used to indicate the Target of the bus transaction. There is only one set of these signals, therefore, there can only be one target.

      Also, the electrical interface for Serial ATA is designed with hot-swap in mind.
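
      To illustrate why the out-of-order completion described under point 2 helps, here is a toy sketch. It is not how any particular drive's firmware works (real schedulers also account for rotational position); it just compares total head travel when queued requests are served in arrival order versus a simple elevator-style sweep. The function names, block addresses, and starting position are all invented for the example.

```python
def head_travel(start, order):
    """Total distance the head moves when serving requests in the given order."""
    position, total = start, 0
    for target in order:
        total += abs(target - position)
        position = target
    return total


def elevator_order(start, pending):
    """Sweep upward from the current position, then come back down for the rest."""
    ahead = sorted(t for t in pending if t >= start)
    behind = sorted((t for t in pending if t < start), reverse=True)
    return ahead + behind


queue = [900, 100, 800, 200, 700]  # hypothetical block addresses, in arrival order
start = 500                        # hypothetical current head position

print(head_travel(start, queue))                         # 3000
print(head_travel(start, elevator_order(start, queue)))  # 1200
```

      The drive can only do this if the host is allowed to have several commands outstanding at once, which is what TCQ provides.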

      While your first suggestion is accurate, disk i/o is very slow and SCSI equipment tends to be of better quality than IDE hardware. SCSI drives with higher spindle speeds have much lower latency, which can make a dramatic difference compared to a similar computer with IDE drives. However, that difference is no fault of IDE. I would encourage you, in future, to be more accurate with your information.

      If you believe I have written inaccurately, I would recommend reading the draft documents from INCITS T13, the ATA technical committee.

      --
      fnord.
  2. Re:What speed are most SCSI drives? by MBCook · · Score: 3, Interesting
    SCSI will be replaced by SAS, or Serial Attached SCSI. That is basically a superset of Serial ATA (IIRC). All the benefits of going from ATA->Serial ATA apply to SCSI->SAS. The smaller cables, the longer lengths, the lower voltages, etc. SCSI has had command queueing and hot-plug for a while already though.

    Everyone is going serial. USB, SAS, Serial ATA, etc. Time to invest in Kellogs.

    Oops, wrong "cerial".

    (sorry for the pun, couldn't help it).

    --
    Comment forecast: Bits of genius surrounded by a sea of mediocrity.
  3. Re:What speed are most SCSI drives? by MBCook · · Score: 2, Interesting

    Yeah, I was thinking IDE and a standard 33mhz 32bit PCI bus. You're right, SCSI goes up to 320 and the bus can handle it if you use PCI-X, PCI express, 66mhz PCI, 64bit PCI, etc. Nice catch.

    --
    Comment forecast: Bits of genius surrounded by a sea of mediocrity.
  4. Re:What speed are most SCSI drives? by walt-sjc · · Score: 3, Informative

    Check out the MTBF numbers. They look similar until you see that desktop drives are rated with a low duty cycle - the typical 8 hour day as opposed to the 24 hour day servers are designed to run.

    As for real performance, my old 18G 7200 RPM IBM scsi drives are faster than my brand-new SATA raptors in real world applications (compiling the linux kernel for example.)

    So here's what I do. I use my scsi drives for my everyday stuff, and archive on the SATA drives (MP3's, old source / packages, etc.) That way I get my performance and reliability, and space. Since I have two of each, I just raid mirror.

    As for real world server applications, we run some Large raid arrays. We don't need the space as much as we need the performance you get with dozens of spindles spread over multiple channels on 64bit controllers.

  5. THE ANSWER by icandodat · · Score: 5, Informative

    This info is from an IBM Magnetic Storage Engineer. The reason is that the IDE market is a retail home market and very competitive. He said "If an IDE manufacturer can save 5 cents on a component he'll buy the cheaper one". The time from R and D to store shelf is less than a year. SCSI drives, on the other hand, are primarily for servers; they have expensive components and are tested for a long time before they reach the market. The time from R and D to store shelf is about three years for SCSI. What was the biggest drive you could buy three years ago (IDE)? That's right, about the same size as the biggest SCSI drive today. So ... what does this mean? IDE drives suck, they are cheap, they are the Ziploc bag of the storage industry. If you are going to grandma's with your data that's OK, but if it's going to the moon... buy Tupperware (SCSI).

    1. Re:THE ANSWER by Anonymous Coward · · Score: 3, Interesting

      I always find comments like this amusing. It's just not true.

      Not long ago I had to set up a several terabyte array (around 4 TB) using SCSI drives. We were constantly replacing the damn things. And this was supposedly quality hardware from Sun. Now, with as many drives as we had, there were bound to be failures. Eventually the failure rate stabilized at about 1 or 2 drives per month, a rate which continues to this day, some 3 years later.

      Previous to that array I had helped set up a similar system using PC components and IDE drives. The array was actually nearly twice as big at around 7 TB but cost less than the SCSI array. Guess what? In the last 3 years only 1 drive has failed. One drive.

      Which one is more reliable?

      Fuck SCSI.

    2. Re:THE ANSWER by Kevin+Burtch · · Score: 2, Interesting


      I'm very curious which Sun array this is, and which drives you are using.

      I've worked in the Sun market for well over a decade, and I haven't seen failure rates like you're describing since the old Seagate 2.9G 5-1/4" full-height drives they used to have in their "Mass Storage" cabinets (the ones that looked exactly like a SPARCcenter 2000)... and that was only after the drives were out of production for a few YEARS (all replacements were refurbs).

      My guess is you have serious environmental issues... heat/humidity due to a non-datacenter environment (do you have raised-floor-cooling? is it under 70F?), or non-isolated air (is the A/C air-handler the same one used for the rest of the building?), or you have non-isolated power and have regular spikes (in Miami maybe?).
      You need a _true_ UPS system, not an SPS labeled as a UPS.

      --
      - Preferences: Solaris 10 (servers), Ubuntu (desktops), Solaris 11 (personal servers) -
    3. Re:THE ANSWER by Phillup · · Score: 2, Interesting

      Were the IDE and SCSI drives rotating at the same speed?

      --

      --Phillip

      Can you say BIRTH TAX
  6. They do exist! by MarcQuadra · · Score: 4, Informative

    Hitachi/IBM produce the 300GB UltraStar 10K300, which is a mighty drive if I've ever seen one.

    The real reason is that when you move up to higher rotational speeds to reduce latency, you have to reduce density relative to the motion of the disk under the head, so a 10K drive can generally pack only 60%-ish as much data per-inch as a 7200RPM drive.

    The same can be seen in 15K disks, which are much lower density than their 10K counterparts. The 15K platters are smaller too, to keep them from flying apart.

    Do you remember when the 5400RPM disks had higher capacity than the 7200 ones? I sure do, it was for the same reason.

    Until the latency of the read-write head improves this will be the case.

    --
    "Sometimes, I think Trent just needs a cup of hot chocolate and a blankie." -Tori Amos on Nine Inch Nails
    1. Re:They do exist! by Pegasus · · Score: 3, Insightful

      Heck, then give me 3600rpm disks with transfer speeds of 20MB/s and capacity of 2TB! I'd gladly have a dozen of them to put my DVD collection on.

      I've heard some things about the new Hitachi 400gb drive being optimized for tv settop boxes. Does that mean that it's optimized for linear reads/writes? If so, why did they not decrease rpm in order to gain more capacity?

  7. Re:What speed are most SCSI drives? by MarcQuadra · · Score: 3, Informative

    Agreed, I've said this before, but my old 18GB Ultra2Wide (80MB/sec SCSI) drive can wipe the floor with my new DeskStar 180GXP (ATA-100).

    It's all about those command queues, they let the computer spit commands at the disk without having to see their immediate completion.

    I actually get better performance with my SCSI drive _mounted over NFS_ than I can with my previous local 40GB ATA-66 drive.

    --
    "Sometimes, I think Trent just needs a cup of hot chocolate and a blankie." -Tori Amos on Nine Inch Nails
  8. ...provides more points of failure... by strabo · · Score: 2, Funny
    ...provides more points of failure...

    Yeah, that's a problem. It's much better to reduce potential points of failure... preferably down to a single point of failure.

    Or is that not what you meant?

  9. Too slow to be useful? by TheLink · · Score: 5, Informative

    Drive speeds haven't really gone up tremendously. Still too slow.

    Imagine you have a 1TB drive, but were stuck at a 100MB/sec max seq transfer rate. It takes you 2.7 hours to read/write the entire drive. And that's for _sequential_ access. Gets ugly for random seek.

    A similar speed 10TB drive will take you more than a day (27+ hours) to read sequentially.
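
    The figures above are easy to reproduce. A minimal sketch, assuming decimal units (1 GB = 1000 MB) and a constant sequential rate for the whole pass, which real drives won't quite manage since inner tracks are slower; `full_read_hours` is a name made up for this example.

```python
def full_read_hours(capacity_gb, rate_mb_s):
    """Hours to read (or write) an entire drive sequentially at a constant rate."""
    seconds = capacity_gb * 1000 / rate_mb_s
    return seconds / 3600


print(round(full_read_hours(1000, 100), 2))   # 2.78  (1 TB at 100 MB/s)
print(round(full_read_hours(10000, 100), 2))  # 27.78 (10 TB at 100 MB/s)
```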

    Before the point where it takes too long to read an entire single drive you might as well start using multiple drives to add capacity rather than having bigger drives.

    Taking too long is subjective, but I'd say this: how long can you make your boss/customer wait whilst you are restoring an entire disk image from backup? 27 hours or 2.7 hours? or 25 minutes?

    So 70GB would be about the limit if you have impatient users and bosses.

    Larger capacities are OK if they are to hold data that aren't important enough to be backed up, and don't require masses of data to be available quickly. Or you are doing mirroring and read speeds are important but write speeds aren't as important (but remember that restoring from backup = writing ;) ).

    --
    1. Re:Too slow to be useful? by Cecil · · Score: 2, Informative

      You forget the important fact that as the drive DENSITY increases, so does the amount of data read per revolution of the platters. Bigger drive, faster transfer rate. Unless you're talking about limits on things like ATA, but those are being replaced and upgraded as needed.

    2. Re:Too slow to be useful? by TheLink · · Score: 2, Interesting

      "You forget the important fact that as the drive DENSITY increases, so does the amount of data read per revolution of the platters"

      The _evidence_ of actual transfer rates is more important than your "important fact".

      This might be helpful. Select WB99 transfer rate - Begin.

      If you have evidence of significantly faster single drives do let me know.

      --
  10. Re:What speed are most SCSI drives? by Micro$will · · Score: 2, Informative

    IDE sucks the life out of PC, even newer 3ghz+ pc's still pause when you put in a floppy or eject a cdrom in windows.

    That's a drive and/or Windows issue. When you insert a CD, the CDROM has to spin it up to read it, and then Explorer.exe (not Internet Explorer, Windows Explorer, A.K.A. the Windows "shell") immediately wants to know what's in it, so you have a slight lag, depending on background services, the drive, the media condition, etc. You can see what's going on by opening up Explorer while there's no CD in the drive, then watching the drive's reaction, and Explorer's reaction when the drive finally reads the CD.

    Some performance can be gained by keeping CDROM drives on separate IDE channels to prevent the CD drive from hogging the bus during moderate to heavy harddisk activity. As for floppy drives... you still use a floppy drive?!

  11. Re:Time to do some reading by shaitand · · Score: 2, Insightful

    "Business do not want high-capacity single SCSI drives"

    Why? they make for higher capacity raids.

    The more devices the controller has to be able to handle the more expensive, also although more drives means better overall performance, the overall efficiency goes down.

    The big thing is that, like beggars, home users have pretty much been locked out of SCSI. Even a single scsi drive yields better performance than an IDE drive.

    If scsi drives were offered widely in home pc's, there would obviously be a performance increase, but also most of the artificial price increase would disappear from them. Manufacturing methods would also be improved to support the increased demand and controllers would be included onboard as IDE are now. Before you know it SCSI would be just as cheap as IDE is today.

    Wouldn't that be better?

    It would be good for other things too, for instance faster buses would have already taken off in your standard desktop instead of just being a secondary bus in a few desktop boards and server boards.

    The result in the end would be cheaper, faster, more reliable storage and bus technologies for everyone. Too bad the drive manufacturers know that and do their best to keep scsi on the server ;)

  12. Re:What speed are most SCSI drives? by innosent · · Score: 2, Interesting

    Some SATA RAID controllers do support the advanced features offered by high-end SCSI RAID controllers, it's just that it seems strange to spend $700 on a controller for $1000 in disks, vs. spending $500 on a controller for $4000 in disks. Most recent server boards support 133MHz PCI-X, so bandwidth to SCSI devices is not an issue. The difference is speed and quality. Still though, if you don't need the absolute fastest array (ours is about 610MB/sec read, 300MB/sec write, RAID-10 striped within, mirrored across channels), SATA is a good solution. We use 6 15k U320 SCSI disks for our primary database system (54GB total array size), and 12 7.2k 250GB SATA disks for our document imaging system (2TB, RAID-5 with hot spares). Sustained transfers from either array are practically identical, but the access times to the SCSI array are much lower (though this is partly due to differences between RAID levels, as having a RAID-10 array means that the closest stripe to the data can get priority). Either storage method easily outperforms Gb Ethernet, so the only real difference is the access times for processes performed on the local machines (such as index and data lookups for a database), and aside from a few ms difference in query time, which is usually much smaller than the running time for the query, a remote user can't tell the difference.

    --
    --That's the point of being root, you can do anything you want, even if it's stupid.
  13. Reliability by MrResistor · · Score: 3, Interesting

    My company was offering 180GB SCSI drives in one of our RAID products, but we had to stop due to reliability issues. There was a huge difference in reliability between the 180GB and 146GB drives (which we still offer).

    --
    Under capitalism man exploits man. Under communism it's the other way around.
  14. The reason is speed. by Dirttorpedo · · Score: 3, Informative

    I will try to avoid the SCSI vs IDE flame war.

    1) RPM. It is easier to spin a 2.5" platter at 15K than a 3.5" platter. (Someone else can figure out the additional energy, but I would guess more than double the juice, assuming uniform density.)

    2) IOs per second. In large arrays the driving factor is not necessarily throughput but IOs per second, which leads to more transactions per second for your server farm. So more spindles = more IOs per second.

    3) Access time. The bigger the drive, the longer it takes the drive's processor to position the head, which increases access times and decreases IOs per second. I know it's a trivial amount of time, but it adds up over millions of IOs.

    4) Error correction. I cannot speak for IDE but each block on a SCSI drive has an Error Correction Code (ECC) which helps the drive recover from read errors. Again minimal.

    5) Cynical answer. Smaller drives means your drive company sells more product to meet a given capacity.
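
    Taking up the invitation in point 1, here is a back-of-the-envelope sketch of the energy question. It treats the platter as a solid uniform disc (no centre hole), uses the nominal form-factor size as the platter diameter, and picks an arbitrary thickness and an aluminium-like density, all of which are assumptions for illustration; thickness and density cancel out of the ratio anyway. It computes stored rotational energy, which is not the same thing as the steady-state power needed to keep the platter spinning.

```python
import math

INCH = 0.0254  # metres per inch


def disc_kinetic_energy(diameter_in, rpm, thickness_m=0.001, density=2700.0):
    """Rotational kinetic energy (joules) of a solid uniform disc: E = 1/2 * I * w^2."""
    radius = diameter_in * INCH / 2
    mass = density * math.pi * radius ** 2 * thickness_m  # uniform density
    inertia = 0.5 * mass * radius ** 2                    # solid disc about its axis
    omega = rpm * 2 * math.pi / 60                        # rad/s
    return 0.5 * inertia * omega ** 2


big = disc_kinetic_energy(3.5, 15000)
small = disc_kinetic_energy(2.5, 15000)
print(round(big / small, 2))  # 3.84
```

    Energy scales with the fourth power of the radius at a fixed rpm, so under these assumptions the guess of "more than double" holds with room to spare.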

    Educational point: SCSI is a protocol like IP or TCP. It can be tunneled through or carried by anything.
    SPI - SCSI Parallel Interface (old school).
    FCP - Fibre Channel Protocol.
    SAS - Serial Attached SCSI. SAS can also tunnel SATA.
    iSCSI - SCSI in TCP (not Ethernet).
    SBP - Serial Bus Protocol. FireWire.
    ATAPI - yep, SCSI over IDE so your CDROM works.
    many others.

  15. Re:What speed are most SCSI drives? by afidel · · Score: 2, Interesting

    Storage Review says about 64MB/s on the outside of the platter for the best performing 15K RPM drive (which is a 74GB drive, not a 140GB one). So, to swamp a U320 bus with sustained transfers you will need at least 6 drives, not the 3 that some people keep spouting around here. So if you need 6 drives to saturate the bus, why have a few high capacity drives when more drives give you lower latency and get you to max sustained transfer?
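
    The drive count above can be checked with a couple of lines. This sketch uses the nominal 320 MB/s bus figure and the 64 MB/s outer-track rate quoted in the comment, and ignores protocol overhead and the slower inner tracks; the function names are invented for the example.

```python
import math


def drives_to_match(bus_mb_s, drive_mb_s):
    """Fewest drives whose combined rate reaches the bus rate."""
    return math.ceil(bus_mb_s / drive_mb_s)


def drives_to_exceed(bus_mb_s, drive_mb_s):
    """Fewest drives whose combined rate is strictly above the bus rate."""
    return math.floor(bus_mb_s / drive_mb_s) + 1


print(drives_to_match(320, 64))   # 5
print(drives_to_exceed(320, 64))  # 6
```

    Five drives exactly match the nominal figure, and a sixth is what pushes past it, which is consistent with the "at least 6" above.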

    --
    There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  16. Capacity vs. Speed by Detritus · · Score: 2, Interesting

    I've read of companies that bought a bunch of SCSI drives and then set them up to only use half their normal capacity, by throwing away half the cylinders. This reduced the average access time of the drives. I'm not sure if they reconfigured the drives in-house or if the manufacturer did it for them.
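
    A toy illustration of why throwing away half the cylinders would help. It assumes requests are spread uniformly at random over the cylinders in use and measures only the average seek distance; actual seek time is not linear in distance and rotational latency is unchanged, so the real-world gain in access time would be smaller than the ratio shown. `mean_seek_distance` is a hypothetical helper and the cylinder counts are arbitrary.

```python
def mean_seek_distance(cylinders):
    """Average distance between two cylinders chosen uniformly at random."""
    total = sum(abs(a - b) for a in range(cylinders) for b in range(cylinders))
    return total / cylinders ** 2


full = mean_seek_distance(1000)  # whole drive in use
half = mean_seek_distance(500)   # only half the cylinders in use

print(round(full / half, 2))  # 2.0
```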

    --
    Mea navis aericumbens anguillis abundat