IDE, SCSI And Recording Everything
Raju writes: "For many years we were told that SCSI is superior to IDE. I always made my systems with SCSI and the others in the household got el-cheapo IDE disks. In the past SCSI beat IDE hands-down but now according to Simson Garfinkel, "today's IDE drives are significantly faster than SCSI drives". In the article at O'Reilly Network he talks about the tests they had run for storage of network data on disks. In the light of this article does anyone see any reason for going with SCSI in a desktop machine? For servers with heavy disk usage patterns it might be different due to command queuing." Disk types aren't what the article's really about, though -- it's a top-level look at network forensics (including advice on building a traffic-analysis system), and makes some interesting points about the unbalanced growth of storage and bandwidth.
I have two sets of IDE controllers on my system. Each disk I have has its own channel and controller. Because I get to use cheap IDE disks, the cost is much lower than SCSI and the performance is right on par with it. It's not the technology -- it's how it's applied and used in real life.
--Kevin
The article's authors needed a way to store large amounts of network log data quickly - they're trying to capture packets in real time. For that kind of straightforward use (large volumes of data, only one user, no simultaneous read/writes) it's easy to see why IDE is more cost-effective and speedy, as the article states. However, when you add multiple users trying to write multiple drives simultaneously, the story changes, and the article simply doesn't address that.
What's your damage, Heather?
I agree that it's the reliability that's the big factor.
Ever try to add 8 IDE devices to a system? With SCSI it's a snap as long as your power supply is large enough.
I think this is very application specific though.
As if the tens of thousands of times this has been hashed out weren't enough already...
The question of IDE vs. SCSI is not (or should not be) about speed. Really. There are nice, fast drives in each camp. If speed is all that matters to you, go with IDE, it'll be a lot cheaper.
So are there any advantages to SCSI? Sure. But not for the majority of people. SCSI's beauties are:
- You can hook a LOT of drives to one controller
- You can hook most any kind of device to the controller
- You can hook devices up both inside and outside of the case
- You can use much longer cables
- When the controller is waiting on one command, it can issue other commands while it's waiting
SCSI was designed for systems where you would either have many, many devices connected to the controller, or where many different processes (or users) would be accessing the hardware simultaneously - and in either of those situations, it *does* perform better than IDE. However, the portion of systems that will actually enter into that area are very, very few. In general, "if you have to ask, you don't need it."
As for straight speed, if you're looking for all-out throughput, don't rely on a single drive, get a RAID array - be it IDE or SCSI. By getting a faster drive, you can increase your throughput by what - 10%? 20%? A two-drive array will nearly double your throughput, and with quality controllers, it's fairly linear up through three to five drives - again, depending on the quality of the controller.
steve
Oh, you're not stuck, you're just unable to let go of the onion rings.
The main SCSI advantage is not that it's faster in I/O than IDE (although it used to be). The really big advantage was (and I think still is) that on a server under heavy memory and processor load, SCSI will outperform IDE because most of the logic is moved off the CPU and onto the SCSI card. So when the CPU is pegged, IDE crawls, but SCSI keeps on chugging.
I think one of the big things is that processor speeds have kept on shooting up, meaning that while IDE has increasingly been considered a serious contender for small to mid-sized servers over the past few years, it's now becoming much more plausible to use it on higher scale systems.
The point of SCSI is that it allows the disk access and such to be offloaded from the CPU to the processor on the SCSI card. This way your programs don't freeze when heavy disk access occurs.
That's probably true. For example, you can buy an 80GB Western Digital 7200RPM drive for $150. That is $1.88/GB. The only 7200RPM SCSI drive made these days is the Seagate Barracuda, which is $300 for 36GB: $8.33/GB.
That really isn't the point of SCSI though. I'll accept that IDE wins on a money-per-GB basis. But, IDE has a performance ceiling that SCSI doesn't have. You can't get 10000RPM and 15000RPM drives for IDE at any price, period.
There is a point, when building RAID systems, where SCSI exceeds IDE in the $-per-I/O-per-second metric. In desktop systems, you probably won't exceed this point. But if you intend to have stripe sets of 4 or more disks, SCSI will win the price wars again.
Anyway it really isn't a matter of SCSI being expensive and IDE being cheap. It's the drives that are expensive/cheap and it simply works out that expensive drives get SCSI connections and cheap drives get IDE connections.
P.S. Have fun trying to get your 4-disk IDE RAID all within 18 inches of your IDE controller :)
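The per-gigabyte figures quoted above are easy to check. A minimal sketch, using only the prices and capacities from the post; the function name `cost_per_gb` is made up for illustration:

```python
def cost_per_gb(price_usd: float, capacity_gb: float) -> float:
    """Return the price of one gigabyte of capacity in US dollars."""
    return price_usd / capacity_gb


# Figures taken from the post: an 80GB IDE drive at $150
# and a 36GB SCSI drive at $300.
ide = cost_per_gb(150, 80)
scsi = cost_per_gb(300, 36)

print(f"IDE:  ${ide:.2f}/GB")
print(f"SCSI: ${scsi:.2f}/GB")
print(f"SCSI costs about {scsi / ide:.1f} times as much per GB")
```

This only measures capacity per dollar. The $-per-I/O-per-second metric mentioned in the post would need measured I/O rates, which the thread does not give.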
There is nothing like the metal on metal sound of a high quality SCSI drive. Also you can't find an IDE drive to make the high-pitched whine like the 10,000rpm Cheetah. The IDE drives make weak plastic sounds, or almost make no sound at all.
My vote is for the low CPU usage of SCSI devices. My hard drives, DVD drive, CDROM, and CDR are all SCSI.
... start copying around movie files while it does this, and you'll become a SCSI fan real quick.
:| It really does make a difference, though lately IDE has been too cheap to ignore.
I can run three instances of grip and rip/encode from all three drives simultaneously. Desktop still runs like a champ, it doesn't bog down. Rip from one IDE drive and it does ok
Sure, I may pay $400 for an 18GB SCSI drive, but it's worth it.
I agree.
However, I want to see not one, but eight IDE drives outperforming eight SCSI drives doing heavy I/O. That's the crux of the question for servers. For desktops, just go IDE and that's it.
I wish it were possible to moderate the initial article submission as being off-topic, because from what I have gathered from actually reading this excellent article is that the individual who submitted this story completely overlooked the primary topic on which the article was written...
The speed comparison of SCSI vs. IDE was most certainly referenced within the context of the story; however, that was by no means the intended takeaway that the author had for his readers - it was but a supporting factoid for his other conclusions and thoughts. The article was a very well-written analysis, history and summation of the practice of Network Forensics. While it did cover a wide range of technologies (including hard disks) that aid in the collecting of such forensic intelligence, by no means was his observation of the increased speed of IDE drives intended to monopolize the reader's attention or be the central focus!!!
Even worse, the majority of posters have (unsurprisingly) focused on everything but the article's intended subject matter. Now ensues the typical flame-war of people supporting their preferred technology instead of having intelligent discourse concerning this exciting and evolving new field of I/T security...
Oh well...if you can't beat them, I suppose you might as well join them! For the record, my vote remains with the tried and true performance and quality of SCSI...
Beer is proof that God loves us and wants us to be happy. -- Benjamin Franklin
according to Simson Garfinkel
Hello SCSI my old friend
It's getting very near the end
SpamNet - a spam blocker that really works
For the past five years I've run my system exclusively with SCSI components. When I first went out and bought a SCSI controller and a disk I paid a fortune for the privilege. At the time, the UW controller cost me $150 and a 9Gb IBM drive ran me another $300. The controller's SCSI BIOS added another 5 seconds to my boot time, and the IBM drive was full-height and loud as hell on account of it spinning at 10,000rpm. Regardless, I was a happy camper. I had consistently fast disk access, low latency and--best of all--I didn't get those annoying entire-system pauses while waiting for disk accesses to complete!
Over the years the benefit of running SCSI decreased. First bus-mastering IDE channels came along and got rid of the annoying pauses. Then they started turning up the clock speeds with UDMA 66, 100, and so forth, until my aging SCSI drives could barely compete with even an average IDE drive.
Naturally, I did what any self-respecting bithead would do: I upgraded my SCSI components. By that time (circa 1999) the price gap between IDE and SCSI had narrowed somewhat (this was before IDE storage prices bottomed out) and I was able to purchase two 18Gb SCSI drives for a mere 25% more than the equivalent IDE drives would cost me. And once again, I was happy with decent performance, low latency and high throughput.
Two weeks ago, I found myself scrabbling to free up a few megs and realized it was that time again, time to upgrade my storage. Looking at Pricewatch, I noticed that IDE drives are now cheaper than Big Macs and come in similarly absurdly-sized portions. Would you like 160Gb of space for your MP3s? No problem--they've got you covered, at $200 a pop! Meanwhile, relatively few vendors have stayed on the SCSI bandwagon, demand for SCSI drives is mostly limited to legacy systems that don't support an IDE bus, and a 160Gb SCSI drive will cost you $900.
In the face of this incredible price ratio, I did what any self-respecting bithead would do: I threw in the damn towel. Now I'm in a transitional period where I run 36Gb of fast UW SCSI storage and 160Gb of even faster IDE storage; I have a SCSI DVD-ROM drive, a SCSI CD burner, and an IDE DVD+RW burner, I/O controllers are fighting each other to the death to secure an interrupt, and the inside of my case looks like the aftermath of a tragic explosion at a cabling factory. I'm damned lucky my system is water-cooled, because I doubt any system fan could pull enough air through that morass of ribbon cables to make a difference in cooling.
The moral of the story: SCSI had its glory days, but it just ain't cost-effective anymore. And with Serial ATA looming on the horizon and promising God's own transfer rates, it just doesn't make any sense to buy SCSI.
That's only true if the program is doing disk I/O asynchronously. If your program is doing I/O inline with its execution, it will be paused just as long regardless of where the disk I/O computation is being done.
personal attacks hurt, especially when deserved
Although many people discuss the superiority of the SCSI protocol vs. the IDE protocol, this is not really the question.
Manufacturers produce the fastest disks on the planet on SCSI interfaces only. There are no 10K/15K RPM IDE disks, period. If one wants the lowest access time available today coupled with respectable transfer rates, one must purchase a 15K RPM drive, which is only available with a SCSI interface.
For single-user access patterns, the author is correct to state that IDE drives have the lead today. StorageReview.com recently reviewed the latest 7200 RPM Seagate SCSI offering, and it was beaten down in single user tests by half a dozen of the newer IDE drives; however, when tested with server access patterns, it was the clear leader (excluding higher-RPM offerings, of course.) Still, 7200 RPM drives can't beat 15K RPM drives in any access pattern.
And I noticed the author was RAIDing drives -- 3ware's RAID products are very high quality, and their performance exceeds each and every other RAID card out there, SCSI or IDE interface. That surely contributed to his conclusion that current IDE drives are faster than their SCSI counterparts.
Join the NFSNET. Our prime goal is making little numbers out of big ones. http://www.nfsnet.org/
If you check out StorageReview's File Server Benchmark database, you'll see that the fastest ATA drive scores well below half what a 15,000 rpm Fujitsu drive does.
"Only in their dreams can men truly be free 'twas always thus, and always thus will be."
--Tom Schulman
FireWire Faq
Sure, USB 2.0 is about the same speed as FireWire, but FireWire hasn't been standing still - its next version calls for speeds of 800Mbps and 1.2Gbps. There are even plans for fiber and wireless based versions.
However, even more important is that FireWire is PEER based. A computer is not required to transfer video from one device to another. There's already a bunch of video equipment that has FireWire support; camcorders as well as the Playstation 2 (Sony calls it i.LINK instead of FireWire or IEEE 1394) come to mind.
While it might be possible to hack USB 2.0 for use without a computer, USB 2.0 wasn't designed for it. I suspect such a hack would be as successful as the "patched on security" we see in Windows.
On the surface, I would agree with you. However, the planned usage of the disk space in question becomes an important point.
I had this conversation with Greg Oster, a friend from University, who wrote the NetBSD RAIDframe implementation. We were considering setting up a large network server. After doing some number crunching, something became very very very clear: unless we were going to be moving to Gigabit Ethernet, 3 IDE disks in a RAID configuration were going to be more than sufficient to fill our 100Mb LAN.
The point is, whether IDE will be "good enough" depends on what you're using it for. For a large fileserver, IDE RAID may well be good enough, depending on your local LAN. For video editing and other purposes where the data is used on the machine where the disks reside, SCSI's command queueing may be the better choice.
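The number crunching described above can be sketched roughly as follows. The 50 MB/sec per-drive figure is an assumption borrowed from another comment in this thread about newer IDE drives; real throughput under a fileserver's mixed access pattern will be lower, so treat the result as a best case. The function names are made up for illustration.

```python
import math


def link_mb_per_s(megabits_per_s: float) -> float:
    """Convert a network link speed from megabits/s to megabytes/s."""
    return megabits_per_s / 8


def drives_to_fill_link(link_megabits_per_s: float, drive_mb_per_s: float) -> int:
    """Smallest number of striped drives whose combined sequential
    throughput reaches the link speed (assumes ideal linear scaling)."""
    return math.ceil(link_mb_per_s(link_megabits_per_s) / drive_mb_per_s)


DRIVE_MB_PER_S = 50  # assumed sustained rate per IDE drive, not a measurement

for name, mbit in [("100Mb Ethernet", 100), ("Gigabit Ethernet", 1000)]:
    print(f"{name}: {link_mb_per_s(mbit):.1f} MB/s, "
          f"{drives_to_fill_link(mbit, DRIVE_MB_PER_S)} drive(s) to fill it")
```

Under these assumptions a 100Mb LAN tops out at 12.5 MB/s, so the network rather than the disk interface is the limit; only at Gigabit speeds does the array size start to matter.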
Beta is technically superior to VHS.
Novell Netware is technically superior to Windows NT.
SCSI is technically superior to IDE.
Does any of this matter to most of the market? Not really, since most people look primarily at up-front cost. I've been telling my customers (mainly small businesses) that mirrored IDE drives are the best value for general purpose data storage. The gap has narrowed; IDE definitely makes more sense for most people (and even most servers) these days.
If I were speccing out a system for high-end video editing, or a system that absolutely had to process thousands of transactions a second, or a general purpose file or e-mail server that supported thousands of users, or a GIANT SAN, I'd go with SCSI. SCSI shines in really big storage pools, or in places where you absolutely need the fastest possible speed. But for most things, IDE undercuts SCSI by a longshot.
That said, there is one major problem with IDE, and it's not bandwidth (as most "higher-end" IDE-RAID controllers (such as some of the new ones by Adaptec) have multiple channels for multiple drives) - it's lack of VERY standard chipsets & APIs needed to access IDE block devices. The original spec has been hacked onto so many times that you're really at the mercy of the manufacturers' drivers for any "sophisticated" IDE implementations. This has gotten me into trouble several times. SCSI drivers tend to be more plentiful than high-end IDE drivers, and the testing cycles seem to be better because OS vendors actually care about them.
But again, people who buy SCSI just on the technical merits of it may as well throw their money away. I wish the situation were different, but I don't think it will change unless drive vendors DRASTICALLY lower SCSI drive prices. Right now they're getting away with charging lots of extra dough simply because managers are hearing "SCSI is way better!" from their employees when purchasing hardware. That may have been true a few years ago, but it'll take a few years for the general consensus to swing in the other direction. (I really, really like SCSI too, and I think IDE sucks as a technology... but money talks) :(
I have a dual P3-850 (was a P2-450). Under heavy CPU load it remains surprisingly responsive. However, if it's under heavy disk load, it crawls, even though Ultra-ATA isn't very heavy on CPU utilisation.
My previous machine was a single PPro-200 with SCSI disks. Under heavy CPU load, it crawled horribly. However, under heavy disk load, it remained much more responsive than my current system.
Therefore I conclude that SCSI really does perform better, even if the drives themselves are matched on throughput and access times. I think most benchmarks suffer a little from tunnel-vision and focus only on the raw disk performance without really taking into consideration what it all means in real world situations.
I put up with the worse overall performance of IDE because it's so much cheaper. Of course, I'm up to my limit (4 devices) and need a new controller if I want to add any more. And I have to remember to be careful about tying up the IDE bus attached to my CD-RW when I'm burning discs. I can't see the last point being a problem with SCSI.
FreeBSD 4.3 flirted with turning off IDE write caching. This reduced write bandwidth to IDE disks but was considered necessary due to serious data consistency issues introduced by hard drive vendors. Basically the problem is that IDE drives lie about when a write completes. With IDE write caching turned on, IDE hard drives will not only write data to disk out of order, they will sometimes delay some of the blocks indefinitely when under heavy disk loads. A crash or power failure can result in serious filesystem corruption. So our default was changed to be safe. Unfortunately, the result was such a huge loss in performance that we caved in and changed the default back to on after the release.
[...]
There is a new experimental feature for IDE hard drives called hw.ata.tags (you also set this in the bootloader) which allows write caching to be safely turned on. This brings SCSI tagging features to IDE drives. As of this writing only IBM DPTA and DTLA drives support the feature. Warning! These drives apparently have quality control problems and I do not recommend purchasing them at this time.
So, SCSI is better both for performance and for data integrity.
IDE drives are fine in a desktop machine. It isn't likely to be heavily stressed and any reads and writes are likely to be from a single application at a time and a single user at a time with a CPU that is typically 99% idle. Such a user doesn't need the benefits of SCSI and the additional costs that the marketing people add.
If, however, you have 100 people all accessing different pieces of the disk, some reading, some writing, then IDE will just not cut the mustard. It requires too much CPU involvement. With SCSI the CPU just says "here, you handle this" to the SCSI interface and gets on with something else instead. In addition, with SCSI I can have 15 devices on a single bus; with IDE, I can have 2.
So basically:
SCSI = scalability & heavy loads.
IDE = low cost & single user access.
Use the one appropriate to your application. For most people that'll be IDE, for other people chucking a lot of data around and lots of processes doing different things, SCSI would be better.
Just a quick rant about laptops. People think that a 1GHz laptop is as fast as a 1GHz desktop. It isn't. The laptop disks are designed with power management in mind and are often significantly slower than even normal IDE. So if your management think that everyone should have laptops, tell them not to complain when their Oracle client runs like shit.
Government of the people, by corporate executives, for corporate profits.
I haven't seen any IDE controllers that sport a 64-bit/66 MHz PCI bus interface. SCSI already has PCI-X dual-channel U160/U320 controllers. Check out LSI Logic
IDE RAID is fine, it's cheap, but with newer IDE drives pushing 50 MB/sec (sustained) you could max out a standard PCI bus with three drives. Need more throughput? Then you're stuck waiting for PCI-X IDE RAID controllers, or at least 64-bit/66 MHz versions. And in the meantime, SCSI will just get faster.
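The bus arithmetic behind that claim can be sketched as below. The peak figures are theoretical maxima (bus width times clock), so real-world throughput is lower and the bus fills even sooner than this suggests. The 50 MB/sec drive rate comes from the comment above; the function name is made up for illustration.

```python
def pci_peak_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak PCI bandwidth: one transfer of width_bits per clock."""
    return width_bits / 8 * clock_mhz


standard_pci = pci_peak_mb_per_s(32, 33.33)  # 32-bit/33 MHz slot
wide_pci = pci_peak_mb_per_s(64, 66.66)      # 64-bit/66 MHz slot

DRIVE_MB_PER_S = 50  # sustained rate per drive, as quoted in the comment

for drives in range(1, 5):
    demand = drives * DRIVE_MB_PER_S
    verdict = "exceeds" if demand > standard_pci else "fits within"
    print(f"{drives} drive(s): {demand} MB/s {verdict} "
          f"standard PCI ({standard_pci:.0f} MB/s peak)")

print(f"64-bit/66 MHz PCI peak: {wide_pci:.0f} MB/s")
```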
I've been a SCSI bigot since my Amiga days. Just 15 short years ago, all that was really available for consumer-level computers was SCSI, ESDI, and ST-506.
ST-506 was hardly an interface at all. You had to tell the BIOS the number of cylinders, heads, and sectors the drive had (sound familiar?), so that it could do the multiplication and convert logical block addresses into positioning information for the drive. You also had to enter the bad block list by hand, printed on a sticker affixed to the drive. An ST-506 interface was available for the Amiga-2000, and setting it up was predictably a bear.
SCSI saw its first consumer deployment on the Mac, and Amiga got it not too long after. No more CHS crap. No more typing in lists of bad blocks. All that intelligence was on the drive itself. Just plug the drive into the chain, tell the OS what SCSI address it had, and you were ready to start partitioning and using the drive.
So when it came time for PCs to get intelligent drives, SCSI was the obvious choice. But no, they invented this new thing called IDE. What was different about it? As far as anyone could tell, the cable. You still had to feed CHS addresses at it; SCSI used LBA from the start. IDE drives from different manufacturers wouldn't work together; SCSI mandated interoperability. IDE now let you have two drives in your machine; SCSI already allowed up to seven.
IDE was touted as much cheaper, but it wasn't. SCSI and IDE drive prices were at near parity for years. Manufacturers were offering drives in both IDE and SCSI flavors (all other characteristics identical), with the SCSI flavor costing only ten dollars more (for a $600.00 drive, a typical price in those days, this was epsilon). It's only in the last few years or so that SCSI drive prices have skyrocketed for no readily discernible reason.
Add to that the fact that, even on a modern SCSI controller, all your old drives will still work. I have an old 600M 5-1/4-inch full-height Hewlett/Packard drive with a SCSI-I (asynchronous) interface. I plug it into the Adaptec AHA2940-U2W controller in my main rig, and Linux sees and mounts it just fine. Same with all my other old SCSI drives; I don't have to leave any of my data behind. It Just Works.
I also have an HP Omnibook 800CT laptop, which has SCSI built-in. All my drives work on that, too.
Apart from the artificially inflated costs, SCSI's only real headache is bus termination. But aside from that, the increased speed, flexibility, expandability, and reliability, for me, make SCSI an obvious choice.
Schwab
Editor, A1-AAA AmeriCaptions
IMHO, the SCSI bus system is better than everything IDE/ATA can offer to date. It's not necessarily the devices that need to be put up against each other. Most recent SCSI disks in "acceptable" sizes are so expensive that you can easily build a RAID system from IDE disks for the same or even lower price. However what's really bad about IDE is the short bus. Face it, length and size do matter in some cases.
You can have a 12m LVD-SCSI bus with 15 devices plus controller running at full speed. But that's not desktop. You'll have trouble just cramming the disks in your average-sized tower, and you still need one or two additional PSUs to get them spinning. And now you take the sucker out for a LAN; but don't forget calling your chiropractor and get a reservation for the next two weeks straight.
Then there's IDE. With today's U-ATA133 specs you're limited to, like, 50cm bus length. Heck, that's about the height of a midi-tower! But it gets the job done. But no external devices for you, sorry. And you're down to 4 devices on your average motherboard, but most users can live with CD-ROM, CD-RW and one or two disks. With onboard RAID controllers coming up, there's an additional four disks possible and you can even plug in a separate DVD drive. You don't need a nuclear plant to get it running, you have lots of storage for a desktop machine and you can still carry it around. Perfect.
To sum it up, I think SCSI is still great, but it's losing on the desktop nowadays. The disks might last longer, it might be more flexible, but in the end, it's way too expensive and overkill. And then there's serial ATA on the horizon.
Fight hunger. Filet a politician and send him to a 3rd world country of your choice.
Au contraire.
Apple didn't stop using SCSI as standard equipment because of its speed. They used it in their Macs for YEARS because of better speeds than any drives of the time. Apple chose IDE later (when Jobs returned) for reasons of cost, just as PC makers do. Removing SCSI as standard brought down Mac prices by a few hundred dollars.
For general daily use, and because of recent advances in IDE, there was no advantage to using SCSI as standard any longer for Apple.
However, SCSI, particularly the LVD (SCSI-3) will SMOKE any hard drive interface today, which is why Apple still equips various SCSI configs on build-to-order workstations and their Server models.
FireWire (1394) is theoretically as fast as SCSI-3, but few people can afford a true FireWire drive with genuine FW controllers (earlier FW drives were some IDE or SCSI to FW translator or used slow drives on a FW interface).
Apple is overdue to upgrade their logic boards (motherboards) to the faster buses found in the best PC boards now, so there should be improvements in their performance for that platform in the coming months.
Vos teneo officium eram periculosus ut vos recipero is.
Ummm, 600,000 hours is about 68 YEARS. I think I can manage crashes 68 years apart. Even if you worst worst worst case it to 100,000 hours you are at over 11 years.
All this means is that for any given drive in any given year, you have a 1.4705% chance of your drive failing, on average.
If you have 68 drives in your system, then, it is likely that one will fail per year.
That's stats for you.
So higher MTBF can actually work out to better reliability.
Of course, this is a simplistic analysis, which doesn't take into account the actual distribution of mortality for those drives (which, for any hardware, tends to have the stillborn/geriatric ends of the spectrum with the most failures)
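The MTBF arithmetic in this post can be sketched as follows. It assumes a constant failure rate (an exponential lifetime model), which is exactly the simplification the post warns about: it ignores the early-death and wear-out ends of the curve. The function names are made up for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def mtbf_years(mtbf_hours: float) -> float:
    """Express an MTBF rating in years of continuous operation."""
    return mtbf_hours / HOURS_PER_YEAR


def annual_failure_probability(mtbf_hours: float) -> float:
    """Chance that a single drive fails within one year of continuous
    use, assuming a constant failure rate."""
    return 1 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def expected_failures_per_year(mtbf_hours: float, drive_count: int) -> float:
    """Average number of failures per year across a pool of drives."""
    return drive_count * HOURS_PER_YEAR / mtbf_hours


for hours in (600_000, 100_000):
    print(f"MTBF {hours} h = {mtbf_years(hours):.1f} years, "
          f"annual failure chance {annual_failure_probability(hours):.2%}")

print(f"68 drives at 600,000 h: "
      f"{expected_failures_per_year(600_000, 68):.2f} failures/year expected")
```

The per-drive figure lands close to the roughly 1.47% quoted above, and a pool of 68 such drives averages about one failure a year, which is the post's point: MTBF describes a population of drives, not the lifespan of any single one.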
Simon
Coming soon - pyrogyra
SCSI as a protocol is far superior (in terms of performance design, connectability, intelligence, and fault tolerance/scalability) to IDE (which essentially acts as a glorified signal converter). Regardless of what any benchmarks attempt to "prove," SCSI does not present an overhead which inherently degrades single-user performance. Given the same drive mechanics and comparable channel rates (i.e. 80MB/sec - 160MB/sec LVD, or FC), SCSI disk performance will meet or exceed IDE disk performance, for any given single user application.
When you begin to involve more complex and real-world use of disk drives, the difference becomes tremendous. Think faster disks (15k rpm Seagates), switched fiber interconnects (running >200MB/sec), spindle synchronization, and intelligent command queueing. The added cost (which is usually insignificant compared to the cost of downtime, delayed I/Os, and maintenance) becomes a non-issue.
99% of the high performance computing industry chooses SCSI over IDE as their block device interface, time and time again, and there is a reason. To do so otherwise demonstrates a fundamental misunderstanding of storage interface technology.
A government is a body of people notably ungoverned - AC
Look at "TCQ" -- Tagged Command Queueing -- that has been worked on by Andre Hedrick in the past, and is currently going into the Linux 2.5.x kernel series due to the work of Jens Axboe.
TCQ is where SCSI gets a lot of its speed, by allowing multiple device commands to be outstanding on the bus at any given time. TCQ really levels the playing field for IDE and SCSI... assuming your IDE driver supports it (most do not).
I'm confused.. you say you have 2 drives, striped, and then talk about copying big files between them? IF they are striped they are one volume, and you can't copy things between them.
That was a mistype on my part, and what I meant to type was that while I have two 60GB drives off of a RAID controller, I haven't taken the plunge and striped them yet. As such they're both hanging as the single drive on their own bus, on two separate buses obviously.
I think the only reason IDE is more cpu intensive
It should be pointed out that while this is constantly restated, repetition doesn't count as evidence. It was ironic that just prior to seeing this debate, I saw this page which shows significantly higher CPU utilization for the two SCSI drives (mind you, they're extremely high performance drives, however they're not of a scale that would justify the difference between them and the IDEs). Each new time I replace my workstation I go through the whole IDE versus SCSI debate because I want to go with what's best (SCSI just has an air of superiority around it, much like Honda enthusiasts feel about their 115 lb-ft of torque VTEC engines : Enthusiasm, again, doesn't indicate that it's rational or based on any truths), but it seems that, firstly, it's extremely hard to find cold hard facts on the matter (i.e. basic metrics. Most of the evidence is anecdotal or based on uneven systems), but secondly that a lot of SCSI enthusiasts are very emotional about it. I have zero faith in anyone's personal opinion about the "feel" of one over the other: I remember back in the BBS days when a program made the rounds that promised to "convert your 386 to a 486!" and people would argue with me and ASSURE me that, yup, it made their system that much faster and smoother. A little persuasion and predisposition goes a long way when it comes to subjective measures, which is why I usually discount them.
Firewire is 400Mbps, which is 50 MBps. That's faster than Ultra2 SCSI, but slower than Wide Ultra2, Ultra3 and Ultra160/320 SCSI. Check out this link for details. Firewire is still nice tech, and a fair bit smarter than USB2.0, but it's not the bandwidth king that SCSI is.
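For a quick side-by-side, here is a small sketch of the comparison. The SCSI figures are the nominal peak bus rates for each variant, not measured throughput, and Ultra3 is grouped with Ultra160 since both names are commonly used for the same 160 MB/s rate. The names in the code are made up for illustration.

```python
def mbit_to_mbyte(megabits_per_s: float) -> float:
    """Convert megabits per second to megabytes per second."""
    return megabits_per_s / 8


FIREWIRE_MB_PER_S = mbit_to_mbyte(400)  # FireWire 400

# Nominal peak bus rates in MB/s.
SCSI_MB_PER_S = {
    "Ultra2 SCSI": 40,
    "Wide Ultra2 SCSI": 80,
    "Ultra3/Ultra160 SCSI": 160,
    "Ultra320 SCSI": 320,
}

slower = [name for name, rate in SCSI_MB_PER_S.items() if rate < FIREWIRE_MB_PER_S]
faster = [name for name, rate in SCSI_MB_PER_S.items() if rate > FIREWIRE_MB_PER_S]

print(f"FireWire 400: {FIREWIRE_MB_PER_S:.0f} MB/s")
print("Slower than FireWire:", ", ".join(slower))
print("Faster than FireWire:", ", ".join(faster))
```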
Ita erat quando hic adveni.