SCSI vs. IDE In The Real World
An anonymous reader writes "Gerard Beekmans has a really good comparison of the speeds of IDE and SCSI drives up over on devchannel.org. Should help put an end to the myth of IDE erasing SCSI's speed advantage." Note that Beekmans' test handicaps the SCSI disk a bit, with interesting results. (DevChannel, like Slashdot, is part of OSDN.)
a really good comparison of the speeds of IDE and SCSI drives
Oh please. With all due respect to the submitter and Mr. Beekmans, this "comparison" ignores all sorts of other factors: write caching, command overlap, rotational speeds, et al., ad nauseam. Yes, some of these are mentioned, but a comparison such as this should have hard numbers in a table, not opinions. Not that I'm surprised or upset that SCSI trounces IDE, but his comparison is virtually meaningless.
There are many benchmarking suites out there; I'd suggest these be used for the next test to provide some meaningful results.
Trolling is a art,
and SCSI for servers. It is that simple, it will stay that way because of cost, not because of speed.
-Seriv
This article shows that SCSI drives have a considerable advantage over IDE drives of the same spindle speed. Laptops tend to have slower drives. Has anyone considered using SCSI drives in laptops?
Does anyone know of laptops that use SCSI drives?
-Mary
In the real world, you must also take into consideration different file size ranges, tree structures, and file systems. Comparing two hd technologies while keeping these factors constant isn't very "real world" to me.
Robert Bindler
A Computer Science student's views on technology.
Negligible? 7 minutes to 28 seconds is negligible? What was the Columbia reentry? Almost great?
And just how exactly does it "all even out" in a RAID setup? IDE RAID and SCSI RAID are still two very different animals...
I can't believe this kind of bullshit gets posted on Slashdot. For those who didn't read the article (and I know you're out there), the guy compared how long it takes to open his maildir file in Mutt on SCSI and then IDE.
Since it went faster on his SCSI drive, he concludes that SCSI is faster. Wow! How comprehensive!
If Slashdot keeps this up, I hope they start to get a reputation like Tomshardware.com (those people are full of shit as well).
Tape drives are like this, too. They look the same, they act about the same during the write process, but the cheapie drives that come with some servers will fail to reread the tapes if they're reused as constantly as they are in most businesses (who, on average, reuse the same weekly tapes for a full year or more!). Better to put the money into a DLTtape solution than to rely on what's bundled with the server.
Try not. Do or do not, there is no try.
-- Dr. Spock, stardate 2822-3.
The document lists one and only one case. I don't doubt that SCSI has performance benefits; that is pretty well known. I've always wondered why they don't upgrade IDE with a better command set, much like SCSI's. Well, they haven't; they just increase the clock speed and offer better buffering. So there is a valid case for a comparison between SCSI and IDE. This review does one and only one test, which proves that SCSI wins on one test. This is not a good article. He reads one and only one file. The real question is how well IDE and SCSI operate under real multi-threaded OS conditions. Slashdot editors should be rejecting this article in favour of one with a real in-depth analysis. SCSI will win, but not for the reasons listed in this article.
Negligible? Umm, when you can unpack a kernel in a third of the time and see a six-and-a-half-minute difference in large reads, these performance gains are not negligible. If this were a hairline race that was a matter of a few seconds I could understand, but anyone who does work that is disk intensive will benefit from SCSI.
Later,
Phil
...is the original creator of Linux From Scratch, and therefore registers very high on all standard 7331-meters
Tubal-Cain smokes the white owl.
And in the REAL real world, the author of this piece discovered that, for his application, the SCSI drive was at least 300% faster.
Why isn't his test, done with real world data, a 'real world' test?
It's insane... SCSI is worth its money... I just don't have the money... ;-)
Ahhh...the great dumpster continuum. Many a free computer will be found there. -- sowth (748135)
That "benchmark" was ridiculous. "I have this two-year-old IDE hard drive and I'm going to benchmark it against this SCSI drive. Woop, look! It read my mail directory faster! SCSI must be better!"
Look, I'm not denying that SCSI is faster. But he neglected to even do any other tests! He also neglected to use a newer IDE drive, which hampered the IDE performance dramatically. (Who's going to use a 2MB cache IDE drive in any area where hard drive performance is critical?)
Personally, I'd like to see the test of an IDE RAID array running off a 3Ware card. For the price of one SCSI drive, you can get 3 8MB cache IDE drives, plus the 3Ware card. Oh, sure, it will probably still be a bit slower than SCSI. But at least the benchmarks will show some sort of logical comparison (and the benefit of IDE -- namely, tons of disk space.)
Is it just me, or have the articles posted on Slashdot recently been pretty lame? I just don't understand how some of this stuff gets posted to the front page. This is not a review. This is not a benchmark. It's one guy who tested one application of hard drives and made a conclusion based on that test. This type of stuff can be found in any newsgroup or forum on a daily basis. It should not have been posted to the front page of Slashdot.
Simpli - Your source for San Jose dedicated servers and colocation!
The reason for this being that SCSI handles far more of the overhead of managing the disk on the controller than IDE, which left much of the work to the CPU. Of course, this technological gap has narrowed considerably with the evolution of IDE into EIDE and now ATA drives.
I have to confess, I'm a die hard SCSI fan when I can justify it (although I might be swayed by second generation SATA). While the real world performance gap of SCSI-vs-IDE is long gone, SCSI drives are still synonymous with servers, which usually translates into a more robust product. How much is *your* data worth compared to the SCSI price premium?
UNIX? They're not even circumcised! Savages!
He tested a 40gb IDE drive versus a 9gb SCSI drive, both 7200 RPM. The SCSI drive was a lot faster, but this isn't any particular shock; this is pretty old hardware.
:-)
Basically he just told us that circa 2001, SCSI was faster. I think we mostly knew that already.
It would be a lot more interesting to see the test run with one of the 36gb WD Raptors. They are 10K RPM and are *very* fast drives. I use a pair of them striped as RAID 0 in my main desktop; they're faster than anything I've ever used before, including 10KRPM SCSI. (I haven't used 15KRPM SCSI, which I imagine is probably faster still, but very noisy, which is why I went with the Raptors. )
Note also that IDE drives in general are "tuned for desktop usage patterns". I'm not entirely sure what that entails, but I suspect it involves a lot of read-ahead caching; single-user systems tend to be actively reading only one or two things at a time. SCSI is tuned for server performance, and the test of "read lots of small files" is probably much closer to a "server" load than to a "desktop" load.
What I'd like to see is testing of streaming performance in working with really big files. That's something I do fairly frequently. How fast can you extract, say, a 500MB RAR file back to the same disk? How fast is it if you're reading from one and writing to a second? On a personal basis, I do that a lot more than putting 50,000 files in a directory and then reading every single one of them.
However, if I ever DO plan on putting 50,000 files in a directory and then reading all of them on a frequent basis, I'll be sure to choose SCSI.
2.2 GHz processor with IDE drive outperformed by 750 MHz processor w/ 3 year old SCSI drive of similar specifications (same size, spindle speed, smaller buffer) by a 7 times margin.
Note also that the IDE drive was used exclusively for this test at the time of the test, and the SCSI drive was in a server which was active doing other things as well.
I would think that the 50,000 message folder would be of a wide variety of file sizes. Though it would be really easy to create such a folder of all one file size, simply by running a script that creates that many simple message files with the word "hello" in the subject and the body. As a developer for "Linux from scratch" I would suspect however that this is his message archive, which is likely to contain anything from a "this package sux" message on up to messages carrying a significant portion of the source code to Linux.
As to your comparison of a .22 and an RPG, I would think that a more appropriate comparison would be a .44 automag to a .22 revolver. The .22 is less likely to bother the neighbors. The .44 automag is more likely to stop the rampaging bull. Which is more appropriate for use will depend upon the use.
For my own needs an IDE drive works well. Then again I don't build and install a new Linux kernel every couple of days either.
Obviously your mileage may vary.
-Rusty
You never know...
Yes, it does matter. The applications that he tested did not include CPU usage, which is assumed to be negligible. Try doing an IDE RAID setup while your CPU usage is already high (e.g., a CPU and disk intensive game or running many applications in virtual memory). The simple fact that the SCSI controller handles the I/O operations will cause your performance to increase even more than the article suggests. Now, do you need to spend 500+ dollars to store mp3s and your pron? Probably not.
The world is a comedy to those who think and a tragedy to those who feel.
What is this? Holy war week on Slashdot? In the last week or so we've had stories on BSD vs Linux, Linux vs Solaris, PHP vs Java, Exchange vs Sendmail, x86 vs PPC and now IDE vs SCSI. All that's missing is Vi vs Emacs and I think we'll have pretty much every major computing disagreement covered.
From the is-this-worth-wasting-the-time-on dept:
My Pentium 4 is faster than my Apple ][. I did benchmarks!!!!!!!
One would expect the SCSI drives to consistently wallop similarly configured IDE drives (same buffer, spindle, size, #heads and every other physical characteristic you can think of) based solely on one observation: Tagged Command Queuing.
TCQ allows a drive to execute commands out of order to optimize the access pattern. This can have a HUGE impact on performance. Relatively few drives support TCQ on ATA, and very few chipsets support it as well. This is mostly because people who buy ATA aren't *real* performance freaks. They want high streaming performance (like hdparm -tT), but don't know to care about random access performance as it may not be relevant to them.
Server/database access patterns are far more random than typical desktop usage, and this is where SCSI wipes the floor with ATA.
Some have pointed out that RAID enclosures are moving towards IDE drives. This is due to the fact that the integrators are using optimizing logic in the controller to handle emulating TCQ. So you can have a stone-dumb drive in there and it doesn't matter as long as the physicals are there.
SCSI drives also typically come with caching algorithms which are intended to try to increase cache hits by using more intelligent cache allocation and predictive reading.
Combine that with better, more intelligent controllers, command detachment, and infinitely better bus sharing - and SCSI cannot be compared to ATA in high demand situations.
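To put a rough number on what out-of-order execution buys you, here is a toy sketch in Python. It is not how any real drive firmware schedules commands; it just compares total head travel when a queue of random block requests is serviced in arrival order versus a simple two-sweep "elevator" ordering. The block range, queue depth, starting position and function names are all made up for illustration.

```python
import random

def head_travel(start, requests):
    """Total distance the head moves when servicing requests in the given order."""
    position, total = start, 0
    for block in requests:
        total += abs(block - position)
        position = block
    return total

def reorder_elevator(start, requests):
    """One sweep upward from the start position, then one sweep back down."""
    above = sorted(b for b in requests if b >= start)
    below = sorted((b for b in requests if b < start), reverse=True)
    return above + below

# Hypothetical workload: 32 queued random reads on a 1,000,000-block disk.
rng = random.Random(42)
queue = [rng.randrange(0, 1_000_000) for _ in range(32)]
start = 500_000

in_order = head_travel(start, queue)
reordered = head_travel(start, reorder_elevator(start, queue))
print(f"arrival order: {in_order} blocks of head travel")
print(f"reordered:     {reordered} blocks of head travel")
```

A drive without queuing only ever sees one request at a time, so it is stuck with the first number. Real drives also account for rotational position, which this sketch ignores.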
I have done similarly-informal tests myself, with comparable results.
I can't say I understand why SCSI performs so much better than IDE, however. In this particular test, he compared what amount to evenly-matched drives, specs-wise, and even gave the IDE drive the better machine. Yet, the SCSI drive completely crushed the IDE drive, no question about it. And as I mentioned, my own informal tests have shown the same results.
What explains the difference? Same spindle speeds, similar read rates (both buffered and unbuffered), similar seek times... What other factors exist that make so much of a difference? Just higher quality controller hardware? And if so, would an IDE drive on a high-end controller perform comparably?
Personally, I'll still take 4x the size for the same price, since cost and size (with "okay" performance) matter more to me than raw speed. But I wish I knew why one performs so much better than the other.
(let me help get you started)
vi? Emacs? What are these things of which you speak?
Ed is the standard text editor!
The second - CPU utilization. A SCSI controller does a lot more work by itself than an IDE one - therefore it requires much less interaction with the CPU.
Then we have the bus bandwidth - this is probably no longer an issue, as ATA/66/100/133 can pipe enough bytes per second
Finally, the most important one - manufacturers simply don't make a 10k RPM IDE hard drive ... And 10k vs. 7200 makes a hell of a difference.
The Raven
The number of tracks a drive has is inconsequential to the access time. The time it takes for the read/write head to move across the platter (since the platters in IDE and SCSI drives are nearly the same size) is a product of how heavy the heads are and the strength of the motor assembly (which is actually a pair of coils moving between two very strong magnets). I'm inclined to believe that the IDE drive's slower seek performance is actually because of a cheaper head/motor mechanism. After all, the IDE drive _is_ cheaper, so this is one place where costs are cut.
As far as buffering is concerned, random access reads, such as when reading in many small files, do NOT benefit in speed from larger on-drive caches. That cache is used mostly for multi-block read-ahead and write-back on sequential accesses, and is usually segmented into "areas", where sections of cache memory are linked to sections (cylinder groups) of the platters. The more cache you have on the drive, the less "thrashing" the drive has to do when many dispersed files are written to, since it can group many operations into their appropriate "areas" of cache. This doesn't help as much for random reads (like maildir!) as for sequential reads.
So why does scsi really have an advantage?
1) tagged queuing. The host controller can queue up to 255 (depending on the controller and drives) commands to each drive, and can request the drive to "update" the controller on its status completely asynchronously, and in groups of responses. With one single "message" from the controller, the scsi drive gets up to 255 times as many low-level commands (block reads/writes/seeks/cache ops) as one command between an ide controller and its drive. For the test in that article, the host OS simply had to read the directory, and send off a bunch of commands to the SCSI controller to read a list of blocks, and notify the host when it was done moving everything into memory. Meanwhile, the IDE drive was busy doing every operation one at a time.
2) multi-processing-friendly communications protocol. Because of the way scsi offloads most of the processing from the host, the host can spend less time waiting for an interrupt to complete (remember, an IDE device can only complete one low-level command at a time, per interrupt), and group commands from many processes together into one stream of scsi commands, or multiplex commands to multiple devices on the scsi bus in one operation. IDE is incapable of sending commands to two devices on the bus simultaneously. i.e. It must wait for an operation to the master drive to complete before sending something to the slave on the same channel, or to a drive on a different channel. In the test, the SCSI drive performed so well even though it was on a 'server' sharing resources with other processes because the OS was able to group commands together from all the different processes, while the IDE system had to "pause" programs that were waiting "in line" for ide operations from another process to finish. These processes can also include kernel disk read-ahead threads, write-back buffering, other programs on the system, etc.
IDE will eventually have tagged queuing (some controllers/drives already support it), but there's still the problem of the IDE controller holding the host in an interrupted state (interrupt contention) while the operations complete. It's how the IDE spec works. DMA helps a lot here, but only makes the data transfer faster. The time the various low-level commands take and how they tie up the host cpu is still the bottleneck. To be fair, SATA should have all this stuff covered, and will likely bridge the gap between IDE and SCSI performance, but I don't believe it'll approach the overall reliability and robustness of SCSI.
Disclaimer: I've described the advantages/disadvantages above as best as I could recall. corrections welcome.
of course it's not a real review/benchmark.. look at the reasons for the test.. he plainly states:
"before my wife would allow me to"...
the whole point of the testing was to convince his wife to let him buy one.. and she most likely was asleep at "integrated IDE controllers".. upon waking up, all he had to say was "From my testing I concluded that SCSI being faster than IDE is not a myth. It is very much a reality." and obviously got the go ahead
remember, in the immortal words of homer simpson "facts, schmacts... facts can be used to prove anything that is even remotely true"
No. It's NEVER been that way in ATA. Not even the earliest IDE drives. With MFM drives before IDE, and on the Apple ][ and C-64 this sort of work was done, but IDE has never been anything like this.
With ATA (a.k.a. IDE), you write 5 bytes to registers to indicate the starting sector number and the number of sectors you want. Then, you write to the command register to transfer control to the drive and it begins working on your command. All modern systems will (usually) issue the "read multiple" command, which instructs the drive to read many sectors into its buffer and give an interrupt when they are all available in the buffer. This isn't something new. The read multiple command has been in the ATA specs for a long time, and PCs have made use of it since at least the days of Windows95 and Linux kernel 1.0. When the drive has all the sectors in its buffer, it asserts the interrupt pin. The read multiple command comes in PIO and DMA flavors, and if you wrote the DMA version to the command register, a DMA operation happens to transfer all those sectors to wherever you set up the DMA controller to store them.
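For anyone who wants to see what those five register bytes look like, here is a minimal sketch in Python. It only computes the values; it does not touch any hardware. It assumes 28-bit LBA addressing, and the function name and dictionary keys are made up for this example. The opcodes 0xC4 (READ MULTIPLE) and 0xC8 (READ DMA) are the standard ATA ones as far as I recall; check the spec before relying on them.

```python
READ_MULTIPLE = 0xC4  # PIO flavor
READ_DMA = 0xC8       # DMA flavor

def ata_lba28_taskfile(lba, sector_count, command=READ_MULTIPLE, slave=False):
    """Return the register values a host would write for a 28-bit LBA read."""
    if not 0 <= lba < (1 << 28):
        raise ValueError("LBA does not fit in 28 bits")
    if not 1 <= sector_count <= 256:
        raise ValueError("sector count must be 1..256")
    return {
        "sector_count": sector_count & 0xFF,   # 256 sectors is encoded as 0
        "lba_low": lba & 0xFF,                 # LBA bits 0-7
        "lba_mid": (lba >> 8) & 0xFF,          # LBA bits 8-15
        "lba_high": (lba >> 16) & 0xFF,        # LBA bits 16-23
        # Device register: LBA mode bit set, device select, LBA bits 24-27.
        "device": 0xE0 | (0x10 if slave else 0) | ((lba >> 24) & 0x0F),
        "command": command,                    # written last; starts the drive
    }

taskfile = ata_lba28_taskfile(lba=0x1234567, sector_count=8)
for name, value in taskfile.items():
    print(f"{name:13s} 0x{value:02X}")
```

The point is how little the host has to say: five address/count bytes and one command byte, then it waits for the interrupt.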
SCSI gets most of its advantage from tagged command queuing and disconnection. These features have appeared in the very latest IDE specs, and so far very few ATA drives support them.
PJRC: Electronic Projects, 8051 Microcontroller Tools
As this "article" painfully demonstrates, we need the ability to moderate things on the front page.
If the editors cannot distinguish what is trash or what isn't, let the community decide.
Thank you.
A witty saying proves you are wittier than the next guy.
The SCSI cd-rom was a quad, and the IDE a 6X. It took 30 seconds to load a level on the 6X IDE system and 4 seconds to load on the quad SCSI system...
You see, here's the problem with armchair benchmarkers, such as the site linked by Slashdot, and your, er, benchmark - You do realize, of course, that a 6x CDROM has a throughput of a blistering 1MB/second, right? That even on traditional IDE the controller subsystem sat around waiting for data about 97% of the time? Any difference between interfaces at such an absurdly low throughput, even accounting for massive interrupt overhead (such as classic IDE had, but modern IDE doesn't), would be just a blip on the radar. Your methodology is crap, and the more likely explanation is that the IDE drive had a physical problem such as overspeed or a focusing issue.
The ONLY advantage IDE has is price. End of story.
Wow, I guess we might as well wrap this whole discussion up right now!
Okay, I know it's bad form to reply to your own post, but I thought I'd add the results of a little Pricewatch search I ran.
I said in my earlier post that you can get 3x8MB cache drives plus a 3Ware IDE RAID card for about the cost of one SCSI drive. Here are the actual cost breakdowns. All prices include shipping.
IDE SYSTEM
1 x 3Ware 7500-4 4-port RAID card: $250
3 x Western Digital WD800JB hard drives (IDE; 80GB; 8MB cache) = $219.
TOTAL for IDE system: $469.
Total usable space: 160GB.
Bonus points for RAID-5 redundancy.
SCSI SYSTEM
1 x Adaptec 29320 U320 SCSI adapter (64-bit PCI card): $179.
1 x 73GB U320 10,000RPM drive by Maxtor: $296.
TOTAL for SCSI system: $475.
Total usable space: 73GB.
Bonus points for raw speed.
Now, if you want to run a real benchmark, pit these two systems against each other in a nice server (dual Xeon preferred.) Make sure it's the same hardware being used. This would be a benchmark that I'd be interested in seeing the results from. I'd take any test pitting these types of systems against each other a lot more seriously than any "benchmarks" in the current article.
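For what it's worth, the arithmetic behind those two parts lists can be checked with a few lines of Python. The prices and capacities are the ones quoted above; the RAID-5 usable space is taken as two of the three 80GB drives. The function and key names are just for this sketch.

```python
def summarize(name, parts, usable_gb):
    """Total a parts list and work out dollars per usable gigabyte."""
    total = sum(parts.values())
    return {"name": name, "total": total,
            "usable_gb": usable_gb, "per_gb": total / usable_gb}

ide = summarize(
    "IDE RAID-5",
    {"3Ware 7500-4 RAID card": 250, "3 x WD800JB 80GB": 219},
    usable_gb=2 * 80,  # RAID-5 across three drives loses one drive to parity
)
scsi = summarize(
    "SCSI",
    {"Adaptec 29320 U320 adapter": 179, "Maxtor 73GB U320 10k": 296},
    usable_gb=73,
)

for system in (ide, scsi):
    print(f"{system['name']:11s} ${system['total']} total, "
          f"{system['usable_gb']}GB usable, ${system['per_gb']:.2f}/GB")
```

That works out to roughly $2.93 per usable GB for the IDE array against roughly $6.51 for the single SCSI drive, which is the cost side of the trade. The speed side is what the proposed benchmark would have to settle.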
Check out storagereview.com
Great drive reviews, the best out there..
At the moment, the best scsi drive has about a 2x lead over the best IDE drive in "Server style" loads, and about a 20% lead in desktop type loads.
Note that this really isn't an interface issue, but a market issue. With tagged command queuing in serial ATA, one of the main reasons for SCSI's dominance is gone. Unfortunately, no enterprise class drives support it yet.
The difference between SATA and SCSI is market.
The fastest SATA drive goes for $160, while the fastest SCSI for about $700.
SCSI drives are manufactured for the "no compromise" audience, and are therefore traditionally faster and more reliable.
SATA puts IDE drives in the same interface class as SCSI, and more "enterprise class" drives are starting to be built with that interface.
Given a well-built SATA drive that includes all the SATA features like TCQ, and a drive with the same build quality in SCSI, I bet that the difference would be minimal. There are no comparable products at the moment though, so time will tell..
Blessed are the pessimists, for they have made backups.
I got a Raid 5 ide setup on the Promise SuperTrak Sx6000, and that 3disk array doesn't even come CLOSE to touching the performance of some of my SCSI arrays at my workplace. Do one task, and it's fast. Start any amount of multitasking, and it dies, quick. SCSI would never fall over so badly.
Granted the large caches make up for it, but comparing speed of IDE vs. SCSI is as pointless today as it was 10 years ago. (try finding cheap reliable 15K RPM IDE drives). Price/performance, maybe IDE wins depending on who you are...
I get better real-world performance from my Ultra-2 SCSI drive _over NFS_ than I do for my local ATA/100, and the SCSI disk itself is about 5 years old.
That says something.
"Sometimes, I think Trent just needs a cup of hot chocolate and a blankie." -Tori Amos on Nine Inch Nails
Pieroxy wrote: Dude: 7 minutes vs 28 seconds. That's more than 1:14!!!
Dude: This comparison leaves so much out that it is completely meaningless.
I have actually been part of benchmarks that had such a wide disparity in performance. But we could back up our results with rigorous benchmark standards and results. The benchmarks involved several different database engines on the exact same server hardware, storage subsystem and operating system. All system parameters, data load processes, operating system optimizations, DBMS tuning and queries were carefully documented and reviewed by independent SME's. Benchmark results were documented and reviewed after being repeated multiple times for each system. At the end of it all we had a benchmark that was acceptable to our customers, the engineers and the PHB's.
This so called benchmark would be laughed out the door. SCSI generally performs better for I/O intensive scenarios and that can be proven with appropriate benchmarking. But this "benchmark" was not rigorous or thorough and proved nothing.
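As a bare minimum of what "repeated multiple times and documented" could look like, here is a small timing harness sketched in Python. It is nowhere near the process described above. The workload here is a CPU-only stand-in; a real disk test would substitute the actual read or write job and would also have to deal with the OS page cache between runs, which this sketch does not attempt.

```python
import statistics
import time

def time_runs(workload, runs=5):
    """Run the workload several times and summarize the wall-clock timings."""
    samples = []
    for _ in range(runs):
        started = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - started)
    return {
        "runs": runs,
        "min": min(samples),
        "median": statistics.median(samples),
        "stdev": statistics.stdev(samples) if runs > 1 else 0.0,
    }

def example_workload():
    # Stand-in job; replace with the real I/O task being measured.
    sum(i * i for i in range(100_000))

result = time_runs(example_workload, runs=5)
print(f"{result['runs']} runs: min {result['min']:.4f}s, "
      f"median {result['median']:.4f}s, stdev {result['stdev']:.4f}s")
```

Reporting the spread along with the median is what lets someone else judge whether a 7 minute versus 28 second gap is repeatable or a one-off.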
In my universe I'm perfectly normal, it's not my fault you don't live in my universe.
However, the test is about as bogus and incomplete as the 2.6.0 vs. 2.4.x vs *BSD tests earlier this week.
Old, crufty files on IDE, all over.
Good test would move the files to an empty, freshly formatted IDE drive.
And to an empty SCSI drive (he did just the latter).
And SCSI will be faster and the test will be better.
I have mail scattered across a crufty barracuda. It was NOTABLY faster when I tarred it up to move it to a fresh disk for /home. The exact SAME disk (but without 2 other partitions on it).
So all my files were together and contiguous and on the outer sectors.
RE: Note that the controller is an Ultra160 and is a 64-bit card put into 32-bit PCI slot. The drive itself is an Ultra320. The speed increase would be higher if I were to purchase an Ultra320 controller with a motherboard that supports 64-bit PCI slots.
'scuse me while I wipe up the milk I just blew out my nose.
Yes, in theory this would be faster in a 64 bit slot. And you will run at full speed until that cache is empty (think gazillionth of a second). You would gain if the bottleneck were that pesky 32bit PCI slot. But it's not. After the cache burst is done, you are limited by the disk speed. And a 5400 RPM disk will not put out more than 8-10MB/s in real use (dd is NOT real use).
I *do* use Ultra160 SCSI on RAID boxes that contain 15-20 15000 RPM disks and several hundred MB of battery backed read and write cache. And we get a (real world) 80-100MB/s throughput (again, dd(1) is not real world).
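A quick back-of-the-envelope check of the bus argument, sketched in Python. The PCI numbers are theoretical peaks (width times clock), assuming the usual 33MHz and 66MHz clocks. The disk rates are the real-use figures quoted in this comment, not measurements of the drives in the article.

```python
def pci_peak_mb_s(width_bits, clock_mhz):
    """Theoretical peak PCI bandwidth in MB/s: bytes per transfer times clock."""
    return width_bits / 8 * clock_mhz

buses = {
    "32-bit/33MHz PCI": pci_peak_mb_s(32, 33.33),
    "64-bit/66MHz PCI": pci_peak_mb_s(64, 66.66),
    "Ultra160 SCSI": 160.0,
}

# Sustained rates quoted in the comment above.
single_disk_mb_s = 10
raid_box_mb_s = 100

for name, peak in buses.items():
    print(f"{name:17s} peak {peak:6.1f} MB/s | "
          f"single disk uses {single_disk_mb_s / peak:5.1%}, "
          f"RAID box uses {raid_box_mb_s / peak:5.1%}")
```

By these numbers a single disk at 10MB/s fills well under a tenth of even the plain 32-bit slot, so a wider slot has nothing to speed up. Only the multi-disk RAID box gets anywhere near the bus ceilings.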
It's a pity, because doing these tests CORRECTLY would have been worthwhile. And coming from !Tom's Hardware (just which manufacturers are funding them?) is a good thing. But this is bad science.
I don't even care which one is faster because to me SCSI is just much more sexy than IDE[period]
I wrote this months ago using newer hardware and comparing RAID arrays as well:
Comparing Hard Drives
It was really meant to be a comparison of brands, RAID vs single drives, and SCSI vs. IDE. At the time SATA wasn't out yet and I couldn't get any SATA drives to add to the comparison later.
-Jemis
because he just shelled out $700 on his new drive.
so, he has to make it out as good to prove to himself what he just bought will beat ide.
also, not to mention he never stated brand names.
certain manufacturers are very different.