The Ultimate Linux Box 2001
savaget points to this Linux Journal article which covers building a superior personal computer for general usage. See if you agree with the choices that Rick Moen, Daryll Strauss and Eric Raymond made in building their dream box.
My budget doesn't allow ultimate boxen... I'd be more interested in seeing information on ultra-cheap (but still decent and reliable) systems. An older guide exists, but it hasn't been updated in a long time.
In case you are wondering why he is building two, it clearly states that one is for him, and the other is for Linus Torvalds!
Lucky dog.
Exactly how do you cluster IDE drives? With SCSI I can share the same bus between two different computers, and can present the same disk to both systems at the same time.
IDE is *only* good in a single-drive, single-controller situation, and at that point (according to most drive manufacturers' websites) you are only able to push maybe 35MB/sec, so your so-called controller latency is NOT an issue. Agreed, IDE will perform the same on a single-drive system, but as soon as you add another drive onto that channel you've possibly halved the performance of those two drives. You could add another controller, but that really starts getting ridiculous. (I've got one system with over 300 drives connected to it; I'd like to see an IDE system keep up with that.)
There are also quite a few things in the SCSI protocol that you are overlooking. Command Tag Queueing is a very big one: I can send multiple commands down the SCSI chain and the drive can reorder them to streamline where it's going to be getting data off the drive (setting this gives a significant performance boost on our arrays). On top of that, IDE is completely and totally CPU driven: try really pushing your CPU and you are either going to have to give up CPU cycles to your app or give up performance on your drive.
Could you please provide a link on Google's use of IDE drives for all their storage? I can't seem to find a page saying that their Linux machines are all running on IDE only.
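To make the reordering point concrete, here is a rough sketch of why letting the drive reorder queued commands helps. It is a toy model, not how any real SCSI firmware works: the block addresses, the starting head position, and the function names are all made up for illustration, and it only counts head travel (real drives also account for rotational position).

```python
def head_travel(start, requests):
    """Total distance the head moves when servicing requests in the given order."""
    pos, total = start, 0
    for lba in requests:
        total += abs(lba - pos)
        pos = lba
    return total


def reorder_elevator(start, requests):
    """Hypothetical reordering: one upward sweep from the current
    position, then one downward sweep for whatever is left."""
    up = sorted(r for r in requests if r >= start)
    down = sorted((r for r in requests if r < start), reverse=True)
    return up + down


# Made-up queue of block addresses, in the order the host issued them.
queued = [9000, 120, 8700, 300, 5000]
start = 1000

fifo_cost = head_travel(start, queued)                            # one command at a time, in order
tcq_cost = head_travel(start, reorder_elevator(start, queued))    # drive picks the order

print(fifo_cost, tcq_cost)  # 38560 16880
```

With this particular queue the reordered sweep covers less than half the distance of the in-order case, which is the kind of gain the drive can only find when it has several outstanding commands to choose from.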
http://www.acc.umu.se/~sagge/scsi_ide/#comparis
http://www.dell.com/downloads/global/vectors/at
http://www.adaptec.com/worldwide/product/marked
http://www4.tomshardware.com/storage/01q1/01012
You're thinking of inexpensive ATA RAID, while they explicitly wanted a SCSI solution for speed. But SCSI RAID is _expensive_ - it's professional workstation class hardware, not within the budget for a personal machine (no matter that they say "cost is no object" - clearly there are limits here).
You either do RAID 0 striping or RAID 1 mirroring (with 2 ATA drives) or RAID 0+1 mirroring and striping (this takes 4 drives). Full striping plus parity is RAID 5, but (someone correct me if I'm wrong) this isn't available for inexpensive ATA disk arrays. It would be nice if it were, but it would be slower than using a couple of SCSI disks and taking regular backup images of them. (What's best for backup is for yet another discussion.)
RAID 5 can be had for SCSI disks, at impressive prices, at which point you're better off with Gb Ethernet or Fibre Channel NAS or SAN storage. RAID 5 works with as few as 3 disks, but to do it right you need (some multiple of) at least 9 disks (8 for data, 1 for parity, with data and parity stripes distributed across the array). The RAID 5 stuff gets rather complicated and expensive (have you priced SAN storage lately? I have, and it runs to 5 or 6 figures to just get started).
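For anyone wondering what the parity disk actually buys you, here is a minimal sketch of the XOR parity idea behind RAID 5. It uses 3 data blocks plus 1 parity block instead of 8+1 to keep it short, the block contents and names are invented, and it ignores how a real array rotates parity across the disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three made-up data blocks of equal length, one per "disk".
data = [b"disk0 data", b"disk1 data", b"disk2 data"]

# The parity block is the XOR of all data blocks in the stripe.
parity = xor_blocks(data)

# Pretend disk 1 died: XOR the survivors with the parity to rebuild it.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])  # True
```

The same trick recovers any single missing block in the stripe, which is why one extra disk's worth of parity covers one failed drive no matter how many data disks there are.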
I like their approach for a high-end Linux machine for personal use. I'm using something similar as I write this (Tekram SCSI adapter with two 10K RPM Quantum 9GB non-mirrored disks). They're right to focus on I/O speed as more important than CPU power. Net bandwidth is the real limiter.
In this, they're just following what was learned long ago on mainframes: tune the I/O subsystem first because that's where you find large delays, then make sure you have enough memory (since Virtual Storage impacts Real/Expanded Storage, which impacts Auxiliary Storage - back to I/O), then tune CPU allocation and capacity last. It's well known that when you finally run out of CPU power (having tuned in this order) it's time for short-term triage (favoring "loved ones" at the expense of discretionary workloads) followed by an inevitable configuration upgrade. This is how it's done, folks.