RAID Problems With Intel Core 2?
Nom du Keyboard writes "The Inquirer is reporting that the new Intel Core 2 processors Woodcrest and Conroe are suffering badly when running RAID 5 disk arrays, even when using non-Intel controllers. Can Intel afford to make a misstep now, even with the small subset of users running RAID 5 systems?" From the article: "The performance in benchmarks is there, but the performance in real world isn't. While synthetic benchmarks will do the thing and show RAID5-worthy results, CPU utilization will go through the roof no matter what CPU is used, and the hiccups will occur every now and then. It remains to be seen whether this can be fixed via BIOS or micro-code update."
If you're running raid5 it's probably in an enterprise setup. If so, why aren't you running a dedicated controller? The CPU should have little to no impact on the raid subsystem...
Seems odd to me that the Inquirer is the only one reporting this. How about a real hardware review site?
I don't get what the problem is. Are there specific instructions used often in raid 5 algorithms that are slow on the new chips? Is it bus contention?
MidnightBSD: The BSD for Everyone
Reading the article, it's all about software RAID and the performance they get.
( I have > 2TB of hardware RAID 5 at home so I was wondering ... )
The interesting question is what other pieces of software that we run will get unexpectedly bad performance.
You should be using a controller with a dedicated processor, anyway.
If you don't care about data integrity, throughput, or CPU utilization, like me. Most users will be buying the core 2 for household heating, and not superfluous stuff like data access.
Actually I would trust the Linux RAID5 software setup more than a LOT of the RAID controller firmware setups, which I have had no end of problems with over the years, including a card that rebuilt an array from the new drive on insertion instead of the other way around! Firmware is after all simply software, and software that tends to get a lot less scrutiny than a lot of other classes of software, especially potentially data-eating code in a project like Linux or one of the BSDs.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
I'm slightly confused.
The articles are both very light on technical details, and somewhat vague as to what's really going on. (Admittedly, maybe they don't know it.) In the first article, they allude to the problems being the result of the "softmodem"-like RAID systems that modern integrated motherboards use, which would remove some of the blame from the processor. But then they also suggest that the same problem occurs with dedicated RAID controllers (IBM ServeRAIDs -- I think these are dedicated controllers), which don't cause too much CPU load at all ... further implicating the mobo. However, similar mobos with AMD processors didn't experience the problem, so there's obviously something going on that's Intel's fault.
It doesn't seem like it would be that difficult to pin the blame down to the particular component: is it the integrated RAID subsystem utilizing the processor inefficiently? Or is it the processor itself, being slow? And if it was the processor, why wouldn't this slowness be exhibited in other situations?
Seems to me that what needs to happen is for somebody to do a test with a Conroe processor in a motherboard that doesn't include any of the integrated, offload-work-to-the-processor type of subsystems (RAID, sound, Ethernet), use a 'real' hardware RAID controller, and see what the results are. If there are still problems in that scenario, then there would seem to be something wrong with the processor, and this could be confirmed with synthetic benchmarks.
As a criticism of Intel's complete "systems" (processor plus chipset) I suppose this is valid, but I'd like to see more of a breakdown as to where the performance hit is coming from.
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
This sounds like a timing problem -- the processors are too fast, causing the system to slow down.
There was a similar problem that I had to wrestle with on Linux when running 3Ware RAID controllers w/ RHEL3 on fast dual-processor servers. When battery backed write caching was turned on, the fast acceptance of IO requests (by the CPUs and then by the hardware RAID controller) led to awesome sustained performance for short bursts, but under constant load it would suddenly hit a wall and then IO would practically hang. (https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=121434)
What if you have a lot of photos, music or movies - these aren't unusual things these days. I don't want to go rummaging through DVDs to find the picture I want, I want to fire up f-spot and see it there straight away.
RAID5 provides sensible protection against data loss when using consumer hard disks - software RAID5 is readily available on Linux and hard disks in the 200-300GB range are easily affordable. You can often pick them up for $50 after rebates. So I can get a TB of storage for a few hundred dollars, but to use hardware RAID5 would probably double the cost. Fine if you're an enterprise, but not fine if you're using it at home.
"I agree with this. For most people, backing up your data every week is a LOT better option for data security. Users who should be using RAID 5 should also have dedicated controllers."
You're generalizing a little too much. For example: I have >1TB storage on my mythtv box (I just like to have a good selection of stuff to watch when I finally get to watch tv, and I'm never at home when the shows I like are being broadcast), and I'm using software RAID5 on that. That is, software raid5, on shared controllers: all together seven disks off the mainboard, from a mixture of pata and sata connectors. I wouldn't do this on something like a server, but it's plenty fast enough for mythtv. It also gives a lot of protection for the array of disks, and it's a much, much better option than the weekly backup you suggest (first of all, a backup would take ages and cost waay more in disks (which wouldn't even fit in the HTPC), and last but not least: without raid5, if one disk dies, I could lose up to 7 days of recordings...).
--- Hindsight is 20/20, but walking backwards is not the answer.
I agree, it seems on Slashdot (and actually, to some of my friends) that you're an idiot if you're not running RAID but you're equally dumb if you're running RAID5 because it's not a backup solution. It's as if there can't be any gray area in the matter. People make it seem like RAID5 has no purpose or benefit and everyone should just be using striping+backup. To me, the point of RAID5 or other redundancy RAID setups is that it's your first line of recovery for a disk failure. If a disk fails, you replace it and you've suffered little downtime. If something major happens then yes, you restore from backup.
My other issue is with people forgetting the idea behind being sensible about what needs to be protected and how much it should cost. There is no reason why my personal collection of photos, music and video should cost me so much. Software RAID is way more than adequate for providing a cheap way to store my files. If data protection AND peak performance are what you need, then yes you need to go full hardware. WHERE'S THE MIDDLE GROUND PEOPLE?
No you're the retard, I have a SAN with all the production data, dual channels, dual switches, bazillion raid arrays, whole nine yards. I still use software raid for the system disk: nothing on there needs backing up, all the config files are under version control (repo elsewhere). It's a waste of dual channel fibrechannel disks to store the system image on the SAN, so I use a pair of cheap SATA disks in the system. Raid 1 means I don't have to re-install if I lose a disk, if the fuzz shows up with a warrant for my logs I just hand them one of the mirror disks, cloning a server for a quick cutover is cake, none of which justifies "real money".
I'd also like to see how a Pentium D would perform in the same system, seeing as how it's socket compatible with Conroe. This will help isolate whether it is indeed the CPU or if it's more to do with the I/O subsystem (chipset).
Because if your dedicated controller goes you have to find the same make & model of controller. On no notice. Possibly a few years after that make and model has been discontinued.
With software RAID-5, any controller that works with your host bus (PCI) and HDD bus (ATA, SATA, or SCSI) will do just fine.
Most onboard (Intel) RAID controllers are only set up for 0, 1, 0+1, or 10, and not RAID 5. I don't see how it could possibly be correlated to the CPU.
That's because you can do RAID 0, 1 or any combination of 0 and 1 without needing parity data. The performance killer on RAID 5 (and any other form of RAID that requires parity) is in the XOR operations used to compute and verify the parity information. In order for RAID 5 to perform at a satisfactory rate and not totally bog down your CPU, the XOR calculations should be handled on a dedicated hardware controller, not in software.
However, for non-parity RAID setups the amount of CPU overhead is almost trivial, so referring to "fake RAID" or "software RAID" with the integrated RAID controllers on most motherboards is a misnomer. That being said, at least one of these articles is talking about servers using third-party RAID controllers.
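For readers who want to see what the XOR parity mentioned above amounts to, here is a minimal sketch in Python. It is not taken from any real RAID driver: the helper name `xor_blocks` and the three four-byte "disk blocks" are made up for illustration, and a real implementation works on whole stripes with heavily optimized code. It only shows the principle that the parity block is the XOR of the data blocks, and that any single lost block can be rebuilt by XORing the survivors with the parity.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks from one stripe (contents are invented).
data = [b"disk", b"raid", b"five"]

# The parity block is simply the XOR of all data blocks.
parity = xor_blocks(data)

# Simulate losing the second disk: XOR the survivors with the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

Every write to a stripe means recomputing that parity, which is the work that lands either on the host CPU (software RAID) or on the controller's own processor (hardware RAID).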
It's almost a certainty that this is a software problem of some sort. Driver bugs are the most common source of "hardware" instability, particularly on Windows. Drivers are often written by clueless intern-level engineers, and quickly forgotten once the drivers initially pass basic Windows hardware quality tests.
Oh wait, they did. Not only did they say so a long time ago, they publish documentation and maintain a compiler to help you optimize for the way their processors work.
I rarely criticize things I don't care about.
1. You use xor to do *exclusive-or*
2. Talking about faster in context of 16-bit 0x66 prefixed instructions is hilarious
3. No, it is not "faster", it just doesn't need the constant, so the instruction encodes into a smaller space. That doesn't say anything about being faster. You really need to know the x86 implementation and the relevant details: how it decodes into microcode, is there a trace cache, is there instruction macrofusion and so on.
4. xor reg,reg would be relevant in the 1990s
5. Decoder bandwidth is not an issue unless the code is generated on the fly and most of the code being executed is 'new' (which doesn't happen very often in binaries which are generated offline)
6. Wow.
Users who should be using RAID 5 should also have dedicated controllers.
Unsupported assertion.
Here's mine: "People who are self-described social conservatives should be beat with bricks."
I like mine better.
The fact is that my friend, who knows more about RAID than everyone else reading this (if you tie his knowledge you are very, very good) told me that RAID 5 in software doesn't take that much CPU and hardware RAID is pointless for me in my NAS.
Look at what the Internet Archive uses (no link because if you can't find it, you are probably a social conservative and should report immediately to the brickyard).
There's no "middle ground", there's cost-benefit analysis.
I.e., is it worth my time to spend $50, $100, $200, $500, etc, and an hour a week to mirror a pr0n collection? Some people would say $50 and 5 minutes, and others would say $500 and 6 hours a week. And some would say, "Chunk it. If the disk dies, I'll just download it all again."
"I don't know, therefore Aliens" Wafflebox1
You use XOR to clear a register. XOR CX, CX sets the CX register to 0. It is faster than MOV CX, 0.
wrong !
The instruction is shorter (2 bytes instead of 3 or 5, depending on the word size). But it sets flags on return while MOV CX,0 will keep them intact, so the CPU cannot reorder instructions around a XOR. In the best case (no dependency), they will all need 1 cycle. In the worst case, XOR will stall the pipeline while MOV will not.
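The size figures being argued about above ("2 bytes instead of 3 or 5") can be checked against the standard x86 encodings. The sketch below is in Python only as a convenient way to hold the bytes; the dictionary name `ENCODINGS` and its labels are made up for illustration. It shows encoded length only and says nothing about which form runs faster on a given CPU, which is the point under dispute in this thread.

```python
# Machine-code bytes for a few ways of zeroing CX/ECX (standard x86 encodings).
# The labels are descriptive only; this compares sizes, not execution speed.
ENCODINGS = {
    "xor cx, cx   (16-bit mode)": bytes([0x31, 0xC9]),
    "mov cx, 0    (16-bit mode)": bytes([0xB9, 0x00, 0x00]),
    "xor ecx, ecx (32-bit mode)": bytes([0x31, 0xC9]),
    "mov ecx, 0   (32-bit mode)": bytes([0xB9, 0x00, 0x00, 0x00, 0x00]),
    # In 32-bit mode, a 16-bit operand needs the 0x66 operand-size prefix.
    "xor cx, cx   (32-bit mode)": bytes([0x66, 0x31, 0xC9]),
}

for name, code in ENCODINGS.items():
    hex_bytes = " ".join(f"{b:02x}" for b in code)
    print(f"{name}: {hex_bytes} ({len(code)} bytes)")
```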
Err, NO! It's about FAKERAID, which is a H/W S/W combo.
RAID stands for Redundant Array of Inexpensive/Independent Disks. Nowhere does it say "Controlled By A Dedicated CPU" ("RAIDCBADC"? Doesn't quite sing like "RAID"). Software RAID is as much RAID as a top of the line server RAID controller with RAM and a battery backup. It isn't as fast, sure, and it loads the system CPU, but it is still RAID. Calling it "FAKERAID" is just pretentious and misleading. The data integrity benefits are still present, as are some performance benefits in some circumstances (in fact, Linux RAID is demonstrably faster in some workloads than a top end Adaptec hardware RAID controller, though this is the exception rather than the rule)
That said, I hate pretty much all RAID controllers (whether software or hardware). Linux software RAID means that I can drop the disks into any PC and access the data. Every RAID controller from Promise, Adaptec, and Tektronic requires me to use their disk format, and if I lose the controller I lose the data until I can get another controller. Sure, in high availability environments, you keep a spare...but with Linux software RAID, every PC in the office is a spare controller. That's my kinda redundancy. I've even had two identical Adaptecs with different firmware lead to pretty massive data loss during a server migration. Thankfully there were good backups. I've never had similar problems moving Linux software RAID disks into a new Linux box.
What about hotswapping? AFAIK, you still can't do that with software RAID and SATA drives... (at least with Linux 2.6.x - never tried with Windows or Mac OS X). A hardware (SATA) RAID controller is your only option, unless you spend $$$ on a SCSI disk subsystem