Slashdot Mirror


RAID Problems With Intel Core 2?

Nom du Keyboard writes "The Inquirer is reporting that the new Intel Core 2 processors Woodcrest and Conroe are suffering badly when running RAID 5 disk arrays, even when using non-Intel controllers. Can Intel afford to make a misstep now, even with the small subset of users running RAID 5 systems?" From the article: "The performance in benchmarks is there, but the performance in real world isn't. While synthetic benchmarks will do the thing and show RAID5-worthy results, CPU utilization will go through the roof no matter what CPU is used, and the hiccups will occur every now and then. It remains to be seen whether this can be fixed via BIOS or micro-code update."

13 of 284 comments

  1. Re:Why aren't you running a dedicated controller.. by andrewman327 · · Score: 4, Interesting
    I agree with this. For most people, backing up your data every week is a LOT better option for data security. Users who should be using RAID 5 should also have dedicated controllers.


    Still, this is a problem for Intel. Their products are supposed to do what they do extremely well under all conditions. I hope that they find a way to fix this admittedly niche problem.

    --
    Information wants a fueled airplane waiting at the hangar and no one gets hurt.
  2. Re:Why aren't you running a dedicated controller.. by Anonymous Coward · · Score: 1, Interesting

    Intel might've screwed this up, but it will only affect non-professional IT.

    Or those with budgets.

  3. Re:Why aren't you running a dedicated controller.. by cnettel · · Score: 2, Interesting

    My personal "analysis" is that this sounds much more like a DMA issue, either in chipsets, in the processors, or in OSes. Core 2 does some speculative prefetching and uses a quite different cache management scheme, so one naive idea is that some piece of code or hardware got away with doing things improperly before, and a very rare race condition has now become commonplace. If that's the reason, it might be easy to fix. Of course, it might also mean that the prefetching or cache sharing between the cores (or a couple of other things) is actually faulty...

  4. Re:Problem by TheRaven64 · · Score: 2, Interesting
    Software RAID 5 does:

    Load byte 1.

    Load byte 2.

    XOR bytes 1 and 2.

    Store result.

    There are a few things that could be wrong here. The XOR performance could be bad. This seems a bit unlikely but XOR is not an incredibly common operation so it wouldn't slow down too much else.

    It could be that the pattern of data was bad for cache usage. This would be slightly odd, since it should be a series of 4K linear blocks.

    It could be low I/O performance between the chip and the on-board controller. This seems the most likely; there could well be some multiplexing issues too. I would be interested to see what the results are using a Core Solo.
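
    The load/XOR/store loop above is the core of RAID 5 parity. As a minimal sketch (the function names xor_blocks and rebuild_missing are made up for illustration and are not taken from any real RAID driver), parity is the byte-wise XOR of the data blocks in a stripe, and a lost block can be rebuilt by XORing the surviving blocks with the parity:

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equal-length byte blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple per byte position across all blocks.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical two-byte "blocks"; real stripes use much larger blocks.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)
recovered = rebuild_missing([data[0], data[2]], parity)
```

    Here recovered equals data[1], because XORing a value in twice cancels it out. A real implementation would work on whole machine words or SIMD registers rather than single bytes, but the memory access pattern (stream in, XOR, stream out) is the same.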

    --
    I am TheRaven on Soylent News
  5. *NOT* just on-board RAID by Anonymous Coward · · Score: 2, Interesting

    From TFA:
    The reason was that there were severe problems when Woodcrest was paired with a 1E RAID field when using IBM ServeRAID controllers. The problems didn't occur just in benchmarking, it was the every-day usage model that produced unexpected errors.

    ServeRAID controllers aren't some cheapo CPU-based RAID, so it looks like this might be a more serious problem.

  6. Re:Why aren't you running a dedicated controller.. by lukas84 · · Score: 1, Interesting
    Or those with budgets.


    If you can't afford to do something right, don't do it.
  7. Re:These are the cheesy RAID cards, right? by Billly+Gates · · Score: 2, Interesting

    I was going to recommend 3ware as well. I haven't done any administration work in years, but one of my former employers uses them for their servers for that reason. If it dies you can replace the board, and we even have a few stored in case of a failure.

    Organizations should look to this, rather than to their server vendor, for any RAID setup. It would be nice if they all did, as a server is not a desktop and the data is needed NOW when it goes down.

  8. Re:Problem by ivan256 · · Score: 2, Interesting

    Seems more likely to be a scheduling issue to me...

    Core 0 loads byte 1, Core 1 loads byte 2, Core 0 or Core 1 has a cache miss on the XOR... (Do the cores share a cache?) Or it could be a locking problem. XOR is very common, and it would surprise me if it was slower than on previous Intel chips.

  9. Re:Why aren't you running a dedicated controller.. by kimvette · · Score: 2, Interesting

    Actually the market has become so diluted with everyone's jumping into the RAID game (thanks to Highpoint Tech and Intel with their hybrid solutions) that it's becoming increasingly difficult to discern the true hardware RAID controllers from the hybrid models. Of course there are the companies that won't so much as touch software RAID (namely 3ware) but Promise, Koutech, and even Adaptec all are very slick with their descriptions of the controllers and make it unclear as to whether or not their products are actual RAID controllers or if they offload the processing to the CPU. If you want to give a small business (mom & pop or a larger business with a tightass PHB who sees IT as solely a cost center rather than an essential tool to keep things going) better assurance of data integrity than a single HDD will provide, and they're NOT willing to back up regularly, and obviously won't spend $300-$700 for an entry level GOOD RAID5 controller, then a hybrid solution may be all that you can offer them. Given that these controllers are being implemented on motherboards more and more now, the performance they provide has to be reasonable, without hogging the processor.

    Also, when you do find a hardware controller: will it run in your board? In other words, if it's PCI, do you actually have a PCI slot to fit it? This is especially a problem in a high-end consumer box or in a lower-end workstation, where you might have one or two PCI slots and the rest are PCI-E x1 slots. Where you're likely going to have a GOOD sound card and a capture card in your legacy PCI slots, or maybe a multi-port Firewire 400 card, where is the hardware RAID controller going to live? Obviously the solution is going to be to go with an embedded solution on the motherboard, hopefully with a model that doesn't totally suck.

    --
    The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
  10. Re:These are the cheesy RAID cards, right? by AdamTheBastard · · Score: 2, Interesting

    "If your RAID-controller fails, you have to get another controller exactly the same"

    This is why we always buy a spare card whenever we get a new RAID controller. That way we know that there will be something that will read the disks and know how the data is set up. Next time you buy a RAID controller remember to get another one just like it. Otherwise you might be stuck with disks that will only be read by a card that is out of production.

  11. Re:These are the cheesy RAID cards, right? by Anonymous Coward · · Score: 1, Interesting

    The performance penalty for software RAID is minimal unless your file server is also your build server or render farm, in which case it probably won't matter much because you're accessing data from local disk.

    I have a software RAID 5 with encryption on 12 PATA disks attached to a dual-core AMD X2 3800. It has over 2.1GB on a single filesystem and can easily saturate 100Mbps for about $2000. It would cost 2x the price to build this system with hardware RAID.
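
    For a rough sense of the numbers involved, here is a small sketch of the arithmetic. The helper names and the 200 GB disk size are assumptions for illustration only and are not from the comment above. RAID 5 gives up one disk's worth of space to parity, and a 100 Mbps link carries at most 12.5 megabytes per second, which is a low bar for an array to saturate:

```python
def raid5_usable_capacity(disk_count, disk_size_gb):
    """Usable capacity of a RAID 5 array: one disk's worth goes to parity."""
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (disk_count - 1) * disk_size_gb


def mbps_to_megabytes_per_second(megabits_per_second):
    """Convert a link speed in megabits/s to megabytes/s (8 bits per byte)."""
    return megabits_per_second / 8


# Hypothetical example: 12 disks of 200 GB each (disk size is assumed).
usable_gb = raid5_usable_capacity(12, 200)
link_limit = mbps_to_megabytes_per_second(100)
```

    With these assumed inputs, usable_gb is 2200 and link_limit is 12.5, ignoring filesystem overhead and network protocol overhead.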

  12. Re:Likely a driver bug by Anonymous Coward · · Score: 1, Interesting

    I oversaw testing of those drivers at Intel in Hillsboro, OR. Believe me when I say, the practices at the WHQL labs and testing for the RAID stuff are scary. When we were doing the bulk of the RAID work on the chipsets, we were transitioning over to a new bug tracking system, nobody knew what was going on, and many many bugs and issues were lost or simply not filed or followed up on. I raised hell with several people over the situation and the development deadlines - we had to adhere to the workweek schedule no matter what, no matter what, no matter what, the Intel Borg says, and you jump. I dropped it. Personally I am so SO not surprised. I saw this whole thread... my initial reaction was Ha Ha. It's smug face day around the office as this is making the rounds.

  13. Re:Why aren't you running a dedicated controller.. by Salamander · · Score: 2, Interesting

    If it happens with a dedicated controller such as ServeRAID, then my first hunch would be that the chipset isn't handling memory contention very well. We used to see this at Dolphin a lot; the Intel chipsets at the time would behave terribly if there was any kind of serious memory traffic coming from the "far side" of the memory controller. This could also be a problem on the "softmodem-like" RAID controllers, where one core is trying to bring previously DMAed data in for its XOR while the other is trying to do its normal stream of memory accesses. It wouldn't be quite the same kind of thrashing as in the previous case, but it's very easy to imagine that it would still occur.

    --
    Slashdot - News for Herds. Stuff that Splatters.