Slashdot Mirror


Are Bad RAM Chips Common?

A semi-Anonymous Coward asks: "I recently built myself a new system using a mainboard which required using registered DDR SDRAM -- the motherboard will not work with unbuffered / unregistered memory, and I wanted the extra integrity provided by registered memory anyway. To my amazement, both the memory I purchased with the board and one of two other sticks I purchased were either defective or simply incapable of working with the board (which is the Chaintech 7KDD, BTW). About how often do people run into defective memory, and do they see them from the 'reputable' manufacturers as often as they do the 'no-name' ones? Now that I've spent a ridiculous amount of money on this, I'm a lot more wary."

78 comments

  1. When lives are at stake ... by Glonoinha · · Score: 3, Informative

    I have run into bad RAM a few times; I quit buying the cheap stuff and only deal with Crucial, and have had excellent luck with them.

    --
    Glonoinha the MebiByte Slayer
    1. Re:When lives are at stake ... by NemoX · · Score: 1

      I agree, Crucial is one of the best ones I have come across: not one dead stick since I started using them in 1997. Of the 12 Kingston chips I have come across, 4 of them were dead... so much for their reputation, IMO.

    2. Re:When lives are at stake ... by KDan · · Score: 1

      What's the problem? You guys must have really crummy hardware shops. All the shops I shop at will take the memory back, test it immediately and give a replacement if they also find it doesn't work.

      Daniel

      --
      Carpe Diem
    3. Re:When lives are at stake ... by DetrimentalFiend · · Score: 1

      I'm wondering if the person who asked the question got generic ram. What most people don't realize is that generic ram is really the crucial/micron/etc ram that's not good enough for the brand name. Basically if ram doesn't pass a test, then they try to salvage it. If they can salvage the ram then it's sold under an agreement that the re-seller can not tell anyone where it came from. The really crappy stuff goes to answering machines and other similar devices. But the bottom line is that if you buy anything but the name-brand ram, you are just asking for trouble.

    4. Re:When lives are at stake ... by Glonoinha · · Score: 3, Informative

      Daniel - generally it isn't the fact that a chip (or whatever, actually) is bad, it is the hassle associated with a bad chip. I got a cheap (bad) chip for my g/f's laptop and it developed very subtle problems: it would lock up from time to time and it was not blatantly obvious what the problem was. I ended up reinstalling Win98 twice (I was pretty eager to blame MS, to no avail) and after upgrading her to Win2000Pro and still having problems I remembered adding the RAM so I pulled it out. Problems went away.

      The local hardware shops will eagerly replace my cheapo RAM with different cheapo RAM but they can't replace 10 hours worth of diagnostics, lost files, scrambled data, the half hour each way drive it takes to get to their store, etc...

      What happens if the RAM is marginal only at certain temperatures or under certain loads, circumstances they can't replicate on their test gear? You go back to the house and pop it back in and go back to having problems, but this time you are SURE it isn't the RAM so you start replacing other parts (mobo, video card, NIC, caching SCSI RAID controller card) all out of your pocket trying to make it stop blue screening (or whatever) and be a stable work environment ... when it is still the RAM.

      Once you start using the hardware for work the cost (value) of the hardware is negligible compared to the cost (value) of the actual data ... I have had laptops worth $1,000 carrying a half million dollars worth of development code on them. If someone tried to steal that laptop I wouldn't be killing them over the value of the laptop, I would be killing them over the value of the IP contained within.

      --
      Glonoinha the MebiByte Slayer
    5. Re:When lives are at stake ... by KDan · · Score: 1

      I see your point, but on the other hand, faulty RAM-induced crashes are not so hard to diagnose if you're running Linux. Just check out the syslog, and you get this characteristic pattern of page faults and such, and then usually you can be pretty damn sure it's the RAM. Of course, on a laptop that's not really practical, I guess...

      Daniel

      --
      Carpe Diem
    6. Re:When lives are at stake ... by techman · · Score: 1

      try http://www.allcomponents.com

      lifetime warranty and dirt cheap prices

  2. One out of dozens, perhaps hundreds by Faldgan · · Score: 1

    On the other hand, it was one I purchased recently (about 4 months ago) so perhaps it's not an issue of overall quality, but an issue of quality right now at some plant? It was DDR memory. I've never had a problem with normal SDRAM, EDO or FPM.

    --
    Nathan Brazil?
  3. Hello world, hear the song that I'm singing by Anonymous Coward · · Score: 0

    thrid psot

  4. occasionally by Anonymous Coward · · Score: 4, Informative

    I have occasionally run into bad memory. A very handy utility can be found at http://www.memtest86.com to verify that your memory is bad and to find the specific address ranges that are no good. You can then specify those address ranges to the Linux kernel, and applications will not be able to malloc the bad memory, so the system runs stably despite having bad RAM.
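The comment above does not say which kernel mechanism it means. As a rough sketch, one way to do this on a mainline kernel is the `memmap=size$start` boot parameter, which marks a region as reserved. The helper below is hypothetical: it takes failing addresses (as a tester such as memtest86 might report them), rounds them to whole pages, merges neighbouring pages, and prints the arguments. The 4096-byte page size is an assumption, and the `$` usually needs escaping in bootloader configuration files.

```python
def memmap_args(bad_addresses, page_size=4096):
    """Build memmap=size$start arguments covering every page that holds a bad address.

    Hypothetical helper for illustration; page_size is an assumption.
    """
    # Reduce each failing address to the page that contains it.
    pages = sorted({addr // page_size for addr in bad_addresses})

    # Merge runs of adjacent pages into [first_page, last_page] ranges.
    ranges = []
    for page in pages:
        if ranges and page == ranges[-1][1] + 1:
            ranges[-1][1] = page
        else:
            ranges.append([page, page])

    # Emit one reserved region per merged range.
    args = []
    for first, last in ranges:
        size = (last - first + 1) * page_size
        start = first * page_size
        args.append(f"memmap={size:#x}${start:#x}")
    return args


if __name__ == "__main__":
    # Example failing addresses (made up for the demonstration).
    for arg in memmap_args([0x1000, 0x1FFF, 0x2000, 0x10000]):
        print(arg)
```

Reserving whole pages wastes a little good memory around each fault, but the kernel manages memory in pages anyway, so finer granularity would not help.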

    1. Re:occasionally by Sepper · · Score: 1

      From a sysadmin point of view, memtest86 is REALLY useful. It can save you A LOT of headaches. Example: I was able to install Windows 98 from the boot CD on a system with defective memory (it installed properly, but could not run for more than 5 minutes). It took me some time to find the exact problem, and I found it when an old DOS boot disk failed to load himem.sys because of "error at address X". (Now I use the LNX-BBC CD, which comes with memtest.)

      At least, Windows is not that much bug-ridden anymore...

      you can then specify those address ranges to the linux kernel and applications will not be able to malloc the bad memory

      Any possible way to do that with another OS?

      --
      I live in Soviet Canuckistan you insensitive clod!
    2. Re:occasionally by Anonymous Coward · · Score: 0

      Being at least a little robust to bad memory is a good thing, not a bug.

  5. Never... by macemoneta · · Score: 1

    Buying from Crucial and Kingston, and using proper anti-static handling, I've never had a bad memory. Knock on wood.

    --

    Can You Say Linux? I Knew That You Could.

  6. Never (ever) by Anonymous Coward · · Score: 0

    I worked at a local "mom & pop" PC store, and they used the cheapest they could get for new computers. Of the hundreds of systems I built, I never got a bad stick of RAM - and I'd leave MemTest86 on overnight to be sure.

    Have you tried the ram in a different motherboard? Are you sure your MB can use the RAM you got? The newer NForce2 boards are sorta picky.

  7. Cheap ram = bad ram by pr0c · · Score: 5, Informative

    I do a lot of side work dealing with computer upgrades. I outright give 2 options:

    1.) We get cheap stuff and save you money. I make it very clear that it may not work.
    2.) We get normally priced RAM and can be sure it's good.

    Of the few people who did not want to spend the money to get a good brand, even with me warning that it's a bad idea, about 1 in 3 RAM chips did NOT work. I've NEVER had a good brand (Crucial, Kingston, etc.) fail even 1 time. I don't gamble on my system; I use Corsair XMS and that's what I recommend, but anyway, that's what I've found.

    My Rough Stats:
    Cheap memory: 30%+ failed
    Good memory: 0% failed
    This is only based on about 100 experiences in the last few years; I don't do much side work.
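To put the rough 30% figure in context, here is a small worked calculation. It assumes each stick fails independently with the same probability, which is a simplification (bad batches would make failures correlated). The function names are made up for this sketch.

```python
from math import comb


def prob_at_least_one_bad(p, n):
    """Chance that at least one of n sticks is bad, given per-stick failure rate p."""
    return 1 - (1 - p) ** n


def prob_exactly_k_bad(p, n, k):
    """Binomial probability that exactly k of n sticks are bad."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


if __name__ == "__main__":
    p = 0.30  # the rough cheap-RAM failure rate quoted above
    print(f"At least one bad stick out of 2: {prob_at_least_one_bad(p, 2):.1%}")
    print(f"Exactly 2 bad sticks out of 3:   {prob_exactly_k_bad(p, 3, 2):.1%}")
```

Under these assumptions, the original poster's experience of two bad sticks out of three would not be especially unusual with cheap memory, though incompatibility with the board remains an equally plausible explanation.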

    1. Re:Cheap ram = bad ram by GigsVT · · Score: 1

      I have similar experiences with failure rates, at least 2 out of 10 bad with cheap ram, with another 10% of the cheap ram developing problems later on. Don't use cheap RAM anymore, so this is all based on cheap RAM from about 2 years ago.

      I've also had 100% success with Crucial and good name RAM. One caveat I have to mention, we had serious trouble using Kingston RAM on a Tyan TigerMP. Memtest86 didn't turn up anything wrong, but the OS would hang within an hour of uptime. Never did quite figure that one out, and the RAM apparently wasn't bad, it worked fine in another system, but switching the RAM to Crucial fixed it.

      So, I guess the bottom line is pretty much the same as your post, but also might want to ask around or search around for the motherboard you are going to get to make sure it isn't allergic to any particular good brand name.

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
    2. Re:Cheap ram = bad ram by FueledByRamen · · Score: 1

      I agree with your comment about the Tyan board. My friend had an S2460 Tiger MP board that would ONLY accept a single 512mb RegDDR 266 stick from Crucial. (It probably would've taken a 1gb, but he was too poor at the time.) I told him that he should get rid of that board, but the problem solved itself - his power supply died and the resulting surge set it on fire, burning its AGP slot, voltage regulators, northbridge, and a few other miscellaneous chips into a ball of melted plastic and silicon. The RAM did survive, though, along with both CPUs and all expansion cards! Unfortunately, the power supply was nice and melty, the CD burner burned its last, and his floppy drive took a hit for the team.

      --
      Every cloud has a silver lining (except for the mushroom shaped ones, which have a lining of Iridium & Strontium 90)
    3. Re:Cheap ram = bad ram by GigsVT · · Score: 1

      The TigerMP is hard on the power supplies and voltage regulators because it only has a single ATX power connector and not the supplemental connector that most MP boards have these days. The arrangement and location of the voltage regulators also makes it hard to get good airflow over them in a lot of cases. It's really a kinda poorly engineered board. I used to like it, but after working with several of them, they've been nothing but headaches.

      We have one that we can only run in single processor mode for some reason. Don't know if it is software or hardware, but it will reliably crash hard when you run ghostscript on a large file.

      I spent days trying to track down that problem, swapping RAM, CPU, motherboard, power supply (2x400W redundant), adding more case fans, even cutting a hole in the case to put a new fan directly over the VRM/CPUs, using an HVAC meter to measure airflows, trying different kernels and kernel options, etc. I gave up. That system is permanently single processor now. Since we swapped everything out except the case, we wrote it off as a cursed case.

      Another identical system in an identical case, doesn't have the problem. Go figure.

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
  8. Bad Memory... by ewhenn · · Score: 1

    It usually pays to buy some higher quality RAM. I have used cheapo generic and higher quality RAM and have witnessed the difference. The more expensive top grade RAM is more stable and I have had fewer issues with bad modules.

    As a side thought, with ever increasing RAM densities/capacities the chance of errors and bad modules goes up. It's just a fact of the manufacturing process. Make sure not to forget about that.

  9. It's all with the static. by jasamaman · · Score: 1

    I've only busted RAM chips when I didn't use anti-static protection. They were working, and then I carried them to another computer, and they were zapped in the process. So, make sure you wear your ESD straps and handle it safely!

    --
    Someone ever tries to kill you, you try to kill them right back!
  10. Avoid risk - use less memory. by zulux · · Score: 2, Interesting

    For a lot of the FreeBSD / Samba servers that I use, I simply remove most of the memory. Less memory - less risk that the system will run into a bad batch. Don't remove so much that you end up thrashing (thrashing could expose errors in the bus or potentially over-stress your hard drive) - but for normal (not high performance) file serving, nothing is gained by having huge quantities of memory.

    --

    Moneyed corporations, non-working 'poor' and criminal prisoners are turning productive citizens into tax-slaves.

  11. I returned a stick 12 times to Fry's once by linzeal · · Score: 1

    And they gave me some high priced RAM eventually, heh. It was for my home entertainment PC, which would freeze with only 128 megs when the PVR began recording. It works fine with this blue RAM; I forget the name.

    1. Re:I returned a stick 12 times to Fry's once by Anonymous Coward · · Score: 0

      I would've given your money back and showed you to the door.

      Buy cheap shit, DO NOT expect high-end service. They lost money on you. Keep doing it, and you'll see places like Fry's disappear.

  12. My experiences by FueledByRamen · · Score: 2, Interesting

    I've used all sorts of different RAM in many systems:

    My SGI Indy is using old 72-pin SIMMs that I found on my (carpeted, read: static-inducing) floor under my desk.
    My SparcStation 20 uses RAM that was shipped to me rattling about in a cardboard box without any packaging material whatsoever.
    My K6-2/400 used crappy no-name (not even a brand to be found on the chips themselves) 256mb PC100 DIMMS
    My Athlon TBird-1.4 used the cheapest no-name crap DDR RAM I could find on Pricewatch - 512mb of it - but at least Infineon's name was printed on the chips.
    My Athlon XP1800+ used just about the cheapest RAM possible (I bought it from NewEgg instead of some other vendor for about $2 more) with no names on the modules or chips.
    A Dual Athlon MP2000+ server I built uses no-name Reg. DDR266 / 512mb from a Pricewatch low-baller.
    My current P4/2.4 uses 256mb DDR266 by OCZ Systems, and not because of the brand recognition (who the hell are they?) - it was super-cheap. I bet if I removed the RAMsink on it, the chips would be nameless.

    None of these systems have ever had memory problems. They rarely, if ever, crash (or at least they didn't crash when I had them - some have passed on into the hands of friends). Maybe I'm just one really lucky bastard when it comes to RAM, but I've never had any problems buying the cheapest shit memory so that I could save a few bucks.

    Also, somehow, I have managed never to kill a component with static electricity. The worst that's happened is that I rebooted my Atari 800 by zapping it right on the motherboard while it was running. In fact, the only component that I've ever bought or installed that didn't work was a Sun Creator3d framebuffer - the only component I've ever used an anti-static wristband to install (because there was a free one in the box) and it was DOA with big vertical lines running through the picture at regular intervals (4 pixels). Well, that and two Fibre Channel drives that exploded, but that's because I was hot-swapping them and shoved the power connector into the (worn-down, self-installed) receptacle backwards.

    --
    Every cloud has a silver lining (except for the mushroom shaped ones, which have a lining of Iridium & Strontium 90)
    1. Re:My experiences by photon317 · · Score: 4, Insightful


      You've been lucky on RAM for sure. Now about this static discharge thing. I also never used to use wriststraps or any other static precaution working on home stuff. I always did it right at work because it was required, but at home I routinely handled components in just about every way that could static-damage them, because I figured it was unlikely to cause a problem. My experience was always that the components worked fine anyways, and that ESD damage must be such a rare occurrence that you're just not likely to ever see it, so it's not worth the trouble.

      However, later on down the line I learned the error of my ways. I was failing to understand the nature of ESD damage. Someone finally clued me in. In short, ESD damage *does* happen with a surprisingly high frequency when you handle components unsafely, but you don't notice because the damage takes time to show. Essentially the high voltage of the ESD (ESD like when you shock yourself on a doorknob is very high voltage, it's just very low current) is destructive to the transistor junctions, but it usually doesn't cause immediate complete failure. A few days, months, or even years down the road, the junction will prematurely break down, having had a shortened lifespan because of the ESD damage.

      So those components that failed on you after a few good years of service that you chalked up to just failing from age probably failed to a large degree from ESD back when you first installed them, and had you used the right precautions, they might've lasted a lot longer. Now that I understand this, I'm a lot more careful about ESD even at home. From what I read, the long-term effects of ESD over a large sample are better felt by electronics companies. They can actually see the warranty return rate on their chips drop consistently when they put better ESD precautions into place, although it may take a few years to see.

      --
      11*43+456^2
    2. Re:My experiences by Spoing · · Score: 4, Informative
      [ Slash long list of systems ]

      1. None of these systems have ever had memory problems. They rarely, if ever, crash (or at least they didn't crash when I had them - some have passed on into the hands of friends). Maybe I'm just one really lucky bastard when it comes to RAM, but I've never had any problems buying the cheapest shit memory so that I could save a few bucks.

      Out of sight, out of mind.

      Being a former test lead for a memory diagnostic tool, I'd bet you had plenty of memory errors. When they occurred, they didn't 'look' like memory errors, so you treated a different problem. Your fix 'worked', so you claimed success and moved on. Other errors might not have symptoms -- even if corruption did occur -- so you didn't notice anything was wrong.

      1. Basic example: One bit errors let alone other more complex defects can pass hardware parity checks (change a bit here and it flips a bit in a physically similar area).
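As a sketch of why parity alone is a weak safeguard: a single parity bit per byte detects any odd number of flipped bits but is blind to an even number, so a defect that disturbs two neighbouring bits at once (one reading of the parenthetical above) passes the check. The bit patterns below are arbitrary examples, and even parity is assumed.

```python
def parity_bit(byte):
    """Even parity: the extra bit that makes the total count of 1s even."""
    return bin(byte).count("1") % 2


def passes_parity(stored_byte, stored_parity):
    """True if the byte read back is consistent with its stored parity bit."""
    return parity_bit(stored_byte) == stored_parity


original = 0b10110010
stored_parity = parity_bit(original)

one_flip = original ^ 0b00000100   # a single bit corrupted
two_flips = original ^ 0b00001100  # two adjacent bits corrupted together

if __name__ == "__main__":
    print("one flipped bit passes parity: ", passes_parity(one_flip, stored_parity))
    print("two flipped bits pass parity:  ", passes_parity(two_flips, stored_parity))
```

ECC memory uses more check bits per word precisely so that it can correct single-bit errors and detect double-bit ones.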

      The stats given by others -- ~30% failure on cheap memory and 0% on good within the first month -- are close to my experiences. IMNSHO, the initial numbers are the same (~30% & ~0%). Over the lifetime of a system +10% of both cheap and good memory tends to fail (or get wrecked by bad power).

      To catch the +10% failure rate on non-ECC memory, and to catch memory subsystem errors in general, I run extensive tests on systems that can be taken down about once a year -- this is beyond any tests to diagnose flaky behavior.

      Memtest86: It is excellent and as good as any other memory diagnostic software I've ever used when running all tests. As a matter of course, I add memtest86 to the boot menu on all x86 systems.

      BIOS memory tests: The boot up memory tests are useful only to identify that the memory exists, so if possible I turn them off.
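For readers unfamiliar with what a pattern-based memory test does, here is a toy illustration. It is not Memtest86's algorithm, and user-space Python cannot exercise physical RAM; `FaultyMemory` is a made-up simulation with one bit stuck at 0. The point is only that a fault shows up for some data patterns and hides for others, which is why a quick existence check at boot catches so little.

```python
class FaultyMemory:
    """Simulated RAM with one bit stuck at 0 (hypothetical, for illustration only)."""

    def __init__(self, size, bad_addr, stuck_bit):
        self.cells = bytearray(size)
        self.bad_addr = bad_addr
        self.mask = ~(1 << stuck_bit) & 0xFF

    def write(self, addr, value):
        # The faulty cell silently drops the stuck bit on every write.
        if addr == self.bad_addr:
            value &= self.mask
        self.cells[addr] = value

    def read(self, addr):
        return self.cells[addr]


def pattern_test(mem, size, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Write each pattern to every cell, read it back, and collect failing addresses."""
    errors = set()
    for pattern in patterns:
        for addr in range(size):
            mem.write(addr, pattern)
        for addr in range(size):
            if mem.read(addr) != pattern:
                errors.add(addr)
    return sorted(errors)


if __name__ == "__main__":
    mem = FaultyMemory(size=1024, bad_addr=300, stuck_bit=3)
    print("all-zero pattern only:", pattern_test(mem, 1024, patterns=(0x00,)))
    print("full pattern set:     ", pattern_test(mem, 1024))
```

Real testers go much further (address-dependent patterns, varied access orders, long soak times) because many faults depend on neighbouring cells, timing, or temperature.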

      --
      A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
    3. Re:My experiences by irix · · Score: 3, Informative

      ESD damage *does* happen with a surprisingly high frequency when you handle components unsafely, but you don't notice because the damage takes time to show

      I used to work at a semiconductor manufacturing facility once upon a time. Let me just say that this is 100% correct.

      My employer spent a lot of money on ESD prevention because ESD errors were the worst kind of errors. Sometimes the chip would fail catastrophically, but usually it would pass probe and test and get shipped, only to fail prematurely in the field (latent failure). This is obviously much more expensive than finding the problem before the device ships.

      Another common misconception is that you need to feel the ESD charge - like walking across a carpet in sock feet and touching a doorknob - in order for damage to occur. This is false - most electronic components can be damaged at a much smaller voltage than you can feel in your body.

      My best advice is that simple ESD precautions like a wriststrap are cheap, so use them.

      --

      Do you even know anything about perl? -- AC Replying to Tom Christiansen post.
  13. no bad experiences at all... by the_real_tigga · · Score: 1

    Contrary to most of the above posters, I have always been buying the cheapest RAM I could get, along with second-hand(!) brand-name RAM, and I have never ever had a single bad RAM chip since my 386 days.
    I always handle my computer parts with extreme (some say paranoid) care for static, I must add.

    Save one occasion when I mixed four different kinds of SDRAM in an AMD K6/2 based system. And I mean different timing and clock specs, and a different number of chips on each RAM bar. I got segfaults and bluescreens all over, but removing _either_ bar fixed it.

    --
    my .sig is better than yours.
  14. Run Memtest86... by (H)elix1 · · Score: 4, Informative

    Memtest86 will go a long way to test the ram. If you are going through tons of wanky ram, the issue may be your cpu or power supply however. Test the ram on a couple boxes.

    As for no-name: usually grade 'A' RAM will run at a lower CAS rating, whereas some of the generics might only work at a higher (and slower) setting. Stuff that rates at PC-100 CAS 2 might only work at PC-133 CAS 3. (dang, showing my age) The good stuff tended to be able to run stable at the faster FSB and CAS settings. My time is worth more than the ~$30 difference between solid and guesswork.

    If you're not pushing a system hard - cheap RAM might just work. A few years back a local vendor had some dirt cheap no-name 128M sticks that ran as fast as my Mushkin stuff. Go figure. You roll the dice, but it matters less if you are not pushing your settings hard.
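The CAS figures are easier to compare once converted to time. CAS latency is counted in bus clock cycles, so the delay in nanoseconds is cycles divided by clock frequency. The sketch below uses nominal 100 MHz and 133 MHz clocks as an approximation; `cas_latency_ns` is a name invented for this example.

```python
def cas_latency_ns(cas_cycles, clock_mhz):
    """Convert a CAS latency in clock cycles to nanoseconds at the given bus clock."""
    return cas_cycles * 1000.0 / clock_mhz


if __name__ == "__main__":
    for label, cas, mhz in [
        ("PC-100 CAS 2", 2, 100),
        ("PC-133 CAS 3", 3, 133),
        ("PC-133 CAS 2", 2, 133),
    ]:
        print(f"{label}: {cas_latency_ns(cas, mhz):.1f} ns")
```

On these numbers, PC-100 at CAS 2 and PC-133 at CAS 3 ask roughly the same absolute response time of the chips (about 20 ns versus about 22.6 ns), while PC-133 at CAS 2 demands about 15 ns, which is where better-graded parts earn their price.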

  15. never had bad memory by Anonymous Coward · · Score: 0

    I've built five machines (for myself and friends) and I've never had a bad stick of RAM. Repeat - NEVER. That could be because I usually just buy from Crucial and they're a premium brand.

    Plus I always run with surge suppressors on all power lines and equipment (including the phone line that carries the DSL signal), and a UPS on my main computer. This must have surely helped. I actually know of one case where a DSL modem was fried by an electric clothes drying machine with a ground fault (my friend was always kicked offline and the modem reset whenever he did laundry - it wasn't until it permanently failed and had to get it replaced that he figured out what the hell was going on).

    The oddest memory issue I've ever had was when I was upgrading my old dual Pentium 233 about five years ago - it would only boot with the RAM in a specific configuration - swap the sticks around and it worked (smaller sticks in lower numbered banks), switch em back (larger sticks in the lower numbered banks) and it refuses to boot. Once I got that memory working, it's still going strong to this day (the machine is now running FreeBSD). I believe it had something to do with slightly different memory timings, but proves that sometimes you just need to try things just a little differently to get them to work.

  16. Bad engineering, bad commerce. by dpbsmith · · Score: 4, Insightful

    Why does the computer industry tolerate this sort of thing? When it was hobbyists tinkering with Northstars and Cromemcos and Sols it might have been understandable, but we should have grown up a long time ago.

    When you put oil into your car, you know that the oil companies and the car companies have gotten together with the American Petroleum Institute to set standards so that as long as your owner's manual says "API SG" and the oil you buy says "API SG" or better, that oil will work in your car. And you can use Mobil Oil to top up an engine filled with Quaker State without losing any sleep over whether their chemistry is compatible.

    You don't rely on friends' stories of whether Quaker State is better than Shell Oil. You know that regardless of the price of the oil, if it says API SG it meets API SG specs and if your car says API SG specs are good enough, they're good enough.

    It doesn't benefit anyone if your engine seizes up, and it doesn't benefit anyone if your computer crashes.

    It's simple, it's easy, millions of consumers who aren't chemical engineers buy engine oil every day without wrecking their cars.

    Why is it expecting too much for computer vendors to do the same?

    And, while we're at it, why don't all computers use parity-checked memory? This was standard on 100% of all computers before the micro age, and for some reason people started putting in non-parity memory to save money and asserting that "it works."

    And our computers crash a lot, and nobody knows why and nobody does anything about it and everyone just accepts that that's the way computers are...

    1. Re:Bad engineering, bad commerce. by wik · · Score: 1

      Part of the problem is that it takes forever to create and approve a standard. Technology such as memory is a lot more complex and changes a lot more frequently than motor oil. Any standard that you might get would be years behind the current technology, and thus, useless.

      --
      / \
      \ / ASCII ribbon campaign for peace
      x
      / \
    2. Re:Bad engineering, bad commerce. by moncyb · · Score: 1

      And our computers crash a lot, and nobody knows why and nobody does anything about it and everyone just accepts that that's the way computers are...

      I doubt your issues are with bad/incompatible RAM, or even hardware. With my experience, the vast majority of today's system/application crashes are due to highly buggy software. Mostly from one specific company.

      My system doesn't crash a lot, and I have a crappy buggy VIA MB. Yeah, the thing would lock up under heavy disk load, but I found out it was a bug with how their ISA interface works with sound cards. Now I use a PCI sound card. Problem gone. Sometimes a program will crash under hard 100% processor usage for an extended time (like 30 minutes). A better power supply, and tweaking the PCI BIOS settings in my kernel have mostly fixed that problem. If I use 'nice make -j 1', then I can get through a kernel compile. I raytraced a Povray animation yesterday, and no crash. My current computer can be considered punishment for buying cheap and used without checking things out. If I would have investigated, I probably would not have bought it.

      The only crashes I have had in the past entire month were a) Mozilla -- it's bloated and buggy, so kind of expected. b) mplayer -- screwed up the entire system because I forgot to set the -vo setting, and it chose the wrong one. The Mozilla problem is no worries. I use a decent operating system, so I don't have to think about rebooting. The mplayer problem was due to a mistake on my part--the program accesses video hardware directly. A screw up there will paralyze anything. I have since made a mplayer.conf file.

      The moral of the story is: don't use ultracheap crappy hardware, but the main cause of today's computer woes--crashes several times a day--is the fact many people insist upon using the crappiest, most poorly designed operating system with the crappiest, most poorly designed applications. From what I've seen, hardware is just as reliable or even more reliable than it was ten or twenty years ago, but software has gone the other direction.

    3. Re:Bad engineering, bad commerce. by Anonymous Coward · · Score: 0

      You are so correct. I mean JEDEC is only there to give the reps from the memory manufacturers some time off for a free conference. They don't do any standards or real work...

    4. Re:Bad engineering, bad commerce. by ecalkin · · Score: 1

      parity checked memory (8 bits data, one bit parity) was common when memory was not quite the quality that it is today. even back then people complained that ibm made things *more* likely to fail because of the extra bit.

      as memory became better, most errors were detected at post and the extra bit really wasn't needed.

      it not only makes the memory cheaper, but also the memory management stuff of the system board gets cheaper and simpler.

      eric

  17. When iquanas are at stake ... by abulafia · · Score: 0, Offtopic

    I have run into bad marketers a few times, I quit visioning with the cheap stuff and only deal with Zoink, Inc. - have had excellent luck with them.

    --
    I forget what 8 was for.
  18. No context, so take with grain of salt by Jerf · · Score: 1

    You didn't give us a lot of context, so please don't take this personally if you've already checked into this.

    If you keep buying a component that repeatedly fails, it's worth triple-checking to make sure that whatever you're plugging the component into is working. Have you taken known-good RAM and plugged it into the motherboard to make sure that works?

    There is an art to learning when not to do this, since sometimes the component you are plugging into is actively frying the pluggable component, but in general this is pretty safe and often necessary.

  19. Summary by chriso11 · · Score: 2, Informative

    Ok - to summarize

    1) whenever you buy a new stick of RAM, run memtest 86 on it for an hour or so. It can save you weeks of problems.

    2) Use a grounding strap. ESD damage is a serious problem, and especially in the winter months, can easily lead to zapped parts. In fact, use a strap whenever you open your box! I even have a roll-up ESD mat for serious surgery.

    I have actually had memory go bad in my PC right when I was using the PC: it was good one minute, then bad the next. I have a nice APC UPS working as a surge protector. The memory was some premium stuff too - Corsair XMS memory. I hadn't touched the inside of the box for a few weeks (hard to believe, huh?), and I was practicing with the 203 on America's Army, and I suddenly got a win2k BSOD (which has a lot more words, but is basically just as useless as the win98 BSOD). So:

    3) test your memory periodically - like every 6 months or so.

    4) Maybe your motherboard has some debris in the memory slot or a sliver of metal shorting some pins out.

    --
    No, I don't trust in god. He'll have to pay up front, like everybody else.
    1. Re:Summary by Spoing · · Score: 1
      Comments:

      Details on what I've seen and recommend are here.

      1. An hour long test only catches the most basic errors. You sleep, right? Run all tests and use that time to check it out. I've run memtest86 for a day and a half at times...and only toward the end does an error get detected -- sometimes not caused by the RAM but the memory subsystem (toss the board).

      2. See above.

      3. Yep. I only do it 1x a year or when something looks wrong or just strange. The more often the better.

      4. Yep. Along those lines, a gentle shake test and compressed air has saved me many times; "So that's where that screw went!". (Yes, compressed air can be a bad thing, though I've found that it is worth the risk.)

      --
      A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
    2. Re:Summary by mcgroarty · · Score: 1
      whenever you buy a new stick of RAM, run memtest 86 on it for an hour or so. It can save you weeks of problems.

      Hee. An hour isn't much with memtest86 unless you're testing a 128mb 800mhz RIMM. :-)

      Run it until ALL of the tests have completed at least once with the CPU cache disabled. Unfortunately, this does take quite a while. But give a day to it. At today's densities, RAM is pretty easy to damage, and it's nice to be sure.

    3. Re:Summary by chriso11 · · Score: 1

      I generally run the default selection of tests, and I can run a complete iteration in ~20 minutes, and I have 512MB of DDR on an Nforce2 MB. So I can almost get 3 complete tests done in an hour.

      When I've had bad memory, the first iteration always flagged it. Of course, a more complete overnight run would be better. But, in my experience the first 30 minutes finds 90% of all memory errors.

      --
      No, I don't trust in god. He'll have to pay up front, like everybody else.
  20. Bad sockets? by mcgroarty · · Score: 1
    Many memory sockets aren't made to spec, and end up not letting wider (Micron, Hitachi) memory sticks or narrower (PNY) memory sticks seat properly.

    Everyone + dog had problems with the ABit KG7 not letting Micron memory work, but working wonderfully well with cheap nobody-brand sticks. The suggested solution is usually "ebay your memory, remove the board and put it on a hard surface to install your sticks, or be prepared to flex the hell out of your board and pray nothing breaks."

    1. Re:Bad sockets? by kidlinux · · Score: 2, Funny

      I have two Abit KG7-raid motherboards, using Crucial registered pc2100 ddr memory, no problems at all.

      You're right about the flexing though - only if you've not installed the motherboard correctly. I did this once. There's a screw mount near the ram slots that I overlooked, and if it's not there the motherboard will flex right down to the case's motherboard mounting plane. This being my first experience with an ATX power supply (ie: ones that aren't actually off when the computer is shut down, and have a manual power switch at the back), I didn't manually turn off the power or even unplug the cord (duh!). So when I pressed the ram in, my mobo flexed and shorted out on the case - huge sparks and all sorts of wonderful language and a dead motherboard (ram was perfectly fine though!) Lucky for me I managed to get the mobo replaced on warranty :)

      Anyway, moral of the story is - install the motherboard properly and you won't have this "flexing" problem. Look for the screw mounting hole near the ram slots and make sure you put a mounting peg in there.

      --
      -kidlinux.
    2. Re:Bad sockets? by Anonymous Coward · · Score: 0
      Anyway, moral of the story is - install the motherboard properly and you won't have this "flexing" problem. Look for the screw mounting hole near the ram slots and make sure you put a mounting peg in there.

      Actually the moral of the story is the warranty replacement got you a late revision ABit board which had the hole added and the clips changed out. If one of your KG7-RAID boards is the earlier revision, you'll notice a darker color plastic. If the other has a screw hole near the RAM, then it's late revision too. :-)

  21. memtest86 not a good test by Anonymous Coward · · Score: 0

    I have a number of sticks of memory that will pass memtest86 running all night just fine. The test that uncovers the problem is to compile the linux kernel, then gcc, then glibc. If it makes it through those, then the memory is good.

    To do this, before I install an os on the machine, but when I have all the parts, I boot from knoppix, make an ext2 partition on the drive, and download the tarballs and config/make away. You can't make install of course. But the test is just to see if the compilation finishes.

    Most of the computers I am doing this on are older boards using PC100 memory. I find that a lot of the PC100 memory out there fails this test, but will pass it if you jumper the memory bus to 95 MHz instead of 100.

    Fry's uses memtest86 to test memory brought back to them before they put the sticker back on it and shove it back on the shelf like it was new. This should be a good indication to you (if you have ever used a lot of Fry's cheapest RAM) that memtest86 is not strenuous enough.
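
    The idea behind the compile test can be sketched in Python: run a deterministic, memory-heavy job repeatedly and flag any run whose result differs. This is a stand-in, not the kernel/gcc/glibc build described above, and it is far lighter than a real compile. The function names workload and soak, the buffer size, and the choice of zlib plus SHA-256 are all assumptions made for illustration.

```python
import hashlib
import zlib


def workload(seed: int, size: int = 1 << 20) -> str:
    """Build a deterministic buffer, round-trip it through zlib, hash the result."""
    block = hashlib.sha256(str(seed).encode()).digest()   # 32 bytes
    data = block * (size // len(block))
    roundtrip = zlib.decompress(zlib.compress(data))
    return hashlib.sha256(roundtrip).hexdigest()


def soak(rounds: int, seed: int = 0) -> list:
    """Repeat the workload and return the indexes of rounds that disagree
    with the first result. On healthy hardware the list stays empty."""
    reference = workload(seed)
    mismatches = []
    for n in range(rounds):
        if workload(seed) != reference:
            mismatches.append(n)
    return mismatches


result = soak(rounds=3)
```

    A non-empty result only says that something in the machine is inconsistent. As with the compile test, it does not say whether the RAM, the board, or the CPU is at fault.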

    1. Re:memtest86 not a good test by mcgroarty · · Score: 1
      I have a number of sticks of memory that will pass memtest86 running all night just fine. The test that uncovers the problem is to compile the linux kernel, then gcc, then glibc. If it makes it through those, then the memory is good.

      Erm, there are actually a few programs out there called "memtest86." I promise you that if you're using this one and that stick fails consistently (erratic RAM failure is pretty much unheard of outside of unusual operating environments), memtest86 will find the problem.

      This is the same program that Crucial uses to determine whether they should resell memory that's been sent back as bad.

  22. tHrashing, not trashing by Anonymous Coward · · Score: 0
    1. Re:tHrashing, not trashing by Anonymous Coward · · Score: 0

      No, it's called "setting the present bit to zero in the LRU page's PTE, and migrating said page out of core on to a persistent block-oriented storage device", not thrashing.

      Damn. Bunch of Philistines in here.

    2. Re:tHrashing, not trashing by darqchild · · Score: 1

      Thrashing and Trashing may be used synonymously here.

      --
      What? Me? Worry?
  23. Challenge: Gentoo by Anonymous Coward · · Score: 0

    I double fucking dare you to kick off a gentoo install on one of those boxes. It won't make it.

  24. Common sense... by Phleg · · Score: 1

    If you're going to be needing registered RAM for the system, for god's sakes don't buy low-quality parts. Getting a Chaintech motherboard with registered DDR RAM is kind of like buying a souped-up Geo Metro. It's not a good idea.

    --
    No comment.
  25. Never happened to me by Descartes · · Score: 2, Funny

    I don't buy ram very frequently but I have never run into bad ram, and I always buy the cheapest I can find.

    My one encounter with "bad" ram was in a computer hardware class I took a few years ago. Two other classmates and myself were usually given special tasks by the professor because the class was so stupidly easy for us. One day, after we finished our two hour lab in fifteen minutes he gave us a stack of 8meg simms (this was a while ago) to test with some software he had. We tested about six and every time one or all of them came up as bad. Being 18 year old computer geeks we decided we could keep the bad ram for keychains. The next week he told us to try the test with known good ram, and it turned out that the program was faulty, not the ram. My fully functional $30 keychain has since fallen apart but I sometimes wonder if he ever counted the ram in the closet at the end of the semester.

  26. Motherboard? by MrResistor · · Score: 2, Insightful

    At no point did you say that you've verified that your motherboard is good. If you keep swapping out RAM and all of it seems to be bad, I've got news for you: it's not the RAM that's bad.

    --
    Under capitalism man exploits man. Under communism it's the other way around.
    1. Re:Motherboard? by Anonymous Coward · · Score: 1, Insightful

      In short, don't assume that "Memtest86" errors mean you have bad RAM. It could well be a defective motherboard, or more rarely, a bad CPU.

    2. Re:Motherboard? by MrResistor · · Score: 1

      Absolutely correct. Let me add, though, that Memtest86 is still an excellent diagnostic tool, just be prepared to swap out some hardware in the process.

      --
      Under capitalism man exploits man. Under communism it's the other way around.
  27. With cheap RAM, very common by duffbeer703 · · Score: 1

    There is a reason it's cheap.

    Expensive name brand stuff is usually perfect. No-name computer show memory often has all sorts of flaws that redundancy in the chip takes care of.

    --
    Conformity is the jailer of freedom and enemy of growth. -JFK
  28. Chaintech by droyad · · Score: 1

    Chaintech 7KDD

    There's your problem right away. Shitty motherboard manufacturer. But if you're going with ECC RAM, at least get a respectable motherboard like Asus or Intel.

    Intel has a list of tested and approved RAM on its web site and it will guarantee that they work together.

    The cheap motherboards and RAM are cheap because they have not gone through rigorous testing to make sure that they are ok. It's the same with brand vs no-brand stuff. Often the same manufacturer makes the components, but the branded stuff (ie Kingston vs OEM) is tested, and that costs a bit extra.

    Buy cheap crap (I know you paid a lot for it, but it was ECC) and you will get burned eventually.

  29. Bad experience here too. by mnmn · · Score: 1

    I assemble low-cost computers for customers using the ECS Duron Motherboard combos. Some days ago they started releasing ECS motherboards capable of taking DDR memory, so I asked the guy how much for upgrading the memory from SDRAM to DDR, he said same price..

    The place was Sonnam Computers Toronto, College and Spadina, one of the lowest cost places I've known in Toronto... So I got the chips, plug in and it works fine.. I install Windows 98, utilities, antivirus etc, works perfect.. now as I am playing Unreal tournament to test the stability, it crashes.. the memory was two sticks of 128MB DDR..

    I reboot, now the BIOS registers 128MB ram. Got into windows, shows 128.. I power up unreal again and it crashes.. Reboot. Now shows 64!! . Booted into Knoppix and ran memtest.. shows 64. Some time later this became 32. I took it back to the place, they thought it's the power supply, so we changed both the power and the ram chips. Same thing happens next day. Next I just exchanged the DDR for plain old SDRAM and it works perfect. The ECS board has Sis730 chipset, which from tomshardware anandtech etc seems robust enough. I suppose they just don't make DDR memory the way memory is supposed to be made.

    --
    "Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
    1. Re:Bad experience here too. by Spoing · · Score: 1

      Sounds like a defect in the system board, maybe the BIOS. Memory shouldn't vanish, even if bad.

      --
      A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
    2. Re:Bad experience here too. by EvilStein · · Score: 1

      I've heard of that happening in the earlier versions of the ECS K7S5A board.. later revs were ok.

      I have quite a few of those boards. Haven't had one single problem with them. :)

  30. Bad Chips? by JM+Apocalypse · · Score: 1

    On my server, I have some bad RAM. But, it is highly odd how it is reacting. The server fails to show 512MB of RAM, but instead is missing 32 MB. Don't ask me where it went, but it works. And that's all that matters.

    Is it possible for only one chip to fail? Of course, this was probably the cheapest possible RAM I could buy.

    --

    - - - - - - -
    Orppf urp mf y.ppcxn. yflcbi otcnnov C am yflcbi yr n.apb Ekrpatv (Dvorak -> Qwerty)
    1. Re:Bad Chips? by wmaddox · · Score: 1

      Do you have on-board video? If you have a UMA framebuffer, that's probably where the memory went.
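
      The arithmetic behind that guess, as a tiny Python sketch. The function names are hypothetical, and the 32 MB framebuffer size is simply the shortfall reported in the parent post, not a value read from any BIOS.

```python
def reported_mb(installed_mb: int, shared_video_mb: int = 0) -> int:
    """RAM the system reports once a shared (UMA) framebuffer is carved out."""
    return installed_mb - shared_video_mb


def implied_framebuffer_mb(installed_mb: int, reported: int) -> int:
    """Work backwards from the shortfall to the framebuffer size it implies."""
    return installed_mb - reported


seen = reported_mb(512, shared_video_mb=32)
shortfall = implied_framebuffer_mb(512, seen)
```

      If the shortfall matches the video memory size set in the BIOS, the "missing" RAM is most likely reserved rather than bad.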

    2. Re:Bad Chips? by hashwolf · · Score: 1

      Erm.... a year ago I had a similar problem. It turned out that it was my brain that was not working properly: If you choose 32Mb for AGP (from BIOS) that's where the missing 32Mb goes. Now I know better.

      --
      - "They misunderestimated me."
  31. BIOS limitation by moncyb · · Score: 1

    About your dual Pentium...it sounds like your BIOS. Until a few years ago, many BIOSs were not designed for all combinations of memory size. (When you say "larger sticks", I assume you mean larger memory capacity.) In those types of motherboards, your memory has to be placed in certain orders according to how much memory the stick contains. Some of them were so restrictive, they would only support the same sizes. If you had one 8MB stick, your only choice was to use another 8MB one. A 16MB+8MB combo wouldn't work.

    I'm sure if you look in your MB manual, you'll find a section telling you which combinations work...

    1. Re:BIOS limitation by Anonymous Coward · · Score: 0

      Actually, this was a limitation for FPM and EDO 72 Pin SIMMs on a Pentium North/Southbridge. The sticks were interleaved and installed in pairs, so a 16MB simm could not be installed in conjunction with an 8MB simm. A pair of 16's and a pair of 8's were legal, as long as they both were close to the same specs. All were FPM or EDO, all were 60 - 70 ns, etc.

      Then came the DIMM. I don't remember those having to be installed in pairs, but on some systems it could possibly have been a requirement for the North/Southbridge chipsets, although unlikely as DIMMS use a completely different addressing / refresh than SIMMS.
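
      The pairing rule for 72-pin SIMMs can be sketched as a small check in Python. This is an illustration only: the function name valid_simm_layout is made up, the assumption that consecutive slots (0 and 1, 2 and 3, ...) form a bank is mine, and it checks sizes only, not the FPM/EDO type or speed matching mentioned above.

```python
def valid_simm_layout(sizes_mb):
    """Return True if every bank holds a matched pair of SIMMs.

    sizes_mb lists module sizes in slot order; slots (0, 1), (2, 3), ...
    are assumed to form one bank each.
    """
    if len(sizes_mb) % 2 != 0:
        return False  # an odd module leaves one bank half filled
    return all(
        sizes_mb[i] == sizes_mb[i + 1]
        for i in range(0, len(sizes_mb), 2)
    )


pair_of_16_and_pair_of_8 = valid_simm_layout([16, 16, 8, 8])
mixed_pair = valid_simm_layout([16, 8])
```

      The first layout is the legal case from the comment (a pair of 16s and a pair of 8s), and the second is the 16MB+8MB combination that would not work.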

    2. Re:BIOS limitation by TeddyR · · Score: 1

      Most high-end server boards require registered ECC memory that needs to be installed in pairs...

      an example of such a board is the Intel SCB2.

      [that board can go up to 6GB ram and has 6 sockets]

      --
      Time is on my side
  32. Crucial is boring...and I like it by Bourbon+Man · · Score: 2, Informative

    I'm responsible for 250+ PCs and a dozen servers. Over the last couple of years I have bought literally hundreds of sticks from Crucial. Never a single bad chip, never a compatibility issue, never any problem whatsoever. Period.

    1. Re:Crucial is boring...and I like it by Anonymous Coward · · Score: 0

      I must have complete bad luck then. I'm 0 for 1 with Crucial memory (unrecoverable ECC errors after about 6 months), and I'm something like 100 for 100 with Kingston.

  33. +3 interesting? by Inoshiro · · Score: 1

    This is as interesting as someone bragging they like to have sex bareback, with the only bad experience being some time they tried a condom and got crabs.

    There's no reason to mod dumb luck up, only to tell FueledByRamen to go buy lottery tickets.

    --
    Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
  34. Don't be a cheapskate by Anonymous Coward · · Score: 0

    That will solve many people's RAM issues. All too often people buy the cheapest crap RAM they can find. They get what they pay for. Never buy the housebrand. All too often you don't get what was advertised. Stick with a quality namebrand with a lifetime guarantee. Personally I like Kingston, Crucial, Corsair, Infineon, Zeus (Kingston really?), Viking, Kensington (for the specialty stuff they make), and NewerRAM (excellent Mac RAM). I don't trust anything else. Oh, and whatever you do, DO NOT buy your RAM (or anything else for that matter) from OCSystem. You absolutely WILL NOT get what you paid for. Their reviews over at resellerratings.com tell many sad stories. Beware and don't be a cheapskate.

  35. Bad RAM by FuzzyDaddy · · Score: 1
    My first job out of school was for an EPROM manufacturer. I was a product engineer. Part of my job was to diagnose failures that came back from the field.

    Now, all of the easy diagnoses (bad memory cell, row or column) were handled by QA. The really bizarre failures came to me. Glitches that occurred only after the chip was in standby for more than four seconds, errors that only occurred if the addresses were accessed in a certain order, not to mention marginal voltage and temperature performance...

    There are an infinite number of bizarre and subtle ways a memory chip can misbehave. It was a fun job, but I can never look at any sort of chip in the same way again.

    --
    It's not wasting time, I'm educating myself.
  36. Comment removed by account_deleted · · Score: 2, Informative

    Comment removed based on user account deletion

  37. Depends the lot and where you get it by xtal · · Score: 1

    10 years of computer buying, modern anyhow. Not a single cheap ass stick of generic memory has ever given me a problem - I've had more problems with brand name memory that's been from questionable sources. A good distributor should not sell you defective memory off the shelf.

    When in doubt, test.

    --
    ..don't panic
  38. ESD strips are not required by xtal · · Score: 2, Interesting

    You can take precautions without an ESD strip. Unless I'm working on raw chips, or very, very expensive pieces of equipment, it's not worth the hassle. Nobody is going to bother, so here are some much easier to follow words of wisdom:

    ESD advice for system self installers:

    - Ground yourself to the metal chassis or something comparable on your system before you start assembling things. Do this frequently.

    - Leave things in the anti-static bags until you're ready to put them together.

    - Don't handle ram chips by the pins. Handle the modules by the edges of the package. If you can't get them into the system like this, then move the system so you can. This advice is good for motherboards, hard drives, system cards - handle them by their edges only, not on the pins or where the socket connector is.

    - Never handle a cpu by the pins. Ever.

    Taking those basic precautions will get you a long life and few problems without the hassle of wondering where an ESD strap is. Memtest is your friend, and use a good power supply. If you need to ask, odds are you don't have one.

    --
    ..don't panic
  39. CMOS Electronics Primer - ESD Damage by BigBlockMopar · · Score: 2, Informative

    Essentially the high voltage of the ESD (ESD like when you shock yourself on a doorknob is very high voltage, it's just very low current) is destructive to the transistor junctions, but it usually doesn't cause immediate complete failure. A few days, months, or even years down the road, the junction will prematurely break down, having had a shortened lifespan because of the ESD damage.

    Indeed.

    Memory chips - and most other components within any computer less than fifteen years old - use CMOS logic. CMOS stands for "Complementary Metal Oxide Semiconductor", which essentially means that they're full of MOSFETs ("Metal Oxide Semiconductor Field Effect Transistor"). This includes almost all processors, support logic, etc. In fact, the only exception which comes to mind is the really old computers which had the big banks of 74xx-series TTL logic all over the place, like in an XT. But keep in mind that the processor itself - and many other components - will be CMOS.

    The neat thing about Field Effect Transistors is that the electric field created by applying a gate voltage turns on the source-drain circuit. There is essentially no current required to drive the gate. The fact that there is theoretically no gate current means that you can do things like power 20 million transistors off a single 200W AT power supply, or build a wristwatch which runs for 5 years off the same tiny little battery.

    The strength of the "field effect" depends on distance. To a first approximation, the field across the gate insulator is the gate voltage divided by the insulator's thickness, so the thicker the insulator, the more voltage you need to achieve the same field inside the source-drain junction. Naturally, in order to be able to work at the low voltages inside a computer, the distance therefore must be tiny.

    This tiny distance is filled with a layer of what is, essentially, glass. And it's so thin that it can have a hole blasted through it by 30 volts.

    Now, air doesn't ionize until about 3kV per millimeter. That means, to jump a 1mm gap, you need about 3,000 volts, which you perceive as a tiny static electric spark.

    You will never see, nor feel, a 30V static electric charge. You can build it up just by sitting in your chair. And that's enough to blow a MOSFET transistor.
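
    The numbers above work out as follows, in a short Python sketch. Both constants are the rule-of-thumb figures quoted in this comment, not measured values, and the linear volts-per-millimeter scaling is a rough approximation that does not hold for very small gaps. The function names are made up for illustration.

```python
AIR_BREAKDOWN_V_PER_MM = 3000.0   # "about 3kV per millimeter", from the comment
GATE_DAMAGE_VOLTS = 30.0          # damage threshold quoted in the comment


def volts_to_jump(gap_mm: float) -> float:
    """Approximate voltage needed for a spark to cross an air gap."""
    return gap_mm * AIR_BREAKDOWN_V_PER_MM


def margin(gap_mm: float) -> float:
    """How many times larger a visible spark's voltage is than the
    quoted gate damage threshold."""
    return volts_to_jump(gap_mm) / GATE_DAMAGE_VOLTS


spark_1mm = volts_to_jump(1.0)
ratio = margin(1.0)
```

    By these figures, a spark you can just perceive carries roughly a hundred times the voltage said to be enough to damage a gate, which is why damage can happen without anything being seen or felt.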

    If a RAM chip has a million MOSFETs (modern ones have a lot more!) and you blow one of them, your chip is still well over 99.999% fine... until you try to read back data from the address with the blown MOSFET. And then you get one bit of garbage.

    The data in RAM is corrupt. What if it's executable? Does the machine crash? Probably. What if it's a JPG? Maybe one pixel on that 1024x768 pr0n image you downloaded is one shade of skin-tone different than it should be.
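
    A small Python sketch of both points: how little of the chip is bad, and what one wrong bit does to a stored value. The bad cell is modelled as a flipped bit, which is a simplification (a damaged cell might instead read as stuck at 0 or 1). The pixel values and the function names flip_bit and healthy_fraction are hypothetical.

```python
def flip_bit(data: bytes, byte_index: int, bit: int) -> bytes:
    """Return a copy of data with one bit inverted, modelling one bad cell."""
    corrupted = bytearray(data)
    corrupted[byte_index] ^= 1 << bit
    return bytes(corrupted)


def healthy_fraction(total_cells: int, bad_cells: int = 1) -> float:
    """Fraction of cells that still work."""
    return 1 - bad_cells / total_cells


pixel = bytes([200, 150, 130])       # hypothetical RGB skin-tone pixel
low_bit = flip_bit(pixel, 0, 0)      # least significant bit of red
high_bit = flip_bit(pixel, 0, 7)     # most significant bit of red
fraction = healthy_fraction(1_000_000)
```

    Flipping the low bit changes red from 200 to 201, the "one shade different" case. Flipping the high bit changes it to 72, which would be visible. In executable code, either flip can turn one instruction or address into another, which is where the crashes come from.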

    A lot of ESD failures show up as intermittent crashes and other software problems. Before you reinstall your operating system because it's getting crufty, consider your hardware... well, unless you're running Windows.

    ALWAYS wear a wrist strap. It's a bummer, but them's the dice.

    --
    Fire and Meat. Yummy.
  40. Mushkin is the Fry's Blue RAM by Anonymous Coward · · Score: 0

    Price has come down some, but is still the premium RAM along with Corsair at Fry's.

    There actually is a reason why HP, SUN, etc will charge $1300 for 512MB of RAM - guaranteed interoperability. I've bought lotso cheep Fry's RAM and they often don't all play nice together between different speeds, sizes, geometries? etc.

    Not something you want to mess with on a production server.

  41. Crucial, Corsair, etc. by Anonymous+Brave+Guy · · Score: 1

    Crucial aren't perfect -- we've had the odd dodgy thing from them at work -- but their service is excellent and their quality is generally high, so I'd recommend them without hesitation. They're well worth the small price premium over no-name brands without the same quality, compatibility guarantee, etc, IMHO.

    There are other makes, Corsair for example, which are claimed to have even higher specs. You do pay a lot for that tiny bit extra, though. Unless you're heavily into overclocking (in which case I'm sorry, but you have to accept any problems you get) it's hardly worth spending nearly twice as much to get kit like Corsair rather than a good, solid bet like Crucial.

    --
    If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
  42. Just went thru this, lost a year! by AkkarAnadyr · · Score: 1

    I wanted a Mandrake system for self-ed/SOHO use, so got an ASUS Duron all-onboard mobo, cheap 512MB SDRAM, and a series of 4hr stints on weekends to work on it (w/2 kids, s'all you get, home's). W98 would install and BSOD, Mandrake install wouldn't even start - memtest86 duly barfed, and I exchanged it.

    Second stick survived a couple of 4-6hr runs on memtest86 w/no problems. W98 installed, BSOD'd little more than usual, but I considered it secondary anyway. Mandrake install now would run, but kept dying mysteriously during unpacking (CDROM read problems). Swapped CDROM - still choked at some point before network cfg, died on text screen w/hex spew from kernel.

    This has now taken a couple of months of Mandrake half-installs, 2-3 hours per week. New FDD didn't help. New HDD didn't help. Caved eventually and bought an ECS Athlon mobo, figuring I was out of the woods with (virtually) all new components - after all, the RAM had been replaced and tested good, right? Just to be sure, I reran memtest86 on the new board - all OK after multiple runs in 6 hours.

    Mandrake 8.0 STILL dying late in the install, or not starting properly if it did finish after an install section would barf. OBSD 2.8 wouldn't install either. Weekends now gave way to other duties, like the yard; I'm an embedded systems guy, I don't build stock boxes all day, and at some point the novelty lost its shine.

    Eventually a friend mentioned that he'd traced these kinds of gremlins to faulty RAM, so I looked for the top grade stuff and dropped a new stick in. Bingo - I could finally get KDE and start trying to remember what the fsck I wanted the box for in the first place, and just in time for the Mandrake Club membership that came w/the PowerPack to expire. Xemacs won't quite run yet; the holidays and tax time intervened, and Solaris at work doesn't have urpmi to learn.

    Moral: Science works, memtest86 has limits, and when all possibilities have been exhausted, whatever remains, however unlikely, must be the solution. Buy top-drawer RAM, your time is worth more than the extra nickels.

    --

    I bought this house and you know I'm boss
    Ain't no h'aint gonna run me off