Slashdot Mirror


Firms Get Away with Selling Untested DRAM

peppytech75 writes "Melanie Hollands in IT Manager's Journal reports that 'In recent months, some Asian DRAM memory manufacturers have been getting away with selling untested ("UTT") DRAMs. Disturbingly, the practice seems to be getting traction at the lower portion of the module business. This is being done mostly by Taiwanese DRAM makers, who are undercutting the tier-1 guys by selling untested and unmarked parts.' What's the solution here? Or is there an actual solution to what amounts to pirate companies issuing counterfeit parts?" (IT Manager's Journal, like Slashdot, is part of OSTG.)

12 of 344 comments

  1. If you're stuck with one of these... by Anonymous Coward · · Score: 5, Informative

    Here is the obligatory Memtest86 post. It's a great program, and chances are you already have a copy on your Linux install CD, depending on the distro. There are even kernel patches that let you avoid the bad bits if they are isolated enough.

    1. Re:If you're stuck with one of these... by Anonymous Coward · · Score: 5, Informative

      Although Memtest86 is absolutely great for detecting memory errors, I prefer Memtest86+.

      It's a more updated version of Memtest86 (which was last updated in November!), from the x86-secret team. It'll do the same thing, just that it will identify all the new procs and chipsets better.

      http://www.memtest.org/

      PS: I find if the RAM has any errors, the Modulo-20 test will nail them. Methinks it's test number 11 in Memtest86+.
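
      For anyone wondering what a modulo-20 style pass actually does, here is a rough Python sketch. It assumes the usual description of the test (write a pattern to every 20th location, fill everything else with the complement, then check the pattern survived) and runs against a simulated memory object, not physical RAM. `StuckBitMemory` and `modulo_test` are made-up names for illustration, not anything from Memtest86+ itself.

```python
STRIDE = 20  # the "20" in modulo-20


class StuckBitMemory:
    """Simulated byte-wide memory; optionally one cell has a bit stuck at 0."""

    def __init__(self, size, stuck_index=None, stuck_mask=0x02):
        self._cells = bytearray(size)
        self._stuck_index = stuck_index
        self._stuck_mask = stuck_mask

    def __len__(self):
        return len(self._cells)

    def __setitem__(self, index, value):
        self._cells[index] = value

    def __getitem__(self, index):
        value = self._cells[index]
        if index == self._stuck_index:
            value &= ~self._stuck_mask & 0xFF  # the faulty bit always reads 0
        return value


def modulo_test(mem, pattern=0xAA, stride=STRIDE):
    """Return the indices where the pattern did not survive."""
    complement = pattern ^ 0xFF
    bad = []
    for offset in range(stride):
        # Write the pattern to every stride-th location starting at offset.
        for i in range(offset, len(mem), stride):
            mem[i] = pattern
        # Fill all other locations with the complement.
        for i in range(len(mem)):
            if i % stride != offset:
                mem[i] = complement
        # Check that the pattern locations were not disturbed.
        for i in range(offset, len(mem), stride):
            if mem[i] != pattern:
                bad.append(i)
    return bad


if __name__ == "__main__":
    print(modulo_test(StuckBitMemory(100)))
    print(modulo_test(StuckBitMemory(100, stuck_index=37)))
```

      A single pattern only catches bits that are stuck the "wrong" way for that pattern, which is one reason a real tester cycles through several patterns and passes.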

    2. Re:If you're stuck with one of these... by zakezuke · · Score: 4, Informative

      I'm curious why Linux has issues with this... I had bad RAM for a while and didn't even know it running windows. It installed, and ran just fine for weeks. Installed Linux, and Redhat wouldn't even finish the install.. suse installed but then crashed at random times... etc.

      Was windows just getting lucky, or what?


      Are you sure it's a RAM issue? I found Redhat and other distros hard to install when I had my old HP 2x burner, but when I upgraded to my DVD burner the problems mostly disappeared. It was as if the drives I was using didn't like the discs I burned, yet Windows had no problem whatsoever. I could install from my backup discs and never saw so much as an error making images, so the evidence suggests the burner made solid discs. To this day it remains a mystery to me: those discs still have the same problem, but if I copy the files to a HD from the very same discs, no problem.

      Another example: I thought I had a bad batch of RAM. It tested bad, with random reboots after being on for a while and crashes during CPU/memory-intensive tasks. It drove me absolutely batty until I swapped out the motherboard and the problems disappeared. When I put a lower-speed chip in the original board, the problems also disappeared. I can only assume from this evidence that the board in question didn't like running at 166MHz, even though both boards are based on the same chipset, save for the smaller north bridge heat sink.

      --
      There is no sanctuary. There is no sanctuary. SHUT UP! There is no shut up. There is no shut up.
    3. Re:If you're stuck with one of these... by gravygraphics · · Score: 5, Informative

      Memtest86 finds really bad ram, not good ram. Without having knowledge of how each chip is internally arranged, access to the chip's test modes and the ability to control the temperature, there is no way to finish testing a modern DRAM in our lifetime.

      Take the internal layout, for example. If you had a 512M chip and didn't know which cells were adjacent, you would have to write a single bit and then read every other word. We are talking x cells * y reads (*2 for writes). If you read 8 I/Os in parallel (remember, I am talking about a chip, not a module), then we have 512M cells * (512M/8) * 2 = 7.2*10^16, or 72 megagiga operations. Assuming you can sustain about 200MHz worth of useful reads/writes (remember, most addresses aren't in the same page), then we are talking something like 11 years... for a single test that doesn't cover refresh or voltage/temperature margining.
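
      If you want to check that estimate, here is a quick back-of-the-envelope sketch in Python. It assumes "512M" means 512 * 2^20 one-bit cells and takes the 200MHz figure as a flat sustained access rate; the function names are just for illustration.

```python
CELLS = 512 * 2**20   # 512 Mbit chip, one bit per cell (assumed binary "M")
IO_WIDTH = 8          # bits read in parallel per access
RATE_HZ = 200e6       # assumed sustained useful accesses per second


def exhaustive_ops(cells: int = CELLS, io_width: int = IO_WIDTH) -> int:
    """Operations needed to write each cell once and read back the whole chip."""
    reads_per_cell = cells // io_width   # words to read after each single-bit write
    return cells * reads_per_cell * 2    # *2 to account for the writes


def years_at(ops: int, rate_hz: float = RATE_HZ) -> float:
    """Wall-clock years to perform `ops` accesses at `rate_hz`."""
    seconds = ops / rate_hz
    return seconds / (365.25 * 24 * 3600)


if __name__ == "__main__":
    ops = exhaustive_ops()
    print(f"{ops:.2e} operations, about {years_at(ops):.1f} years")
```

      The count works out to exactly 2^56 accesses, which is where the 7.2*10^16 figure comes from.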

      Oh, one more thing. You can't really be sure whether, when you write a 1, the device stores it as a high charge or a low charge. Without knowing this, you will have to redo that same pattern a BUNCH of times.

      Memtest86 is like a pilot walkaround on a plane. It can spot obvious things, but I sure hope I'm not the first one to fire up that jet engine.

  2. Lotsa cheap ram! by imroy · · Score: 3, Informative

    Solution:

    1. Compile a Linux kernel with the BadRAM patch.
    2. Run Memtest86+ to get a list of bad areas.
    3. Profit!... erm, I mean a Linux system with lots of cheap ram!
  3. Re:Why do people buy cheap ram? by chiph · · Score: 3, Informative

    Same here. I used to buy whatever was cheapest, but after the time that a series of flakey bugs was solved by switching to good quality DRAM, I'll never go back. I probably spent two days troubleshooting it, which at my hourly rate, is many times the amount I "saved" by buying cheap memory.

    Blatant promotion: I've never had a bad stick from Crucial

    Chip H.

  4. Re:unmarked and untested == pirated? by yknott · · Score: 5, Informative

    If you RTFA, the author was saying that these unmarked and untested DRAM chips can later be marked as if they came from a Tier 1 manufacturer. These chips can then be sold for a premium, yet still less than the Tier 1 price. In that case unmarked and untested = pirated.

  5. Re:Why do people buy cheap ram? by vadim_t · · Score: 4, Informative

    I just buy ECC RAM.

    Sure it's more expensive, but it's great. If the computer does something strange I know that I can check /proc/ram or /proc/mc/0, see the statistics and instantly find if the memory is seeing errors or not. Here I do see a corrected error or two sometimes, although very infrequently. But it's indeed very nice to know it's been corrected.

    However, even if it's ECC I still wouldn't like at all knowing that it's not been tested. ECC has limits to the corrections it can make, after all.
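
    To illustrate what "limits" means here, below is a toy Python sketch using a Hamming(7,4) code. This is NOT what an ECC DIMM actually runs (real modules typically use a wider code with an extra parity bit so that double-bit errors are at least detected); it is just the smallest example I know of that shows one flipped bit being fixed and two flipped bits slipping through as wrong data. All function names are made up for the example.

```python
def encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def decode(codeword):
    """Return (data_bits, syndrome); a non-zero syndrome is the position that was flipped back."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1   # assume a single-bit error and correct it
    return [c[2], c[4], c[5], c[6]], syndrome


def flip(codeword, *indices):
    """Return a copy of the codeword with the given 0-based bit indices inverted."""
    c = list(codeword)
    for i in indices:
        c[i] ^= 1
    return c


if __name__ == "__main__":
    data = [1, 0, 1, 1]
    word = encode(data)
    print(decode(flip(word, 4)))     # one bad bit: data comes back intact
    print(decode(flip(word, 0, 1)))  # two bad bits: decoder "fixes" the wrong bit
```

    With two flipped bits the decoder still sees a plausible single-bit error, corrects the wrong position, and hands back bad data without complaint.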

  6. The solution is to test it yourself by gotan · · Score: 4, Informative

    Whenever we buy new RAM, mostly as part of new PCs, we run Memtest86. It's easy to do; it takes a while, so do it overnight. There's so much that can go wrong with RAM, even with "good" RAM: it might not work with the board, the SPD timings might be off, whatever. Every once in a while we find some RAM that doesn't work for us and return it to the shop. We have never had any problem getting it exchanged.

    For hardware sellers it's probably more expensive to factor in a certain return rate (and the overhead that goes with it), so they will see to it that the RAM they buy is OK. That way market forces work to the benefit of all of us: untested RAM will, in the end, be more expensive than tested RAM. It's much easier and cheaper to test RAM at the factory than to have it returned by millions of customers.

    Of course that doesn't work if you buy your PC in a supermarket, but even for cheap PCs it's better to configure them yourself than buying crap. That way you can specify exactly where to save money and if anything breaks you get it fixed much quicker.

    --
    "By the way if anyone here is in advertising or marketing... kill yourself." -- Bill Hicks
  7. Re:Why do people buy cheap ram? by vadim_t · · Score: 3, Informative

    The motherboard supports generating an interrupt when something happens. You can tell it to do that in case of a corrected error, uncorrected error, or never. I think Windows will BSOD when that happens, so I'd just set it to do it on an uncorrectable error. Then it will crash, but at least it will stop things before they mess up something important.

    On Linux you have the ecc-linux(2.4) and bluesmoke(2.6) kernel patches, which will give you a file in /proc you can monitor with detailed statistics about how many errors were corrected. IIRC, without specific support Linux will generate an oops, but continue if the board generates an interrupt. The patches can be told what to do in that case.

    I suppose there must be some software to get all the features on Windows too, but I don't know where to get it.

  8. Re:Why do people buy cheap ram? by jandrese · · Score: 4, Informative

    I was surprised at just how far companies like Kingston will go to honor their lifetime warranty. I worked for SGI a couple of years ago and was using an old, beat-up (8 years obsolete and it still performs decently!) Personal Iris 4D/35 when, after a power failure, it failed to boot, complaining about bad memory. So I pulled the thing apart and found that it has an enormous board with 16 SIMM-like slots. I pulled out the offending module and noticed 2 things:

    1. It is obviously some sort of custom memory module unlike any I had ever seen before, and hasn't been manufactured in years and years.
    2. It has a Kingston Memory sticker on the front.

    So I decided to see just how good the "lifetime warranty" is. Amazingly enough, they sent me an RMA label right away, and within days I had a brand new memory module and the system was back up and working perfectly! I was truly amazed that they were still willing to honor their agreement (I've had many bad "lifetime" warranties before where the "lifetime" is defined as 1 year or other BS) without complaint or hesitation.

    --
    I read the internet for the articles.
  9. Re:Also try Prime95 by gkitty · · Score: 5, Informative

    I agree that memtest86 is useful but not sufficient and that prime95 is much more thorough. Memtest confirms that patterns that have been set hold their state briefly, which is a good test against gross failures (and I have seen these).

    But Prime95 confirms that no bit anywhere in nearly the complete memory space ever spuriously changes. I have seen plenty of memory that passes memtest86 but fails prime95.

    Based on my experience, Corsair will replace memory that fails prime95. Mushkin will NOT (despite a "lifetime" warranty); they basically told me that memory can't be expected to be 100% perfect all the time and that prime95 was too strenuous; if it passes memtest86 there will be no replacement. My other modules (from Geil, Samsung, and a few old no-name sticks) have always been perfect. IMO it's unconscionable to sell untested ram given how hard it is to return.