What specifically are you trying to achieve? Do you know that (parts of) files you wish to recover are specifically stored in the blocks that are giving read errors? Or are you just trying to get a good copy of the whole disc? If the latter, then you might well be able to get away with using something like ddrescue which can ignore the bad sectors if they don't read correctly after a number of retries. If the former, then I imagine you'll need to look into whether the drive has an interface to the onboard controller (e.g. via RS232 like some Seagate models). As far as I can see, SpinRite is functionally equivalent to ddrescue with modern drives, but may be more useful for old RLL or MFM drives.
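For the whole-disc case, here is a rough sketch of the usual two-pass ddrescue approach, written as a small Python helper that only builds the command lines. The device name, image name and mapfile name are placeholders I've made up; `-n` (skip the scraping phase), `-d` (direct disc access) and `-r` (retry passes) are standard GNU ddrescue options, but check your version's manual before relying on them.

```python
import shlex


def ddrescue_commands(source, image, mapfile, retries=3):
    """Build two ddrescue command lines (nothing is executed here).

    Pass 1 copies everything that reads easily and skips the slow
    scraping of bad areas; pass 2 goes back over the bad areas with
    direct access and a limited number of retries.
    """
    first_pass = ["ddrescue", "-n", source, image, mapfile]
    retry_pass = ["ddrescue", "-d", f"-r{retries}", source, image, mapfile]
    return [first_pass, retry_pass]


if __name__ == "__main__":
    # /dev/sdX, disc.img and disc.map are hypothetical names.
    for argv in ddrescue_commands("/dev/sdX", "disc.img", "disc.map"):
        print(shlex.join(argv))
```

The mapfile is what lets the second pass resume where the first left off, so both passes need to be given the same one.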
The reason for the RMA was that I physically broke off the data connector myself (I was rammed by a 120lb dog while installing a new Raptor 150GB), and Seagate didn't have a problem with that whatsoever. They've been really good to me.
Nice to know Seagate's warranty covers Acts of Dog.
If the drive is properly-bricked (i.e. the problem that the current round of firmware updates is designed to fix - not showing up in the BIOS, etc), then it's necessary to connect to the RS232 interface and reset the drive's on-board controller in order to make it show up again. At that point, flashing the firmware is easy. I imagine he was suggesting, similarly, the log file is only accessible via the RS232 interface and not using the ATA command set.
I imagine Maxtorman is referring to connecting to the on-board controller using its RS232 interface. I don't think there's any practical way of making that a suitable end-user tool or process.
Many thanks, Maxtorman. Yours is the first useful information I've had out of Seagate so far, and is much more reassuring than the official KB articles and the 'support' I've received from most of the first line techs I've dealt with at Seagate. I only wish you could show this to your management and take credit for it. I hope that they have the sense to keep you and those like you through the coming upheaval.
Now, a few questions, if I may...
The wording on the 207931 KB article keeps changing; sometimes it's 'manufactured through December 2008', sometimes it's 'manufactured in December 2008'. Which is correct, and does the former mean 'all affected models of drives manufactured up to and including December 2008'?
Could the '320th log file entry' be a SMART self-test log entry, or are these just entries for reallocated blocks, read errors and so on? I ask, because I run a selective test of part of my discs every day.
On a more radical tangent, one of my month-old ST31000333AS drives has a 'High Fly Writes' SMART attribute of 19, the other is 64. Normally, I wouldn't worry about an attribute as long as it's above the threshold for that attribute, but I'm a bit concerned about the very large difference between the two drives. Is the first drive very much less healthy than the second?
That would be a violation of people's liberty, just like telling them they can't drive while drunk.
Speaking of which, that's something I notice quite a lot in US-made films and TV drama - people regularly driving after being at a bar drinking for (presumably) some time, and rarely is any comment made about them doing it ("Gremlins" is about the only example I can think of that did). Is this really fairly accepted practice in the US, or just artistic licence?
I've complained when sent HDDs that haven't been packed to make proper use of the specialist packaging the retailer uses. At the very least, it's on the record in case I end up with problems with the drive later and someone tries to use the 'you must have dropped it' excuse to prevent repairing or replacing it.
Oh, they'll work. Until they don't. Regular HDDs will attempt to perform extended recovery when they encounter a difficult-to-read block. TLER discs won't, as the rest of the array should be able to make up for it. What I'm not sure about is whether they return nothing, an error code, or garbage.
Or have you disabled TLER functionality in the RE drives you have in your desktops? If so, I'm not sure there's any benefit over the otherwise-equivalent 'Special Edition' or 'Caviar Black' models.
Having followed the 1.5TB NCQ/Cache Flush issue, it seemed to be necessary to boot from some kind of DOS device (floppy, CD-ROM, USB memory stick). That's not terrible, considering that it might well be difficult or impossible to flash the firmware of a drive that's in use, and have it come back without at least a warm reboot or reset.
I guess Mac owners might have a bit more trouble, unless they can boot DOS natively from EFI. Good job they're using x86 CPUs these days, mind...
I wouldn't particularly bless Samsung on the basis of their consumer electronics experience; some of their products appear to have design faults, for example their HT-DB120 home theatre system which seems to be widely known for spontaneously dying with a 'PROTECTION' front panel message. I had one of these systems and it died about a month outside of the two year warranty. Calling Samsung was an exercise in futility; they simply didn't want to know. To make matters worse, getting at the power amplifier section (which is where I believe the problem lies) would require a big disassembly job as heatsink screws are obscured by the heatsinks for other 5.1 amplifier channels.
it just becomes unresponsive for a while when sent a "FLUSH CACHE EXT" command. Not sure how long, but long enough to cause problems obviously (e.g. get kicked out of RAID arrays).
I have an SD17 firmware 1.5TB which I'm trying to return to the retailer for this reason.
If you're prepared to deal with Seagate tech support, I believe the SD1A (and greater) firmware revisions fix this particular problem.
There appear to be two separate problems here: the NCQ/FLUSH CACHE EXT problem you describe, which Seagate publicly acknowledge affects the 1.5TB models, and this new one, where according to another poster, the drive's controller gets locked into a BUSY state and disappears off the ATA bus, permanently (or at least until you connect to the on-board serial port and reset it).
Having had a Seagate fail on me recently, too, I've been pretty annoyed over the requirements. You want the thing covered in two inches of fucking rubber foam? Fucking ship your OEM drives like that, then, instead of bubble wrap, if it's so necessary.
The bubble wrap sounds like your retailer is doing a lousy job of packing the OEM drives they sell; the vendors I use send their OEM drives individually in fairly robust cardboard boxes with lots of foam. I believe the retailers get OEM drives wrapped in individual anti-static bags in a large polystyrene tray, and it's up to them to package them for retail. Of course, doing so with bubble wrap is almost certainly cheaper than buying in special-purpose packaging for the job. Try a better retailer.
According to the linked Seagate Knowledge Base article, this is a separate problem from the NCQ/CACHE FLUSH issue that Seagate publicly acknowledge affects the 1.5TB models. This new problem apparently affects lots of current models.
Like you, I avoided the 1.5TB models, but the 1TB Seagate drives I bought instead turn out to be affected by this new problem (hopefully I can get a pre-emptive fix from Seagate before they fail BSY). D'oh!
The other posters who describe problems with HDDs moving cyclically from brand-to-brand are correct. I have a 1993-vintage 120MB Maxtor SCSI drive in an Amiga sidecar that has stiction problems, and a 1995-vintage 1.275GB WD ATA drive from my first PC that has similar problems.
Some wags even suggest that there's only one truly competent drive design team in the industry, and they get sequentially headhunted en masse by a given manufacturer that's hit rock bottom and realises they need to do something to improve.
It's a shame that you have managed to return to Seagate just as they're having problems again - over the last few years, their products have been excellent IME, and with a long warranty too.
How about the Seagate 1500GB drive hang error? To my understanding Windows has been fixed, but the problem still persists in Linux.
The ST31500341AS requires a firmware update from Seagate to something newer than revision SD19. In the meantime, if you're using a drive which hasn't been updated to fixed firmware, there's a blacklist in the current development kernel to disable NCQ on affected models as a workaround.
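If you're on a kernel without the blacklist, a manual equivalent (not the blacklist itself) is to drop the drive's queue depth to 1, which as far as I know stops libata issuing NCQ commands to it. Here's a minimal sketch, assuming the usual sysfs layout of `/sys/block/<device>/device/queue_depth`; the helper names and the `sysfs_root` parameter are my own inventions for illustration, and writing to the real file needs root.

```python
from pathlib import Path


def queue_depth_path(device, sysfs_root="/sys"):
    """Location of the queue_depth attribute for a block device (e.g. 'sdb')."""
    return Path(sysfs_root) / "block" / device / "device" / "queue_depth"


def disable_ncq(device, sysfs_root="/sys"):
    """Set the queue depth to 1 and return the previous value.

    A queue depth of 1 means only one command is outstanding at a time,
    so the drive is no longer sent queued (NCQ) commands.
    """
    path = queue_depth_path(device, sysfs_root)
    previous = int(path.read_text().strip())
    path.write_text("1\n")
    return previous


if __name__ == "__main__":
    # 'sdX' is a placeholder; substitute the affected drive and run as root.
    print("previous queue depth:", disable_ncq("sdX"))
```

The setting doesn't survive a reboot, so it would need reapplying from a boot script until the firmware is updated.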
The Xbox 360 is certainly more prone to scratching than any other device I've ever had. I've never seen a scratch in a disc like the one it made. If Microsoft knew about it (they certainly know now!), I would hope they've fixed it in the current builds, because it's a serious design flaw.
No, it typically won't. You'll either need to recompile the custom driver (if it was provided in source form), recompile the source code wrapper (if it was provided as a binary blob with a wrapper), locate and install an updated driver for the new kernel revision (if it is only provided as a binary blob), or just not upgrade your kernel.
The vendor should probably be using DKMS; although what you describe will be happening behind the scenes, the user/admin shouldn't have to do it manually.
Your statements imply that Windows is Jersey/C/Worse and Linux is Lisp/MIT/Better (because the Linux camp keeps the architecture clean at the cost of downstream effort). I'm not sure if I am mis-reading your comments, or if your logic is reversed.
The attributes of the Jersey approach apply in these ways:
a) Simplicity (i.e. we won't support a dozen old APIs)
b) Correctness (i.e. if we designed an API badly in the past, we'll fix it, even if it breaks your binaries and you need to recompile them against the latest headers)
c) Consistency (i.e. the combination of a) and b) above)
d) Completeness (i.e. if we can support your driver we will, but not if it means compromising on any of the previous attributes).
Of course, like most of these concepts, you can interpret it how you like depending how you look at it (cf. the Project Triangle).
Nice to know Seagate's warranty covers Acts of Dog.
Thank you, I'll be here all week...
Thanks again, Maxtorman! All very useful stuff. I hope your answers to my questions are useful to others as well. :-)
Mergers and acquisitions, rifts between executives, poor economic climate, multiple restructurings - the usual.
Use hdparm or smartctl to determine the serial numbers and firmware revisions, and get in touch with Seagate.
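As a rough illustration of pulling those details out, here's a small sketch that parses the identity section printed by `smartctl -i`. It assumes the field labels smartmontools normally prints for ATA drives ("Device Model", "Serial Number", "Firmware Version"); the function names are mine, and the sample text (including the serial number) is made up for the example.

```python
import subprocess

# smartctl label -> short key used in the returned dictionary
FIELDS = {
    "Device Model": "model",
    "Serial Number": "serial",
    "Firmware Version": "firmware",
}


def parse_identity(info_text):
    """Extract model, serial and firmware from `smartctl -i` style output."""
    result = {}
    for line in info_text.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip() in FIELDS:
            result[FIELDS[label.strip()]] = value.strip()
    return result


def read_identity(device):
    """Run smartctl against a device (needs smartmontools, usually root)."""
    completed = subprocess.run(
        ["smartctl", "-i", device], capture_output=True, text=True
    )
    return parse_identity(completed.stdout)


if __name__ == "__main__":
    # Hypothetical sample output; a real drive will report its own values.
    sample = (
        "Device Model:     ST31000333AS\n"
        "Serial Number:    XXXXXXXX\n"
        "Firmware Version: SD15\n"
    )
    print(parse_identity(sample))
```

Keeping the parsing separate from the `smartctl` call means the same function works on output saved from another machine.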
RE drives are "unsuitable for desktop use", due to the TLER feature that is an advantage in RAID setups.
Gathering model and serial numbers from Linux can easily be done using hdparm or smartctl.
The URI for the ISO is in the page source.
I bet they haven't, but I also bet that it's another factor - besides the death of HD-DVD - behind the rumoured assembly of Blu-ray equipped 360s by Pegatron.