The Lies Disks and Their Drivers Tell
davecb writes "Pity the poor filesystem designer: they just want to know when their data is safe, but the disks and drivers try so hard to make I/O 'easy' that it ends up being stupidly hard. Marshall Kirk McKusick writes about the difficulties in making the systems work nicely together: 'In the real world, many of the drives targeted to the desktop market do not implement the NCQ specification. To ensure reliability, the system must either disable the write cache on the disk or issue a cache-flush request after every metadata update, log update (for journaling file systems), or fsync system call. Both of these techniques lead to noticeable performance degradation, so they are often disabled, putting file systems at risk if the power fails. Systems for which both speed and reliability are important should not use ATA disks. Rather, they should use drives that implement Fibre Channel, SCSI, or SATA with support for NCQ.'"
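For anyone who hasn't dealt with this directly, here is a minimal Python sketch of what "flush after every update" looks like from the application side. The function name `durable_write` and the temp-file-plus-rename pattern are my own illustration, not something from the article. Note that `os.fsync` only asks the kernel to push the data to the device; whether the drive's write cache actually commits it is exactly the part the article says you can't always trust.

```python
import os

def durable_write(path, data):
    """Write bytes to path and ask the OS to flush them to the device."""
    tmp = path + ".tmp"           # hypothetical temp-file naming
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()                 # push Python's buffer to the kernel
        os.fsync(f.fileno())      # ask the kernel to flush to the device
    os.replace(tmp, path)         # atomically swap the new file into place
    # Flush the directory entry as well (POSIX only; opening a directory
    # this way does not work on Windows).
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return path
```

Every call pays for two flushes, which is the performance cost the summary is complaining about.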
But you lost me the moment you mentioned ATA drives.
Cheap, fast and reliable.
Pick any two.
We're talking about ATA drives?
As in non-SATA drives?
Who has those anymore?
While the article is good for publication in an academic venue like an ACM magazine, it's useless for the real world.
For that, the author should tell us whether most drives on the market already have NCQ or not, including popular drives like the WD Green and Seagate's various lines.
Otherwise, saying "$A is useless without $Y" is pointless.
We shouldn't even be writing for ATA drives anymore. And any name-brand manufacturer that you would trust (on a mediocre level), such as WD or Seagate, supports NCQ.
Don't assume that "enterprise" disks do this correctly either.
Many have options to make them behave properly but out of the box have write back caches and ignore FUA or similar, leading to the same problems.
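If you want to see what your system thinks the drive is doing, here is a rough sketch, assuming Linux and a kernel recent enough to expose the `queue/write_cache` attribute under sysfs. The function name and the `sysfs_root` parameter are just for illustration. Keep in mind this reports what the kernel believes about the cache, not what the firmware actually does.

```python
import os

def write_cache_mode(device, sysfs_root="/sys/block"):
    """Return the kernel's view of a drive's cache mode, or None if unknown."""
    path = os.path.join(sysfs_root, device, "queue", "write_cache")
    try:
        with open(path) as f:
            # Typically "write back" or "write through"
            return f.read().strip()
    except OSError:
        # Attribute missing (older kernel, other OS) or device not present
        return None
```

Something like `write_cache_mode("sda")` (device name is an example) returning "write back" means the kernel expects to have to send flushes to get data onto stable storage.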
I never recommended ATA drives for servers. The really old stuff that used MFM and RLL drives dates from an era when there just wasn't anything else. I used ATA drives for my home stuff and lab, where things weren't expected to be very reliable, and SCSI was all I used for a very long time. Even today I recommend against SATA, though it seems tolerable; SCSI drives are still my standard.
Mostly I thought SCSI drives were also made better, but Seagate and WD convinced me otherwise.
And yes, MFM drives in a Novell DCB setup were among my first servers. Making NW 2.15c mount a 4 GB volume just so you can say you did it would not be fun today, but back then it was work, and clients paid for it. I'm glad it wasn't a VINES server.
2) The article's point on NCQ is that many consumer drives do not implement it, so the system has to either disable the write cache on the disk or issue cache-flush requests to stay reliable. Both cost performance, so they are often turned off, which can lead to file-system failures if there is a power outage.
I think this article is saying that for the enterprise, buy enterprise drives, not consumer drives. Most consumers use laptops now, so power failure is rarely a concern, and consumers prefer speed over reliability, which is why I've always been stuck using laptops lacking ECC RAM.
Native Command Queueing
Because not everybody knows everything™
The people who make hardware RAID know all about the lying drives; they get good information from the manufacturer on how to make the drives play nice with the RAID controller.
Just read the compatibility charts for your RAID controller: many drives have footnotes with minimum drive firmware requirements and other odd behavior.
I think this is quite interesting.
http://yarchive.net/comp/linux/drive_caches.html
While I've often gotten the impression that the write cache opens up a large "write hole", Linus says that data is cached only for milliseconds, not held in the cache for several seconds. Still, I'd like to see battery-backed caches in regular drives and/or controllers.
Would be nice to hear from some drive firmware writers.
The article is total crap: every disk supports NCQ, as half the world's population has pointed out in the comments.
The problems are elsewhere: When a disk suddenly loses power while it is writing, there is a risk of various interesting errors. The disk may a) write nulls instead of the correct data, b) write garbage instead of the correct data, c) fail in the middle of a Read-Modify-Write operation and therefore destroy data in files which weren't written to at all, d) write good data to the wrong place on the disk, e) write garbage to a random spot on the disk. Sometimes you are lucky and the errors result in bad hardware checksums, so you know you have lost data; at other times the wrong data gets the correct checksum.
In practice, very few desktop/notebook/whatever users will see these problems. No reviews test for these types of errors, so you cannot try to buy drives which fail in less harmful ways. If you care enough, you will use file systems with checksumming designed to catch all the above errors and more (Btrfs and ZFS come to mind). They will at least notify you that it happened, and depending on the redundancy settings they may be able to rescue the destroyed data.
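To make the checksumming idea concrete, here is a toy Python sketch. It is not how Btrfs or ZFS actually lay things out; the function names are made up, SHA-256 is just a convenient stand-in, and the checksum is assumed to be stored somewhere other than next to the data. Mixing the block number into the hash is what lets it catch case d), good data written to the wrong place.

```python
import hashlib

def seal(block_no, data):
    """Return a checksum covering both the block number and its contents."""
    h = hashlib.sha256()
    h.update(block_no.to_bytes(8, "little"))  # ties the data to its location
    h.update(data)
    return h.digest()

def verify(block_no, data, checksum):
    """True if data is what was sealed for this block number."""
    return seal(block_no, data) == checksum
```

On read, a mismatch tells you the block is bad; getting the data back still needs a redundant copy, as noted above.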
Given that we are talking about Kirk McKusick, an appeal to authority is entirely fair. Just because he didn't have a bunch of citations or references listed at the bottom of the article does not mean they do not exist somewhere. For you to say it is a "fallacious" appeal to authority is unfair: it has not been proven fallacious.
It's usually up to the one who makes a claim to back it up with evidence, not for others to disprove it. And they can't anyway, because the claim isn't falsifiable here. If I show that my drive has NCQ that works, that still doesn't falsify his claim. I can't bloody well test every drive on the planet, so there's no way to disprove him.
So yes, this is an appeal to authority, and what you are doing is putting the onus on those who disagree to prove a negative.
He may be right, and he's certainly renowned, but to jump from there to "therefore he is right" is bunk. Even Einstein and Feynman made wrong claims. No one is immune. So some evidence would be welcome.