Slashdot Mirror


Your Hard Drive Lies to You

fenderdb writes "Brad Fitzgerald of LiveJournal fame has written a utility and a quick article on how all hard drives from the consumer level to the highest level 'enterprise' grade SCSI and SATA drives do not obey the fsync() function. Manufacturers are blatantly sacrificing integrity in favor of scoring higher on 'pure speed' performance benchmarking."

10 of 512 comments (clear)

  1. Re:Err... "lying" is the default setting. RTFM. by Yokaze · · Score: 4, Insightful

    No. If you had no cache, there would be no need for a flush command. The flush command exists purely for the reason of flushing buffer and caches on the harddisc. The ATA-5 specifies the command as E7h (and as mandatory).

    The command is specified in practically in all storage interfaces for exactly the reason the author cited, integrity. Otherwise, you can't assure integrity without sacrificing a lot of performance.

    --
    "Between strong and weak, between rich and poor [...], it is freedom which oppresses and the law which sets free"
  2. Re:Why do we need it? by Erik+Hensema · · Score: 5, Insightful

    We need it because of journalling filesystems. A JFS needs to be sure the journal has been flushed out to disk (and resides safely on the platters) before continuing to write the actual (meta)data. Afterwards, it needs to be sure the (meta)data is written properly to disk in order to start writing the journal again.

    When both the journal and the data are in the write cache of the drive, the data on the platters is in an undefined state. Loss of power means filesystem corruption -- just the thing a JFS is supposed to avoid.

    Also, switching off the machine the regular way is a hazard. As an OS you simply don't know when you can safely signal the PSU to switch itself off.

    --

    This is your sig. There are thousands more, but this one is yours.

  3. Re:An acceptable alternative. by Sparr0 · · Score: 5, Insightful

    You have no grasp of what 'kilo', 'mega', and 'giga' mean. They have meant the same thing for 45 years, computers did not change that. There is a standard for binary powers, you simply refuse to use it.

  4. Re:Err... "lying" is the default setting. RTFM. by Basehart · · Score: 4, Insightful

    "I remember that MS had a fix for this (for laptops etc)... Which just made Windows wait a duration (~30s)..."

    This turned into the "my computer isn't doing what I want it to do, which is turn the F off" at which point the consumer simply reached down and yanked the power cord.

    Try writing a routine for this routine!

  5. Re:Why do we need it? by bgog · · Score: 4, Insightful

    The author is specifically talking about the fsync function not the ATA sync command. fsync is an OS call notifying the system to flush it's write caches to the physical device. This writes to the disks write cache but I don't believe it actually issues the sync command to the drive.

    In the case of a journaling file system they issue the sync command to the drive to flush the data out.

    I work on a block-level transactional system that requires blocks to be synced to the platters. There where two options, modify the kernel to issue syncs to the ata drives on all writes (to the the disk in question) or to just disable the physical write cache on the drive. Turned out to be a touch faster to just diable the cache but the two are effectivly equal.

    However drives operate fine under normal conditions, applications write to file systems which take care of forcing the disks to sync. fsync (which the author is talking about) is an OS command and not directly related to the disk sync command.

  6. Marketing created the 'confusion' by Dogtanian · · Score: 4, Insightful

    Actually, it is. The standard was updated in 1998 to avoid confusion. Having different name for different things can avoid an awful lot of confusion, so it would very much recommend using them.

    Which is more important? The de facto standard that slightly misuses the 'kilo-' prefix, but *everyone* knows what it means; or something that was forced into place by marketing?

    As I argued in more depth elsewhere, anyone who used computers *knew* what "kilobyte" and friends meant.

    There was no confusion, because only the 1024-byte definition was widely used.

    The 'need' to use the '1000 byte' definition was created by marketing, not computer people. THEY caused the confusion for their (short term) gain by exploiting the old meaning of 'kilobyte' to make their drives seem larger.

    Marketing do not give a flying **** about correctness or clarity; if there was any problem, *they* created it. Computer people knew what kilobyte meant.

    --
    "Slashdot - News and Chat Sites Deviant". (Click "homepage" link above for details).
    1. Re:Marketing created the 'confusion' by Crayon+Kid · · Score: 4, Insightful

      Marketing do not give a flying **** about correctness or clarity; if there was any problem, *they* created it. Computer people knew what kilobyte meant.

      I'm sure they took advantage of the blurry meanings for a while. But in the long run, you gotta admit the change makes sense, from a standardisation point of view. Every measuring unit uses kilo/mega/giga to mean powers of ten. Computer world was the odd one out, and it should rightly be labeled specifically.

      --
      i ate crayons when i was a kid and now i have two braincells and the blue ones taste nicer
    2. Re:Marketing created the 'confusion' by quantum+bit · · Score: 4, Insightful

      Every measuring unit uses kilo/mega/giga to mean powers of ten. Computer world was the odd one out, and it should rightly be labeled specifically.

      Oh, the computer world uses those prefixes to mean powers of 10 too. They just mean powers of 10 in base 2 math :)

    3. Re:Marketing created the 'confusion' by barawn · · Score: 4, Insightful
      As I argued in more depth elsewhere, anyone who used computers *knew* what "kilobyte" and friends meant.

      Except Ethernet card manufacturers, modem manufacturers, PCI card manufacturers... oh, hell, just about anyone who transfers something with a clock.

      10baseT ethernet transfers data at 10 Mbps. That means 10 x 10^6 bits per second. IDE buses running at 66 MHz list their theoretical maximum as 66 MB/s.

      kilo = 1024 is retarded. It only makes sense for things that have to scale in powers of two, like memory. For a long while, "data rate" meant "kilo=1000, mega=1000 kilo" wheras in storage, "kilo=1024". Talk about a recipe for disaster.

      Just as an example: here's an article describing Ultra320 SCSI, and PCI bus bandwidth:

      Under standard PCI the host bus has a maximum speed of 66 MHz. This allows for a maximum transfer rate of 533 MB/sec across a 64-bit PCI bus.


      66 2/3 MHz (M here means what? oh, right, 10^6) times 8 bytes is 533 1/3 MB/s. Where here, "M" means "1000*1000". In MiB/s, it'd be 508.6263 MiB/s.

      Is this a problem? Yes. I shouldn't have to pull out a freaking calculator to figure out how long it should take to dump 2 GB of RAM across a 2 GB/s link. It should be one second, not 1.0737418 seconds.

      Computer people knew what kilobyte meant.

      No we didn't. We've never used kilo consistently. See above - we've talked about CPU speeds in terms of kHz and MHz, meaning 10^3, 10^6, and talked about kilobits/second meaning 10^3 bits per second, talked about kilobytes/second meaning 10^3 bytes/second, and turned around and talked about file sizes where kilobyte means 1024 bytes.

      We've never been consistent. The IEC finally owned up to it and admitted it, and asked us to all finally stop being so damned sloppy, and I'm quite glad they did.
  7. Re:Why do we need it? by swmccracken · · Score: 5, Insightful

    This writes to the disks write cache but I don't believe it actually issues the sync command to the drive.

    Yeah - that's the point of this thing - what's supposed to happen with fsync? From memory, sometimes it will guarentee it's all the way to the platters, sometimes it will not, depending on what storage system you're using, and how easy such a guarentee is to make.

    Linus in 2001 discussing this issue - it's not new. That whole thread was about comparing SCSI against IDE drives, and it seemed that the IDE drives were either breaking the laws of physics, or lying, but the SCSI drives were being honest.

    From hazy memory, one problem is that without tagged-command-queing or native-command-queuing, one process issuing a sync will cause the hard drive and related software to wait until it has fully synched for all i/o "in flight"; holding up any other i/o tasks for other processes!

    That's why fsync often lies; because it's not pratical for people that fsync all the time to flush buffers to screw around with the whole i/o subsystem, and apparently some programs were overzealous with calling fsync when they shouldn't.

    However, with TCQ, commands that are synched overlap with other commands, so it's not that big a deal (other i/o tasks are not impacted any more than they would by other, unsynchronised, i/o). (Thus, with TCQ, fsync might go all the way to the platters, but without it it might just go to the IDE bus.) SCSI has had TCQ from day one, which is why a SCSI system is more likely to sync all the way than IDE.

    If I'm wrong, somebody correct me please.

    Brad's program certainly points out an issue - it should be possible for a database engine to write to disk and guarentee that it gets written; perhaps fsync() isn't good enough - be this fault in the drives, the IDE spec, IDE drivers or the OS.