Slashdot Mirror


Slashdot Asks: Do You Need To Properly Eject a USB Drive Before Yanking it Out? (daringfireball.net)

In a story earlier this week, Popular Science magazine explored an age-old topic: Do people need to safely eject a USB stick before they pull it from their computer? The magazine's take on it -- which is, as soon as any ongoing transfer of files is complete, it is safe to yank out the flash drive -- has unsurprisingly stirred a debate. Here's what the magazine wrote: But do you really need to eject a thumb drive the right way? Probably not. Just wait for it to finish copying your data, give it a few seconds, then yank. To be on the cautious side, be more conservative with external hard drives, especially the old ones that actually spin. That's not the official procedure, nor the most conservative approach. And in a worst-case scenario, you risk corrupting a file or -- even more unlikely -- the entire storage device. To justify its rationale, the magazine has cited a number of computer science professors. In the same story, however, a director of product marketing at SanDisk made a case for why people should probably safely eject the device. He said, "Failure to safely eject the drive may potentially damage the data due to processes happening in the system background that are unseen to the user."

John Gruber of DaringFireball (where we originally spotted the story), makes a case for why users should safely eject the device before pulling it out: This is terrible advice. It's akin to saying you probably don't need to wear a seat belt because it's unlikely anything bad will happen. Imagine a few dozen people saying they drive without a seat belt every day and nothing's ever gone wrong, so it must be OK. (The breakdown in this analogy is that with seat belts, you know instantly when you need to be wearing one. With USB drives, you might not discover for months or years that you've got a corrupt file that was only partially written to disk when you yanked the drive.)

I see a bunch of "just pull out the drive and not worry about it" Mac users on Twitter celebrating this article, and I don't get it. On the Mac you have to do something on screen when you eject a drive. Either you properly eject it before unplugging the drive -- one click in the Finder sidebar -- or you need to dismiss the alert you'll get about having removed a drive that wasn't properly ejected. Why not take the course of action that guarantees data integrity?
What are your thoughts on this? Do you think the answer varies across different file systems and operating systems?

19 of 521 comments

  1. Depends. by msauve · · Score: 5, Informative

    No mention of the OS or file system. Assuming Windows - there's a setting for "Quick Removal," which disables write caching and makes it so "you can disconnect the device safely", and another for "Better Performance," which doesn't and may cause grief.

    --
    "National Security is the chief cause of national insecurity." - Celine's First Law
    1. Re:Depends. by BenFranske · · Score: 4, Informative

      This is true. My recollection is also that somewhere along the line Microsoft changed the default in Windows. Traditionally in Windows all mass storage devices, think HDDs, had performance enhancing features such as caching turned on which can cause delayed writes while media like floppies had it turned off. The problem is that when USB 2 came out and USB mass storage became feasible people started unplugging USB drives as soon as the copy appeared to be finished even if the OS was really still writing to the drive in the background causing a potential for data corruption. In this era we were teaching everyone to eject USB drives before removing as that would force a clearing of the write cache before giving the OK to remove the drive.

      Somewhere along the line (maybe Windows Vista?) it became apparent that the clumsy drive eject mechanism in Windows, combined with users frequently forgetting to do it, and the increasing popularity of flash drives made this a usability issue. At that point Microsoft changed how Windows handles USB attached mass storage devices and disabled or modified the performance features to flush the write cache as quickly as possible and keep copy dialogs on screen until the files were actually fully copied. At the same time a lot of flash drive manufacturers started putting access indicator LEDs on the drives so you could tell if the drive was being accessed. After this most Windows users stopped ejecting drives before removal and save for an especially odd case there seems to have been little data corruption which can be traced back to not ejecting the drive.

    2. Re:Depends. by thegarbz · · Score: 5, Informative

      My recollection is also that somewhere along the line Microsoft changed the default in Windows.

      Yes, almost 17 years ago they changed that default. Windows hasn't enabled caching and the like on removable drives since the release of Windows XP.

      The only benefit you get of the safely remove feature is that windows won't let you remove the drive if it is actively being written to. But either way you will know instantly if you end up with a corrupted file if you don't safely remove as you'll get an error message for whatever program was writing.

      Mr Gruber's scenario of hidden corruption just isn't a thing.

    3. Re:Depends. by arth1 · · Score: 1, Informative

      They seem decent at warning that writes are ongoing on USB media

      A problem here is that the USB devices themselves are lying, in part because the manufacturers want to sell them as really fast devices, because the consumers look at that. So buffering is cranked up, and the controllers lie, horribly.
      When the drive tells the OS that yes, everything is committed, it may still be in the controller buffer and not really written. So if you yank out the drive immediately after the OS says it's all done, you can get errors or missing data. And that's not the fault of the OS.

    4. Re: Depends. by c6gunner · · Score: 3, Informative

      If by "very old" you mean Windows 7, and by "odd FS choice" you mean FAT32 and NTFS, then yeah, sure.

      I get that message all the time on the work computers because nobody ever bothers properly dismounting the damn flash drives.

  2. This isn't a debate by Murdoch5 · · Score: 4, Informative

    Depending on the write method in use from the OS, you either have transferred the full contents of what you wanted, or you haven't. In the latter case, ejecting will finish the write cycle; in the former, you're good to go. There is one other rare problem that can arise, and that's file-system corruption. You run the risk of it by just pulling your USB key out, although with any modern file system, such as EXT4, BTRFS, ZFS, even NTFS, the chances of seeing corruption are small.

    1. Re: This isn't a debate by tepples · · Score: 5, Informative

      What is "sync"? [1]

      Even if you do type sync before ejecting, that doesn't keep some other process from opening and writing another file on the volume in the seconds between when you type sync and when you actually pull the plug. To prevent the possibility of corruption, you need to unmount the volume (and possibly remount it read-only) before disconnecting the drive.

      [1] Rhetorical.

  3. Service Technician by Anonymous Coward · · Score: 3, Informative

    I repair computers and macs for a living and have been doing so for more than 20 years.

    Is it safe to yank a USB drive?
    No, but the severity varies by OS.

    On Linux or Mac? Somewhat, or at least safer than Windows, but I'd go ahead and unmount it anyway just to be safe.
    On Windows? No. You can do it, but you're taking a gamble every time that doing so will break the partition tables. You might be able to fix that and get the data back off of it... or you might not.

    Now, will I do it anyway, knowing the risk?
    Absolutely, and especially on windows. I keep a drive just for moving diagnostic tools or software to windows computers while inside the OS and occasionally, windows decides it doesn't want to unmount it, so I'll yank it anyway. But that's with the understanding that everything on that drive can go 'poof' any time I do it and everything on it is backed up.

    In short:
    If you value what's on that drive, you shouldn't... but you should have more than the one copy on the drive of whatever you're moving around anyway.
    If you don't have backups, it must not have been important.

  4. Depends by AlanObject · · Score: 5, Informative

    I used to write a lot of 8GB+ file system images to 16GB SanDisk devices using an Ubuntu 14.04 system. The caching it did was immense. The dd or the cp command finished in less than 60 seconds but when I did a umount command on the volume it would block for about 5 whole minutes or more while the cache emptied.

    These particular USB drives have a blue activity LED on them so it wasn't hard to figure out what was going on.

    In that use case yanking the USB would have been a big no no.

    1. Re:Depends by Anonymous Coward · · Score: 2, Informative

      I used to write a lot of 8GB+ file system images to 16GB SanDisk devices using an Ubuntu 14.04 system. The caching it did was immense. The dd or the cp command finished in less than 60 seconds but when I did a umount command on the volume it would block for about 5 whole minutes or more while the cache emptied.

      These particular USB drives have a blue activity LED on them so it wasn't hard to figure out what was going on.

      In that use case yanking the USB would have been a big no no.

      This is why dd has the argument "oflag=sync". I suppose you could also use "oflag=nocache", never tried it myself, but the man page suggests it has the same effect.

      Likewise it's easy enough to add "&& sync" or "&& sync && umount /your/device" to the end of a cp command. If you want to monitor it with a glance once in a while, try iotop.
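
      The same "copy, then flush" idea can be sketched in Python. This is only an illustration, not what cp or dd do internally: the function name copy_and_flush is made up for the example, and os.fsync() only asks the kernel to write out this one file. It cannot see inside the drive controller's own buffer and it does not unmount anything.

```python
import os
import shutil


def copy_and_flush(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path and return only after the data is flushed.

    Hypothetical helper for illustration. Returns the size of the copy in bytes.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
        dst.flush()             # drain Python's own userspace buffer
        os.fsync(dst.fileno())  # ask the kernel to write this file's dirty pages
    return os.path.getsize(dst_path)
```

      On a large copy to a slow stick, the fsync call is where the waiting would show up, rather than in a later umount.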

      I have also noticed that if I perform a copy with my GUI file manager the copy notification stays on until it's really completed. On a big copy a USB device will have bursts of speed with slowdowns in-between as blocks are overwritten (a flash drive's charge accumulator takes a bit of time). This is with KDE's Dolphin file manager, though I doubt it's unique.

  5. RAM caches by dissy · · Score: 5, Informative

    Depends how many places "confirmed" saved data can be cached in RAM.

    Spinning hard drives have their own cache, which isn't written to the actual disk until either the cache is "full enough" or you instruct it to do so.
    This data will be lost if you power the device down before the cache is flushed to disk, after the drive reported the data saved.

    Flash sort of depends, an SSD for example tends to do the same thing, but there are different ways it can go about it.

    Older SSDs can only write 4k blocks, it wasn't possible to write less data.
    So to write for example 300 bytes, the controller has to pull in a 4k block to its RAM, edit those 300 bytes, and write out the entire 4k block again.
    Pull power before this is done and your 300 bytes are gone.
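
    A toy model of that read-modify-write cycle, assuming the 4k block size mentioned above. The "flash" here is just a bytearray and the function name is invented for the example; real controllers add wear levelling and remapping that this ignores.

```python
BLOCK_SIZE = 4096  # the 4k write unit described above


def read_modify_write(flash, offset, payload, block_size=BLOCK_SIZE):
    """Apply a small write to `flash` (a bytearray) one whole block at a time.

    Returns the number of bytes physically rewritten, which is usually far
    more than len(payload).
    """
    rewritten = 0
    pos = offset
    remaining = bytes(payload)
    while remaining:
        block_start = (pos // block_size) * block_size
        # 1. Pull the whole block into "controller RAM".
        block = bytearray(flash[block_start:block_start + block_size])
        # 2. Edit only the bytes that changed.
        within = pos - block_start
        n = min(len(remaining), block_size - within)
        block[within:within + n] = remaining[:n]
        # 3. Write the entire block back. Losing power before this step
        #    completes is what loses the update.
        flash[block_start:block_start + block_size] = block
        rewritten += len(block)
        pos += n
        remaining = remaining[n:]
    return rewritten
```

    In this model, writing 300 bytes inside one block rewrites 4096 bytes, and a write that straddles a block boundary rewrites two blocks.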

    "Removable" USB flash drives, the better ones at least, tend to not report data as saved before it really is, just to help avoid this problem. There is little hope for a non-technical person to know what their particular flash drive is doing however, and not even obvious to technical people either.

    On top of that, your OS likely caches data to be written to any disk in RAM, completely independent of what the disk itself is doing.

    If one is absolutely certain how all of these caches function, and can be completely assured all data is written in a method that doesn't rely on the disk claiming it is or isn't, then in that case it would be safe to power down the device.

    For average and above average users, that will just not be true.
    Even for experts, outside of a small set of cases like highly customized and tuned systems, it may be true the expert knows what is going on, but would tell you from that knowledge it wouldn't be safe and is a silly risk to take, with a high cost of data loss in exchange for a couple of seconds of time saved.

    Hell, even I am still in the habit of issuing three 'sync' commands in a row before an unmount command, and that's despite the knowledge the unmount command will do a 'sync' call of its own!

    But since this advice is aimed at average or below-average users, bad as it is, it isn't the worst thing commonly done.
    Average or below people rarely even make backups, which has a far higher cost when (never IF) a drive fails. Corrupting a couple files on a single USB drive compared to not having any backups is like complaining your car only has 5 airbags instead of 6 while driving it off of a cliff...

    1. Re:RAM caches by dissy · · Score: 4, Informative

      Yes, by older I mean roughly 20 years ago, some of the initial nand devices on the market.

      These days a 128-256kB nand page size is typical.
      I also recall Intel at least saying something about 512kB on some of their higher end offerings, but don't remember what class of devices that was about or if it's one of those "coming soon" things or not.

      Fortunately even though the page size keeps going up, the flash write speeds also are going up, so it doesn't tend to take that much more total time to commit a write by the controller chip.

      The window of opportunity for data loss doesn't change a lot because of that, but the amount of data that can be lost if you yank power during that window can be greater.

  6. YES, eject first by thePsychologist · · Score: 5, Informative

    Ever think of testing this for yourself?

    It's easy. Create half a dozen or more files of random numbers or use existing large files. I created six files of a million random integers in Python. End result, six files of about 6.9MB each. Create an md5 checksum file when you make them.

    Copy them to a USB stick, and then yank the stick right when the light stops blinking. Plug the USB stick back in. Watch and learn. Easily reproducible phenomena are:
    * Not all files even appear when the stick is plugged back in
    * Some of the files might appear, but give I/O errors and won't even be complete
    * A few might pass the checksum integrity test

    I'm on Linux Mint, but I have seen the results on other OSes as well. The OS caches the data to be written sometime, presumably to speed up file operations. There's a reason why eject exists.
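
    A minimal sketch of that test, for anyone who wants to repeat it. The file names, the integer range (0 to 999,999, one per line) and the in-memory manifest are assumptions made for the example, not the exact script used above; with that range a million-line file comes out close to the size mentioned.

```python
import hashlib
import os
import random


def md5_of(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_test_files(directory, count=6, numbers_per_file=1_000_000, seed=None):
    """Write `count` files of random integers and return {name: md5}."""
    rng = random.Random(seed)
    manifest = {}
    for i in range(count):
        name = f"random_{i}.txt"  # hypothetical naming scheme
        path = os.path.join(directory, name)
        with open(path, "w") as f:
            for _ in range(numbers_per_file):
                f.write(f"{rng.randrange(1_000_000)}\n")
        manifest[name] = md5_of(path)
    return manifest


def verify(directory, manifest):
    """Return {name: 'ok' | 'missing' | 'unreadable' | 'mismatch'}."""
    results = {}
    for name, expected in manifest.items():
        path = os.path.join(directory, name)
        if not os.path.exists(path):
            results[name] = "missing"
            continue
        try:
            actual = md5_of(path)
        except OSError:  # e.g. an I/O error on a half-written file
            results[name] = "unreadable"
            continue
        results[name] = "ok" if actual == expected else "mismatch"
    return results
```

    Build the manifest on the local disk, copy the files to the stick, yank it, plug it back in, then run verify() against the stick's mount point. The three statuses other than "ok" correspond to the outcomes listed above.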

    --
    "What lies behind us, and what lies before us are tiny matters compared to what lies within us." Ralph Waldo Emerson
  7. Now Approved by major educational institutions! by Hognoxious · · Score: 3, Informative

    I don't have a clue, but I'm sure Wikipedia does!

    --
    Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  8. What are my thoughts on this? by QuietLagoon · · Score: 4, Informative
    Simple: wow, I am surprised how low the once great Popular Science has sunk. Oh, you mean about the USB stuff... well...

    ...The magazine's take on it -- which is, as soon as any ongoing transfer of files is complete, it is safe to yank out the flash drive ...

    The problem is that you do not know when the transfer is complete. The UI's representation of it shows when the UI is done with the transfer, but that does not necessarily mean that the OS is finished with the transfer. So Popular Science is correct in that you have to wait until the transfer is complete, they are just incorrect about what tell-tale to use to determine that status.
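
    On Linux, one rough tell-tale that does come from the OS rather than the UI is the pair of Dirty and Writeback counters in /proc/meminfo. The sketch below just parses them. The function names are invented for the example, and the counters are system-wide rather than per-drive, so a non-zero value does not prove the data belongs to the USB stick, and zero says nothing about the stick's own controller buffer.

```python
def pending_writeback_kb(meminfo_text):
    """Sum the Dirty and Writeback fields (in kB) from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if key in ("Dirty", "Writeback") and parts:
            fields[key] = int(parts[0])
    return fields.get("Dirty", 0) + fields.get("Writeback", 0)


def read_pending_writeback_kb(path="/proc/meminfo"):
    """Return pending writeback in kB, or None if the file is unavailable."""
    try:
        with open(path) as f:
            return pending_writeback_kb(f.read())
    except OSError:
        return None
```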

  9. Re:My thought by Anne+Thwacks · · Score: 4, Informative
    You might be using a file system called FAT - which stands for "File Allocation Table" and is only partially connected with overweight files and the concept of bloatware.

    The FAT is a table (bit map of sectors in use) which says which parts of the disk are in use, as well as where the files and their parts are stored. There are also flags saying if the file system (partition) is in use, and if it has been modified - and probably other things (I was last involved in this in about 1990).

    When you eject, the system stops making the changes you requested (reads and writes), and updates these tables to a consistent state (well maybe not all that consistent, if it's Windows, and completely inconsistent if DOS 4.0), and then clears the dirty (modified) and Open (in use) flags for each partition.

    If the eject procedure fails to complete, and the dirty flag is set when you next try to use it, the OS will try to clean up the mess - it may succeed, or trash the data, and report success, or it may fail and tell you the system is dead.

    With USB sticks, the data you see is not the real data. Because of the need to hide the bad blocks, and hideously long write time, the OS sees virtual data, and operates on virtual data. The real data is different. A single bit change (e.g. clearing the dirty bit) could end up requiring three or four copy and write operations, operating on massive amounts of logically unrelated data, unknown to the OS. If you trash those, the OS for the CPU in the USB stick will be utterly bamboozled, and you have no interface to tell it to do a factory reset, so your USB stick is dead.

    (Actually, the underlying interface to the USB stick is SCSI, and there probably is a SCSI command that would unbrick it, but you can be damned sure that the manufacturer won't tell you what it is, because if he did, you would not go out and buy another USB stick, would you?)

    --
    Sent from my ASR33 using ASCII
  10. Sync and unmount. by Ungrounded+Lightning · · Score: 4, Informative

    Copy them to a USB stick, and then yank the stick right when the light stops blinking.

    Two functionalities are key: Sync and unmount.
      - Sync forces the filesystem to complete all in-RAM updates to blocks, write the blocks to the driver, and forces the driver to flush all pending writes to the backing store. Issuing a "sync" kicks off activity that ensures the data for the current state is all on the backing store when the activity is completed. (And manually issuing two of them ensures the first one is completed before the second command returns.)
      - Unmount also tells the filesystem to get all files closed, go to such a state (including updating a flag saying it isn't mounted, if that is part of the filesystem) and write this all to the store.

    Unixes and Linux are usually configured to issue syncs periodically. So if you write new stuff to your filesystem, and your filesystem/driver combination isn't one that tries to always force the data to the store right away, you'll see the activity light go out when there's still important stuff in a last few buffers which AREN'T written yet. (For a new file, on some filesystems, that will include the metadata about the file, which got finalized when it got closed and thus is the last thing changed.) Pull the store now and important stuff about these new files is incomplete or missing, even if they've been closed (so the last buffer of data is complete and queued to be written).

    When this happens, if you wait around a short time, the periodic sync will force it out to the store. But it might be so few blocks that you don't see the activity light blink. Pull the store after that and it will have the file data and metadata for the new, and closed, files. But the filesystem image will still be in a "mounted" state, so remounting it will require at least some filesystem scan (very short for journaling filesystems) and may generate a gripe from the OS.

    Unmount (then sync;sync, though unmount seems to do that for you these days) and you guarantee that the backing store is clean and ready to go. Eject does that for you (or gripes if it can't because there are still things in use by live apps) before it actually ejects the storage medium or tells you you can safely remove it.
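
    The sync-then-unmount sequence can be sketched in Python for a Unix-like system. This is an illustration under assumptions, not what a desktop's eject button actually runs: the function name and its `run` and `sync` parameters are made up for the example, and the mount point is whatever your system uses. umount will refuse with a "busy" error if a live app still has something open, which is the gripe described above.

```python
import os
import subprocess


def sync_and_unmount(mount_point, run=subprocess.run, sync=None):
    """Flush pending writes, then unmount; return True if umount succeeded.

    `run` and `sync` are injectable so the sequence can be exercised
    without touching a real device.
    """
    if sync is None:
        sync = getattr(os, "sync", None)  # os.sync() exists on Unix only
    if sync is not None:
        sync()  # push the OS write cache out to the backing store
    # Unmounting closes out the filesystem state and marks it clean.
    result = run(["umount", mount_point], capture_output=True, text=True)
    return result.returncode == 0
```

    A call would look like sync_and_unmount("/mnt/usb"), where /mnt/usb is a hypothetical mount point; only pull the stick if it returns True.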

    --
    Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
  11. Re: yes, by Anonymous Coward · · Score: 2, Informative

    That's just not true, there is a configurable timeout for this, 30 seconds by default.
    See vm.dirty_expire_centisecs.
    Some distros set a lower value or even mount removable drives sync, but that does cost performance.
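
    To check the value on a given Linux box, the sysctl is readable at /proc/sys/vm/dirty_expire_centisecs. The value is in hundredths of a second, so 3000 is the 30 seconds mentioned above. A small sketch, with function names invented for the example:

```python
def centisecs_to_seconds(raw):
    """Convert the raw sysctl text (hundredths of a second) to seconds."""
    return int(raw.strip()) / 100


def dirty_expire_seconds(path="/proc/sys/vm/dirty_expire_centisecs"):
    """Return the dirty-page expiry in seconds, or None if unavailable."""
    try:
        with open(path) as f:
            return centisecs_to_seconds(f.read())
    except OSError:
        return None
```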

  12. Re: yes, by c6gunner · · Score: 4, Informative

    The drive records if it was properly ejected or not.

    No, it doesn't. The filesystem on the drive might, though, depending on which filesystem it's formatted with.

    This used to be an issue with many Linux distributions and NTFS in the past. If you tried to mount an NTFS filesystem which wasn't cleanly dismounted, it would throw an error at you and fail. For other filesystems it didn't matter.