Slashdot Mirror


Replacing Atime With Relatime in the Kernel

eldavojohn writes "Our friend Jeremy at KernelTrap has dug up some interesting criticism of atime from Linus Torvalds. As Linus submitted patches to improve relatime he noted: 'I cannot over-emphasize how much of a deal it is in practice. Atime updates are by far the biggest IO performance deficiency that Linux has today. Getting rid of atime updates would give us more everyday Linux performance than all the pagecache speedups of the past 10 years, _combined_.' And later severely beat atime about the head with a pointed stick: 'It's also perhaps the most stupid Unix design idea of all times. Unix is really nice and well done, but think about this a bit: "For every file that is read from the disk, lets do a ... write to the disk! And, for every file that is already cached and which we read from the cache ... do a write to the disk!"' Well, I guess I can expect my Linux machine to become a little bit faster!"

27 of 416 comments

  1. Personally by Nikron · · Score: 5, Interesting

    After I mounted my system with nodiratime and noatime, I did not 'feel' any actual speed increase. I didn't do any hard testing, of course.

    --
    Disclaimer: Disregard the above post.
    1. Re:Personally by Anonymous Coward · · Score: 3, Interesting

      What's more, any performance-sensitive partition is already mounted with the noatime and nodiratime flags. It's only ever going to be a performance issue for those who neglect to turn it off.

      atime is useful for deleting files from working directories; I use it on my laptop's graphic design partition. Anything untouched for 2 months is deleted by a shell script to free up space. The backups and archives are also mounted with atime; its usefulness to me in this role far outweighs the performance penalty.
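The cleanup job described above is easy to picture as code. What follows is only a rough sketch in Python rather than the poster's actual shell script; the function name `stale_files` and the 60-day threshold are assumptions made for illustration. It lists candidates instead of deleting them.

```python
import os
import time

SECONDS_PER_DAY = 86400


def stale_files(root, max_age_days, now=None):
    """Return the files under root whose atime is older than max_age_days.

    Hypothetical helper: it only reports candidates, it never deletes.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat judges a symlink by its own timestamps, not its target's
            if os.lstat(path).st_atime < cutoff:
                stale.append(path)
    return sorted(stale)
```

Something like `stale_files("/path/to/work", 60)` would give the list to review before removing anything. On a noatime mount the result is meaningless, since atime stops advancing on reads.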

    2. Re:Personally by WNight · · Score: 2, Interesting

      I checked what was mounted and found an external 500GB drive on USB without noatime, so I decided to try a test.

      I ran a 'find . > /dev/null' on the external drive. There were 250k files totalling roughly 450GB on ReiserFS.

      The first run took 50s, then it quickly stabilized between 1.5s and 6.5s, mostly around 4s. The cache obviously made a huge difference.

      Then I remounted with noatime and reran the test. It was very consistently at just under 0.7s.

      So, between 1.08 times faster and 6.4 times, depending on whether the reads are already cached.

  2. Why not just fix the filesystem instead? by vox_soli · · Score: 2, Interesting

    Why should atime updates have to be written out to disk immediately? It probably isn't the end of the world if a few get lost when a filesystem doesn't get unmounted cleanly, and atime probably updates a *lot* more frequently than anything else in the inode. So why not have the filesystem keep the atimes separately from the rest of the metadata? It would only take a little space to hold all the atimes on most filesystems: 4 bytes per atime times, say, 250,000 files, plus 5% for indexing overhead (you'd have to map inode numbers to indices into the array of atimes), is just a little over a megabyte. If you set that aside somewhere, cached a copy in memory, and wrote out updates whenever there was some free bandwidth to the disk, you'd be able to merge updates for many different files together. That beats writing out an entire block for every atime update, which currently has to go out immediately because it counts as an inode update and it'd be bad to let those fall out of sync.
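The space estimate above checks out. A quick sketch of the arithmetic in Python, using only the figures from the comment (4 bytes per atime, 250,000 files, 5% indexing overhead); the function name is made up for illustration:

```python
ATIME_BYTES = 4            # one 32-bit timestamp per file
FILE_COUNT = 250_000       # example file count from the comment
OVERHEAD_PERCENT = 5       # indexing overhead from the comment


def atime_table_bytes(files=FILE_COUNT, per_entry=ATIME_BYTES,
                      overhead_percent=OVERHEAD_PERCENT):
    """Size in bytes of a separate atime table, using integer arithmetic."""
    return files * per_entry * (100 + overhead_percent) // 100


total = atime_table_bytes()
print(total)               # bytes
print(total / 2**20)       # the same figure in MiB
```

That comes to 1,050,000 bytes, which is a shade over one megabyte whether you count in powers of ten or powers of two.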

  3. Re:Ummm.. by AKAImBatman · · Score: 3, Interesting

    Yeah, I finally read the article. I thought I remembered that Linux could turn off atime, but I wasn't going to commit to it until I was sure.

    FWIW, the Relative Access Time (relatime) patch simply doesn't update the access time unless the file has been modified since the last atime write. That allows ancient applications like MUTT to still synchronize on various files, which does not work with noatime set.

    Of course, I have to question why they're still using something as ancient as MUTT. A nice event system would be 1000x more efficient than trying to synchronize on flat files stored in your home directory on the file system. Of course, that would require designing OSes beyond the standard UNIX/POSIX philosophy and design. So I doubt we'll see that in Linux any time soon.

  4. Re:Why not a better "atime" instead of "noatime" by Tacvek · · Score: 2, Interesting

    I see no reason why atime updates can't be postponed until some moment other metadata has to be flushed, or once a minute, whatever comes first.
    The exactness of atime might suffer, but nobody will notice.
    That said, I agree the noatime mount option covers most needs. Almost nothing actually uses atime. The Relatime patch therefore only updates atime if mtime or ctime is newer (to show that the file has been accessed since it has been updated or created) or the current atime is over 1 day old. This increases IO usage only slightly compared to noatime, but is less likely to break those few applications that do use atime.
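The rule described above fits in a few lines. This is a sketch in Python of the decision as the comment states it, not the kernel's code; the function name is hypothetical, and treating equal timestamps as "needs an update" (`>=` rather than `>`) is an assumption.

```python
SECONDS_PER_DAY = 86400


def relatime_should_update(atime, mtime, ctime, now, max_age=SECONDS_PER_DAY):
    """Decide whether a read should write a new atime under relatime.

    All arguments are timestamps in seconds.
    """
    # The file was modified or changed since it was last marked as read
    # (assumption: equal timestamps count as "not yet read").
    if mtime >= atime or ctime >= atime:
        return True
    # The stored atime is at least max_age old, so refresh it anyway.
    if now - atime >= max_age:
        return True
    return False
```

A file that is read over and over costs at most one atime write per day under this rule, while "has it been read since it last changed?" still gets the right answer.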
    --
    Stylish sheet to fix many problems in Slashdot's D3: https://gist.github.com/801524
  5. if you store them up you could lose them all by Anonymous Coward · · Score: 1, Interesting

    if you lose power you're in trouble because now you have all those pending (how many?) atimes that you just lost. and you don't know how many and for what files so if you rely on atime for something then you suddenly no longer can for any of those files (but which ones? - you have no way of knowing)..

  6. Re:Probably overblown by Yokaze · · Score: 4, Interesting
    No, it is not overblown, because, as Alan Cox put it:

    Ext3 currently is a standards compliant file system. Turn off atime and its very non standards
    compliant, turn to relatime and its not standards compliant but nobody will break (which is good)

    It is not an option for the kernel to make noatime the default, as it would break POSIX compliance. But without noatime, the OS suffers a large penalty compared to some other OSs. The magnitude of the penalty has been made clear in the quote from the article.
    So a solution is being sought that is POSIX compliant but doesn't suffer as much of a penalty as the current one.
    --
    "Between strong and weak, between rich and poor [...], it is freedom which oppresses and the law which sets free"
  7. Re:atime vs ctime by clem · · Score: 2, Interesting

    Well, add enough features and a filesystem begins to look like a source control system. I don't see any advantage in having file creation times tracked by the kernel -- better to do it in user space.

    --
    Your courageous and selfless spelling corrections have made me a better person.
  8. Re:Why not a better "atime" instead of "noatime" by Anonymous Coward · · Score: 1, Interesting

    You must be new here. Anyway, with sentences like "Getting rid of atime updates would give us more everyday Linux performance than all the pagecache speedups of the past 10 years", the story gives a pretty orthogonal impression to the article (which is not about improvements nearly as big as alluded to in the story, but about improvements to relatime, which is an existing option to avoid the overhead of atime with some of the benefits of atime).

  9. Re:atime vs ctime by EvanED · · Score: 5, Interesting

    There is a technical reason for this.

    A lot of the time, modification of a file... isn't a modification of a file. Instead, the program will delete the existing file and create a new one in its place. (There are sometimes other operations in there, like saving to a temp file, deleting the original, then renaming the temp file to the original file name.)
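To see why the "same" file is really a new one, here is a small Python sketch of the temp-file-and-rename variant of that save pattern. The helper name `save_via_rename` is made up for illustration, and real editors differ in the details.

```python
import os
import tempfile


def save_via_rename(path, data):
    """Write data to a temp file next to path, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # Swap the new file in under the old name. The old file's inode,
        # and any creation time stored with it, is simply dropped.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

Comparing `os.stat(path).st_ino` before and after a call shows a different inode number behind the same name, so a creation time kept in the inode would reset on every save.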

    This means that storing the real creation time of a file means that it won't be what you expect, because the file that you think is the same file actually isn't.

    (MS-DOS/Windows have something called filesystem tunneling to attempt to get around this problem. If a file is deleted and a new one created in its place (see the MSDN article linked to from there for details) within a default 15 seconds, the creation time of the old file is carried over. This technique exchanges purity and absolute correctness (not that metadata times are reliable against tampering anyway) for utility.)

  10. Re:Ummm.. by mce · · Score: 2, Interesting

    It's good for finding files that are not being accessed anymore (doh). This may not be a big issue on a personal desktop, but in a corporate network with tons of centrally installed tools - and even more versions thereof - it very much is. I have made extensive use of it in the past for exactly that purpose.

    It's also good for automatically getting rid of the many defunct temporary files people tend to leave behind in shared directories such as /tmp. Again think multi-user machines.

    I've also used it as a debugging feature. Both to debug software and to debug wetware (i.e. brains of people who are convinced they are telling their computer to do one thing while actually telling it something else).

  11. Re:atime vs ctime by EvanED · · Score: 5, Interesting

    I know self-replies are stupid, but I should have mentioned something else. The metadata tunneling that Windows does is much more important than it is on Unix because the filename may need to be tunneled as well. If you open a file called "somefile with a long name.txt" in an old DOS program by opening SOMEFI~1.TXT (or even in a recent program) and it does this delete/create thing, you don't want the OS to say "oh, you're making a new file called SOMEFI~1.TXT. Spiffy"; you need the original "somefile with a long name.txt" name to carry over.

    The linked site explains all this, but I know the propensity of /. readers (myself included) not to RTFA. ;-)

  12. Is this the reason why we can't spin down disks? by grims · · Score: 4, Interesting

    One of my gripes with Linux is that one cannot spin down the disks to lessen their wear and tear.
    I've been told that the kernel constantly needs to access the disk...

    Is this the reason, or does something else prevent the disks from spinning down?

  13. Re:Probably overblown by Maltheus · · Score: 3, Interesting

    For the last few years, however, most Linux distributions have discouraged having multiple partitions

    Really? I can't see why. In the past, I didn't really see the point of multiple partitions but with the choice of filesystems now, they make a huge difference. Putting my /usr/portage (Gentoo) dir in a separate ReiserFS partition makes system updates a lot faster. And dealing with large video files on anything less than XFS is a slow PITA, especially when trying to delete the associated transcoding temp files. Maybe these things don't affect the average user much, but computers are never fast enough for me and I'll take what I can get.

  14. Re:there was no "noatime" option before? by DrXym · · Score: 2, Interesting
    In the various BSD flavors you can mount volumes "noatime", which is generally safe and does a pretty good job of keeping things moving. If you really need atime updates you can always remount the volume, but frankly not many people use it from what I've seen (maybe tail -f?).

    I wish Unix (including Linux) would be more proactive about culling options which are not used by the majority and have such a detrimental effect. If the disk performance leaps by 10% by not using atime, it seems a no-brainer that noatime should be the default. If some app uses atime, it should be fixed or deprecated. There is no real excuse for dragging the whole system's performance for the sake of some crummy mail app.

  15. Re:Ummm.. by Cheesey · · Score: 4, Interesting

    Of course, I have to question why they're still using something as ancient as MUTT.

    The thing I really *really* like about Mutt is that it uses Unix mailbox files. These are not just human-readable, they can also be manipulated using tools like 'cat'. I periodically archive my working mail files into a backup directory by just concatenating the working files onto the archive files of the same name. The resulting archive mail files are still fully usable with Mutt, even though some are 100MB+ in size with mail going back to 1997. I could also use them with even older mail clients such as Pine, but that would be like using vi when you've got vim.

    When I initially began using Linux, I used the Netscape email software, got fed up with it, and tried a few other mail clients. But none of them came close to the flexibility of Mutt. They all used their own mailbox formats, which could not be archived in the way I just described. I suspect that this is still true. I'm not going to trust Opera, Sylpheed, Thunderbird or Evolution with my mail, because (a) I doubt present-day mail files will still work reliably in 10 years' time, (b) I can't easily migrate to another mail reader, and (c) I can't efficiently archive my email because the database files are not plain text.

    That's why I like Mutt, anyway.

    --
    >north
    You're an immobile computer, remember?
  16. Accounting is a nuisance in general by mi · · Score: 4, Interesting

    But sometimes you need it... Whether it is to project your savings or to figure out whether a particular file was read within the last year.

    My problem with atime is that it is not universal enough. For example, reading a file via mmap() or sending it directly to a socket via sendfile() (both methods widely used by web-servers) will not update its atime. The access-timestamp should be updated every time a file is opened for reading, rather than when a read() is issued on it...

    So when I wanted to report when my little piece of software was last downloaded (via HTTP), I could not, unfortunately, rely on the file's atime...

    --
    In Soviet Washington the swamp drains you.
  17. Re:Ummm.. by Fweeky · · Score: 3, Interesting

    That allows ancient applications like MUTT to still synchronize on various files.

    Mutt uses atime to cheaply determine if there are new files in a mbox, without needing to keep external metadata or scan for flags; if atime < mtime, put a 'N' in the index. Given atime's required by POSIX, and it's a non-critical feature, I don't really see why this is a problem; it's not like it's the end of the world if it doesn't work. Maybe mutt could be explicit about wanting atime updated using utime(), or you could add a folder-hook to touch -a a mbox on open, assuming that still works on noatime mounts.
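For anyone who hasn't looked at it, the check described above amounts to one comparison. A rough Python sketch follows; Mutt itself is written in C, and both function names here are invented for illustration. The second helper is the explicit "touch -a" idea: set atime by hand and leave mtime alone.

```python
import os
import time


def mbox_has_new_mail(path):
    """True if the mailbox was written to after it was last read."""
    st = os.stat(path)
    return st.st_atime < st.st_mtime


def mark_mbox_read(path):
    """Set atime to now and keep mtime unchanged, like touch -a."""
    st = os.stat(path)
    os.utime(path, ns=(time.time_ns(), st.st_mtime_ns))
```

Because `mark_mbox_read` sets the timestamp explicitly instead of relying on a read to bump it, the check no longer depends on the kernel updating atime during ordinary reads.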

    It's also commonly used by cleanup scripts to delete rarely accessed temp files and such. I'm sure there are plenty of scripts in production dotted around doing things like find -atime +7 -delete, some probably quite recent.

    Of course, I have to question why they're still using something as ancient as MUTT.

    Err, because it's a nice mail client? Do you also want to question why we're still using Bourne-compatible shells, vi derivatives, or Perl? Also, it's spelled "mutt" or "Mutt", not "MUTT". Anyone would think you were trolling...
  18. Re:Ummm.. by myowntrueself · · Score: 3, Interesting

    or backups (if a file has not been read in three years, it's probably safe to archive and move off the drive)

    Ummmm... if you are doing regular backups then the atime will be changing every time so you can't tell if the file has not *really* been read in three years.

    So atime is only useful in the case that you haven't backed that file up in three years and it has never even been accessed in that time... at which point I have to ask why you are only now deciding to archive it to offline media, if its relevance is really that low?

    --
    In the free world the media isn't government run; the government is media run.
  19. Re:latest relatime patch by Anonymous Coward · · Score: 1, Interesting

    I've read the TFA and IMHO it seems that this could be improved without sacking the concept of atime.

    1. First, we need a common journal, listing all files currently open (or at least open for read access).
    2. periodic atime update is done ONLY for said journal, not for individual files.
    3. We write down individual atimes for those open files at the time they are closed, or on recovery (if system goes belly up before closing them), from said "open files" journal's atime, which was updated regularly.
    4. For all files that are currently (perhaps permanently ... or just very long time) open, atime request returns current system time.
    (5. ???
    6. Profit!)
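The four steps above can be mocked up in memory to see how they would behave. This is a toy sketch in Python under my own assumptions: the class name, the injected clock, and the dictionaries standing in for the disk are all hypothetical, and nothing like this exists in the kernel.

```python
class OpenFileJournal:
    """Toy model of the proposed journal of open files."""

    def __init__(self, clock):
        self._clock = clock            # callable returning the current time
        self._open = {}                # path -> number of open handles
        self._stored = {}              # path -> atime persisted "on disk"
        self.journal_atime = clock()   # the one timestamp flushed periodically

    def open(self, path):
        # Step 1: record the file in the journal of open files.
        self._open[path] = self._open.get(path, 0) + 1

    def tick(self):
        # Step 2: the periodic update touches only the journal.
        self.journal_atime = self._clock()

    def close(self, path):
        # Step 3: the individual atime is written when the last handle closes.
        self._open[path] -= 1
        if self._open[path] == 0:
            del self._open[path]
            self._stored[path] = self._clock()

    def recover(self):
        # Step 3, crash case: open files inherit the journal's last atime.
        for path in self._open:
            self._stored[path] = self.journal_atime
        self._open.clear()

    def atime(self, path):
        # Step 4: a file that is still open reports the current time.
        if path in self._open:
            return self._clock()
        return self._stored.get(path)
```

The write cost is one journal update per tick plus one atime write per close, no matter how many reads happen in between.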

    For most of the purposes of atime mentioned in TFA, the proposed hack could do just fine. However, here lies a hypothetical problem:

    If two processes synchronize on a file, so that one process writes information the other reads, then "writer" may rely on atime to find out when this information was "consumed" by "reader" (when exactly last actual read() was called). If both "reader" and "writer" keep their file descriptors open throughout their run times, then "writer" will be each time (mis)informed that last piece of information written was readily read, whenever it fetches atime of said file.

    Perhaps such specific usage should be something of a special feature, a flag, set if needed by "reader" at open() request?

  20. Re:The units in seconds are a bug of Unix. by Guy+Harris · · Score: 2, Interesting

    Nothing about POSIX requires that it be impossible to get higher-resolution time stamps. You just can't get them with POSIX-only code.

    For instance, OS X (and possibly other BSD-flavored UN*Xes) either defines the time stamps, in "struct stat", as struct timespec st_[amc]timespec; and #defines st_[amc]time as st_[amc]timespec.tv_sec, or puts a long st_[amc]timensec; field after each time_t st_[amc]time; field, depending on whether _POSIX_C_SOURCE is defined - POSIX-only code will work, and non-POSIX-only code can get at the higher-resolution times.
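The same split between a portable seconds field and a higher-resolution extra shows up in scripting languages too. A small sketch in Python rather than C: `os.stat` returns `st_atime` as float seconds and `st_atime_ns` as integer nanoseconds. The helper name is made up, and how much sub-second precision you actually get depends on the filesystem.

```python
import os

NANOS_PER_SECOND = 1_000_000_000


def access_time_parts(path):
    """Return (whole seconds, nanosecond remainder) of a file's atime."""
    st = os.stat(path)
    seconds, nanos = divmod(st.st_atime_ns, NANOS_PER_SECOND)
    return seconds, nanos
```

Code that only cares about whole seconds can ignore the second element, much as POSIX-only C code ignores the extra nanosecond field.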

    And the POSIX requirement for st_atime to be updated has nothing to do with the issue you're complaining about in any case.

  21. Re:Ummm.. by siride · · Score: 2, Interesting

    You could use a bind mount on the mail directory with atime enabled.

    mount -o atime --bind /path/to/maildir /path/to/maildir

  22. Great for porn by Anonymous Coward · · Score: 1, Interesting

    I use it to help with security checks (even though you can hack the atime record), to help debug (what input file caused it to hang?), and all sorts of things where it's useful to know when a file was read.

    I don't care about debugging and stuff, but it's really useful for my porn collection. If you have a big porn folder, just sort it by atime and you can see something "fresh" every day.
  23. Is this just a Linux issue? by Kadin2048 · · Score: 3, Interesting

    Is this something that's limited just to Linux and the ext3 filesystem?

    I'm particularly curious as to whether it's an issue on Mac OS X with the HFS filesystem also, and whether it would be possible / advisable to force Mac OS X to mount the root HFS partition as noatime/nodiratime.

    OS X doesn't use a traditional UNIX-style fstab, so it's not immediately clear to me how you'd change the mount options (last time I checked disk mounting was all just in /etc/rc, but perhaps it's been moved into that new SystemStarter business since I last checked). It seems like the same things ought to apply to HFS -- it has an attribute that's functionally identical (at least, I think it is -- feel free to correct me on this) to atime, stored in the catalog file -- but I'm not familiar enough with the workings of the filesystems to know if that's actually the case.

    If this doesn't occur in other OSes (I picked OS X because it's the other OS that I use frequently, and it uses a default filesystem that's pretty different in design from ext2/3), it seems like it might be worthwhile to look at why that is, and what tradeoffs other OSes have made to avoid the same issue.

    --
    "Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
  24. Re:latest relatime patch by 12357bd · · Score: 2, Interesting

    And ... it's also kind of ironic that this relatively small patch often brings more practical benefits to the desktop than all the "big" desktop interactivity/latency features (cfs, swap-prefetch, -rt kernel) combined.

    Talking about desktop interactivity, it seems you neglected to mention Con's scheduler in gaming scenarios, curious eh?

    --
    What's in a sig?
  25. spin-up by hand? by Joseph_Daniel_Zukige · · Score: 2, Interesting

    Well, not exactly, but I got a used notebook drive. A year after I started using that notebook as my home web server, heat and centrifugal force and such seem to have spun the grease away from the contact surfaces of the bearings. Or, the disks may have been sleeping and restarting every ten minutes to run the dynamic dns client. (I really need to figure out how to keep that script and the perl it uses in a RAM disk or something.)

    Anyway, it started humming from the re-seeks caused by disks that couldn't maintain speed, and then eventually the disk froze.

    Thought I had lost the data.

    But I thought twice about it. If I just trashed the drive, the data was gone. I couldn't afford to send it in to a professional recovery service, and I did have backups that were sort of recent, anyway. And I wanted to show the insides of the drive to my son.

    So I opened the enclosure, showed it to my son but didn't let him touch it, rotated the disks by hand (very carefully avoiding touching or letting dust fall on the disk surfaces), closed it up, plugged it into a USB shirt-pocket enclosure, and pulled off my data.

    Turns out I can still use the drive to carry unimportant data around. (Very light use.) I don't trust critical data on it, of course.

    joudanzuki