Mac OS X 10.3 Defrags Automatically
EverLurking writes "There is a very interesting discussion over at Ars' Mac Forum about how Mac OS X 10.3 has implemented an on-the-fly defragmentation scheme for files on the hard drive. Apparently it uses a method known as 'Hot-File-Adaptive-Clustering' to consolidate fragmented files that are under 20 MB in size as they are accessed. Source code from the Darwin 7.0 Kernel is cited as proof that this is happening."
Obviously this process slows down file access a little. I wonder whether any safeguards are in place, such as turning the defragmentation off once a certain I/O load is reached? If not, this may not be such a good idea.
Also, I wonder whether it's actually worth it if you weigh the extra time (perhaps 500ms) to defragment each fragmented 20MB file against doing a manual defrag every month...
Don't some Linux filesystems already do this to some extent? I could be hallucinating again, but I'm sure I read this somewhere.
ahh shucks.
"Slashdot, where telling the truth is overrated but lying is insightful."
If Apple got rid of HFS+ they would need to replace it with something else. No other filesystem supports FileIDs for example.
Time for HFS++
>80 column hard wrapped e-mail is not a sign of intelligent
>life
Me, I'd take the comparatively modern HFS+. I'm still confused as to why metadata isn't being taken seriously by the rest of the computing world.
You are not alone. This is not normal. None of this is normal.
Umm, isn't the HD the bottleneck of modern systems?
Isn't the reason all these high performance machines have so much RAM so that
they don't have to take the enormous hit of swapping to disk?
Even RAM is too slow. That's why they're putting so much cache on the chips now.
*sigh* back to work...
In my day, we'd crack open the drive on our Mac SE30s, sharpen a magnet on a whetstone, and defrag that sucker by hand.
Kids these days. It's the MTV, ya know - makes 'em lazy.
MacOS FileID's?
Are they comparable to what Reiser4FS will have? Are they better than the XYZ offering in Linux?
I'm seriously interested in what EXACTLY they are. Please spare the fanboy attitude if you do wish to answer.
But instead of defragging, FFS just makes better decisions as to where to put files.... Seriously, if Apple got rid of HFS, none of this technology would be necessary.
You speak as though HFS+ has trouble with file fragmentation. It's easily already one of the best filesystems for avoiding fragmentation - I've worked on Macs that have been run for years without attention and were better than 90% unfragmented. This is considerably better than any of the Microsoft filesystems, for instance. This tweak is an improvement, to get from 90% to 99%.
HFS+ doesn't just put the files down randomly, either, it has some smarts.
This also explains why the hard drive on my iBook seems a lot hotter since upgrading.
The only way this feature can do that is if you're writing small files continuously. That's very strange software behavior, and perhaps a worst case for this optimizer. Why would you be doing that?
Don't get me wrong, HFS+ isn't the best filesystem ever created, but it's very featureful (multiple forks, file ID's, case-preserving, case-insensitive-possible, UNICODE, attributes, 64-bit file sizes, POSIX compliance, etc.) and the MacOS relies on it heavily. Anything to replace it would be a superset of HFS+. Fortunately, Apple hired the guy behind the Be Filesystem a few years back. I doubt he's working on iMovie 3.1.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
According to the Ars writeup, this feature is on only when journalling is on. This makes total sense, since journalling prevents an incomplete or unverified write from being used.
Some drink at the fountain of knowledge. Others just gargle.
If you refer to a file by an inode you are basically creating a hard link so if the file is deleted the file still 'exists'.
Also you cannot get a file path from an inode thus if the file path is changed (moving a file for example) the application cannot know what the new path is.
A FileID is really more equivalent to a path, or rather is used in place of a path, with the advantage that the path can change while the FileID remains the same. Thus referring to a FileID is less fragile.
Also FileIDs are smaller so searching for files using a FolderID or FileID is faster and uses less memory.
They're not equivalent.
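To make the path-versus-ID point concrete, here is a toy model in Python. It is only an illustration of the idea described above, not the HFS+ catalog or any real Mac OS API; the `ToyCatalog` class and its method names are made up for this example.

```python
class ToyCatalog:
    """Toy stand-in for a filesystem catalog (hypothetical, not HFS+ itself).

    Each file gets a stable numeric ID when it is created. Moving the file
    changes its path but never its ID.
    """

    def __init__(self):
        self._next_id = 1
        self._path_by_id = {}

    def create(self, path):
        """Register a new file and return its stable ID."""
        file_id = self._next_id
        self._next_id += 1
        self._path_by_id[file_id] = path
        return file_id

    def move(self, file_id, new_path):
        """Change where the file lives; the ID stays the same."""
        if file_id not in self._path_by_id:
            raise KeyError(file_id)
        self._path_by_id[file_id] = new_path

    def resolve(self, file_id):
        """Look up the current path for an ID."""
        return self._path_by_id[file_id]
```

An application that stored only the original path would lose the file after the move; one that stored the ID can still call `resolve()` and get the new location.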
>80 column hard wrapped e-mail is not a sign of intelligent
>life
Well that's fine. The real upside of this is for people that have never heard of /. and don't really know what a hard drive is, let alone know how to defrag one.
Previously these people would just go forever without defragging. Now they can still do that, because Apple is doing it for them behind the scenes.
This is yet one more example of Apple's winning philosophy: Keep it simple, make it better.
A Multiplayer Strategy Game for Mac OS X, Windows, and Linux
It's a just recently added feature!
See the -s option for newfs_hfs:
(from man newfs_hfs)
-s
Creates a case-sensitive HFS Plus filesystem. By default a case-insensitive filesystem is created. Case-sensitive HFS Plus file systems require a Mac OS X version of 10.3 (Darwin 7.0) or later
"I'm a Genius!"*
*Not an actual Genius
That's gonna mess up my UT 2003 ranking! I work hard for those frags! Every one of them!
There exists no way of exchanging information without making judgments. --Bene Gesserit Axiom
Windows XP has a similar feature that waits until the computer has not been in use for a certain amount of time. It would make sense that Apple would give users the same option.
I think I think, therefore I think I am.
Try this...
...and this...
The source code is posted to that thread; the only conditions are (1) at least 3 minutes have passed since the system started (i.e. it avoids doing so while booting up), (2) the file is less than 20 MB in size, (3) the file isn't already open.
The only negative consequence is a possible speed hit, though. There's no danger.
I'm pretty impressed by this. Sure, it's been done before. Sure, there are more elaborate methods. But this is just a simple little lump of code that'll defragment the worst files most of the time.
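For anyone who wants the three conditions spelled out, here is a rough Python sketch of the check. It is not the kernel code from the thread; the names (`OpenRequest`, `should_relocate`) and the constants are assumptions based only on the numbers quoted above.

```python
from dataclasses import dataclass

# Assumed constants, taken from the figures quoted in this thread,
# not copied from the Darwin source.
BOOT_GRACE_SECONDS = 3 * 60          # leave files alone for 3 minutes after boot
MAX_FILE_BYTES = 20 * 1024 * 1024    # only files under 20 MB qualify


@dataclass
class OpenRequest:
    """Hypothetical snapshot of what is known when a file is opened."""
    uptime_seconds: float   # time since the system started
    size_bytes: int         # size of the file being opened
    already_open: bool      # True if the file is already open elsewhere


def should_relocate(req: OpenRequest) -> bool:
    """Return True only if all three conditions from the post hold."""
    if req.uptime_seconds <= BOOT_GRACE_SECONDS:
        return False        # (1) still within the boot-up window
    if req.size_bytes >= MAX_FILE_BYTES:
        return False        # (2) file is too large
    if req.already_open:
        return False        # (3) file is already in use
    return True
```

Other posts in this thread mention further requirements (a journaled volume, a minimum number of fragments); those are deliberately left out here to keep the sketch to the three conditions listed.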
This should defrag all of the files of 20M or less on your hard drive.
It locates every file, opens it, reads every byte, then closes it.
This should force the defragger to run on all files under 20M. Note that technically the defragger only activates when the file is broken into more than 8 extent regions, so this does not actually defrag everything.
But it's also possible that having a file broken into fewer extents is harmless: first, because the first 8 extents are the fastest to access in HFS+, and second, it's theoretically possible that on a multi-headed disk drive having the file slightly fragmented might be good. Larger numbers of fragments than read heads would be bad, of course.
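The open-read-close pass described above could be sketched in Python roughly like this. It is not the command from the original post; `read_small_files` is a made-up helper, and whether merely reading a file triggers any relocation depends on the conditions discussed elsewhere in this thread.

```python
import os

MAX_FILE_BYTES = 20 * 1024 * 1024  # the 20 MB limit quoted in the thread


def read_small_files(root: str, limit: int = MAX_FILE_BYTES) -> int:
    """Open, fully read, and close every regular file under `limit` bytes.

    Returns the number of files that were read. Unreadable files are skipped.
    """
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Skip symlinks, devices, and anything that is not a plain file.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                if os.path.getsize(path) >= limit:
                    continue
                with open(path, "rb") as fh:
                    # Read in 1 MB chunks and throw the data away.
                    while fh.read(1 << 20):
                        pass
                count += 1
            except OSError:
                continue  # permission problems, files vanishing mid-walk, etc.
    return count
```

Calling `read_small_files("/Users")` would walk a home-directory tree; expect it to take a while on a full disk.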
Some drink at the fountain of knowledge. Others just gargle.
POSIX compliance
Doesn't being case insensitive violate POSIX? Or has that been fixed?
yes
Don't call me back. Give me a call back. Bye. So yeah. But bye our, well, but alright we are on a shirt this chill.
I'm not sure the Windows approach is really better. Notice that the Apple approach is more minimalist in moving files.
Some drink at the fountain of knowledge. Others just gargle.
Because the rest of the computing world is more interested in successfully interacting with itself and has realised that filesystem metadata is practically impossible to successfully move between systems using "common" tools.
Apple has finally figured this out, too, which is why they're moving away from it.
Filesystem metadata is another one of those cool ideas that is (or, rather, was) kneecapped because of lowest-common-denominator restrictions.
Fragmentation is a very real problem for people who need lots of contiguous free space, especially those working with multitrack audio and video files. They can't have drive heads searching around a drive for free blocks of space when they could be writing linearly.
Even with this file defragmenter built-in, a drive defragmenter is still needed for certain types of users.
Interoperability.
By the logic you seem to be applying, every time a file is accessed you "risk" corrupting it.
It does not matter if you double check what you wrote, because that only decreases the chance of making an error. it does not eliminate it. you might make the same read error twice in a row (e.g. to make this plausible, imagine say a weak magnetization that flips after a temperature change later that night). or perhaps you may have read the file wrong in the first place. or something might go wrong when you are updating the file allocation table, or there might be some bug in the code that makes an error.
Please explain why these events are more likely to occur to a file's new location than the old one.
the point is that if you just left the file alone in the first place it's safer.
Why ? It's equally as vulnerable to all the same one-in-a-billion events.
thus the apple approach of only defragging a file in active use and leaving all the others alone may be preferred to a blanket defragmenting of a disk.
So, assuming you're correct, the Apple method is preferred because it (minutely) increases the chances of corrupting the files you access, as opposed to some random file ?
Tell me, which file do you think you're more likely to care about - a random file X from the set of all files on the drive or a random file Y from the set of files you deliberately access ?
furthermore, when was the last time you tried defragmenting a 500GB disk? do you have any clue how many days this would take??? The apple approach of doing it just-in-time makes a lot of sense. It follows the same logic as journaling, which is in part a response to not having to fsck a 500GB disk.
After many years of experimenting with defragging and not defragging, I've come to the conclusion that it makes bugger all difference whether a disk is fragmented or not. If I can't tell the difference, it's not worth doing.
The summary appears to not be quite right.
To clarify, there are 2 separate file optimizations going on here.
The first is automatic file defragmentation. When a file is opened, if it is highly fragmented (8+ fragments) and under 20MB in size, it is defragmented. This works by just moving the file to a new, arbitrary, location. This only happens on Journaled HFS+ volumes.
The second is the "Adaptive Hot File Clustering". Over a period of days, the OS keeps track of files that are read frequently - these are files under 10MB, and which are never written to. At the end of each tracking cycle, the "hottest" files (the files that have been read the most times) are moved to a "hotband" on the disk - this is a part of the disk which is particularly fast given the physical disk characteristics (currently sized at 5MB per GB). "Cold" files are evicted to make room. As a side effect of being moved into the hotband, files are defragmented. Currently, AHFC only works on the boot volume, and only for Journaled HFS+ volumes over 10GB.
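As a rough illustration of the hot-file selection described above, here is a Python sketch. It is a simplification based only on the numbers in this post (5MB of hotband per GB, files under 10MB, never written to); the function names and the greedy "skip what doesn't fit" rule are my assumptions, not Apple's algorithm.

```python
from typing import Dict, List, Tuple

HOTBAND_BYTES_PER_GB = 5 * 1024 * 1024   # "5MB per GB", as quoted above
MAX_HOT_FILE_BYTES = 10 * 1024 * 1024    # only files under 10MB are tracked


def hotband_budget(volume_gb: int) -> int:
    """Size of the hotband in bytes for a volume of the given size."""
    return volume_gb * HOTBAND_BYTES_PER_GB


def pick_hot_files(stats: Dict[str, Tuple[int, int, bool]],
                   volume_gb: int) -> List[str]:
    """Pick the most-read files that fit in the hotband.

    `stats` maps a file name to (read_count, size_bytes, was_written),
    collected over one hypothetical tracking cycle.
    """
    budget = hotband_budget(volume_gb)
    candidates = [
        (reads, name, size)
        for name, (reads, size, written) in stats.items()
        if not written and size < MAX_HOT_FILE_BYTES and reads > 0
    ]
    # Hottest first; ties broken by name so the result is deterministic.
    candidates.sort(key=lambda c: (-c[0], c[1]))

    chosen: List[str] = []
    used = 0
    for _reads, name, size in candidates:
        if used + size <= budget:   # assumed rule: skip files that don't fit
            chosen.append(name)
            used += size
    return chosen
```

Anything already in the hotband but missing from the returned list would be a "cold" file to evict at the end of the cycle.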
...was meant as humor! Obviously increasing background disk activity is gonna suck juice outta yer battery like a Whitehouse intern...
"Flyin' in just a sweet place,
Never been known to fail..."
I believe the rationale is that defragmenting takes little more than the same number of IOs as reading the file once, and subsequent accesses to the file (after the defrag) will take fewer IOs. Those accesses would appear to be imminent, because the file has just been opened.
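A back-of-the-envelope version of that argument, in Python. The cost model is my own simplification (one IO per fragment read, one extra IO for the contiguous write, and the first access served from the data read during the move); real disks and caches will behave differently.

```python
def ios_without_defrag(fragments: int, accesses: int) -> int:
    """Every access has to touch every fragment."""
    return fragments * accesses


def ios_with_defrag(fragments: int, accesses: int) -> int:
    """First access pays for the move; later accesses read one contiguous run."""
    if accesses == 0:
        return 0
    relocate = fragments + 1          # read each fragment once, then one contiguous write
    return relocate + (accesses - 1)  # each later access is a single IO
```

Under these assumptions a file in 8 fragments costs 9 IOs instead of 8 on the first access, but 10 instead of 16 if it is accessed twice, so the move pays for itself on the second read.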
cat
I wonder how that interacts with the "secure" delete. Does it seek out previous copies of the file and securely delete them too? That would be quite a feat.
(Also, has anyone confirmed that the code snippet is actually executed?)