A Short History of Btrfs
diegocgteleline.es writes "Valerie Aurora, a Linux file system developer and ex-ZFS designer, has posted an article with great insight on how Btrfs, the file system that will replace Ext4, was created and how it works. Quoting: 'When it comes to file systems, it's hard to tell truth from rumor from vile slander: the code is so complex, the personalities are so exaggerated, and the users are so angry when they lose their data. You can't even settle things with a battle of the benchmarks: file system workloads vary so wildly that you can make a plausible argument for why any benchmark is either totally irrelevant or crucially important. ... we'll take a behind-the-scenes look at the design and development of Btrfs on many levels — technical, political, personal — and trace it from its origins at a workshop to its current position as Linus's root file system.'"
This looks like a promising filesystem - as ZFS on linux is, at present, doomed to die an ugly death, btrfs looks to address a lot of the shortcomings of other filesystems and bring a clean, modern fs to linux. It goes beyond ZFS in some areas too, such as being able to efficiently shrink a filesystem, and keeps a lot of the cool things that ZFS made popular, such as Copy-On-Write.
It looks like Btrfs also addresses some decisions that were made with the direction that ZFS would be going in, or how it would handle certain problems that now with hindsight behind the developers, they possibly would have done things differently.
Apple are really struggling with ZFS, with it being announced as a feature in early betas of both Leopard (10.5) and Snow Leopard (10.6), as well as being there in a very limited form in Tiger (10.4) - maybe development on Btrfs will leapfrog ZFS for consumer-grade hardware and Apple can finally look at deprecating HFS.
Specialist Mac support for creative pros, Melbourne
Aside from Copy on Write, one other feature that this filesystem has that I would consider essential in a modern filesystem is full checksumming. As drives get larger and larger, the chance of a random undetected error on write increases and having full checksums on every block of data that gets written to the drive means that when something is written, I know it's written. It also means that when I read something back from the disk, I know that it was the data that was put there and didn't get silently corrupted by the [sata controller | dodgy cable | cosmic rays] on the way to the disk and back.
Specialist Mac support for creative pros, Melbourne
Snapshots are nice too. Makes stuff like Time Machine and derivatives much more elegant. ZFS has built in RAID support (which, I assume, works on the block level, instead of on the disk level), maybe Btrfs will get this too.
Is it Beta? The fact that Linus runs it as his root fs doesn't tell me much. Now, if you told me that's what he uses for ~/, I would be more impressed.
The important question to me is, how long 'til it gets in the major distributions?
Help! I'm a slashdot refugee.
ZFS has built in RAID support (which, I assume, works on the block level, instead of on the disk level), maybe Btrfs will get this too.
Yes, btrfs currently has built-in support for raid 0/1/10, 5 and 6 are under development.
As if fsck wasn't bad enough to use in business talks, now I have to get prepared for btrfsck
What you do know is that when you read a block of data back from the disk, that block is what was supposed to be written to the disk.
If a file that is never read is corrupted somehow, then you will only discover that corruption when you read the file.
Having checksums is very good if you have a RAID-1 mirror. With full block checksums, you can read each half of the mirror and if there is an error, you know which one is correct, and which one isn't. At present, if a RAID-1 mirror has a soft error like this, due to corruption, you don't know which half of the mirror is actually correct.
With ZFS, for instance, you can create a 2-disk RAID-1 mirror and then use dd to write zeroes to one half of the mirror, at the raw device level (ie, bypassing the filesystem layer) and when you go to read that data back from the mirror, ZFS knows that it's invalid and instead uses the other side of the mirror. It then has an option to resilver the mirror and write the valid data back to the broken half, if you so want.
Specialist Mac support for creative pros, Melbourne
Who cares? In a few years' time, this will be obsoleted by its successor, icantbelieveitsnotbtrfs.
Wow. FUD flies fast and hard on slashdot. Zealots? Are you serious? Rather than mod your post as +1 Funny, I think I'll blow some karma and respond, just to set the record straight.
Laying aside misconceptions about the GPL, the main reason BtrFS is GPL is because it's part of the Linux kernel which is also GPL! How hard is it to grasp that? If Apple or anyone else wants to license Oracle's BtrFS code, they are welcome to negotiate and get the code under a different license than the GPL. It's that simple. BtrFS is an implementation of an idea, a specification. If Apple wants to write their own BtrFS driver, they are welcome to do that. Or Microsoft.
Why are developers who don't want their code to be ripped off (used without payment in a closed product) by companies and incorporated into a product are labeled zealots? How is this different than software companies requiring code to be licensed by third parties? So a company who creates some really cool technology that they license for a fee to others for use in products zealots? There really is no difference.
While I haven't written any software of note, I also use the GPLv2 (evaluating v3) since I want my software to be able to be freely used by those that want to use it, but if my code is that valuable to a company, I want to get paid for my trouble. If no one is willing to pay me, then that's fine. They are welcome to use my software without restriction, but if they redistribute it, to do so under the terms of the GPL. Guess that makes me a zealot.