Linux Gains Lossless File System

Posted by CmdrTaco on Tuesday October 4, 2005 @03:05AM from the i-was-sick-of-my-old-lossyfs-anyway dept.

Anonymous Coward writes "An R&D affiliate of the world's largest telephone company has achieved a stable release of a new Linux file system said to improve reliability over conventional Linux file systems, and offer performance advantages over Solaris's UFS file system. NILFS 1.0 (new implementation of a log-structured file system) is available now from NTT Labs (Nippon Telegraph and Telephone's Cyber Space Laboratories)."

Stable? by theJML · 2005-10-04 03:11 · Score: 5, Informative

I like how they say it's reached a stable release, but if you look at the known bugs on the project home page (http://www.nilfs.org/), you'll see that:

* The system might hang under heavy load.
* The system hangs on a disk full condition.

Aren't those kind of important when you're calling something stable?

--
-=JML=-
Here's an overview for lazy people like me by Work+Account · 2005-10-04 03:12 · Score: 3, Informative

NILFS is a log-structured file system developed for the Linux kernel 2.6. NILFS is an abbreviation of the New Implementation of a Log-structured File System. A log-structured file system has the characteristic that all file system data including metadata is written in a log-like format. Data is never overwritten, only appended in this file system. This greatly improves performance because there is little overhead regarding disk seeks. NILFS also has the following specific features:

* Slick snapshots.
* B-tree based file and inode management.
* Immediate recovery after system crash.
* 64-bit data structures; support many files, large files and disks.
* Loadable kernel module; no recompilation of the kernel is required.
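
To make the "never overwritten, only appended" idea concrete, here is a minimal in-memory sketch. It is a toy model and not NILFS's on-disk format: the class name `AppendOnlyLog` and its methods are made up for illustration, and it assumes a snapshot can be represented simply as a position in the log.

```python
class AppendOnlyLog:
    """Toy in-memory log: every write is appended, nothing is overwritten."""

    def __init__(self):
        self.records = []  # (filename, contents) tuples in write order

    def write(self, name, data):
        self.records.append((name, data))
        # The log length after the write doubles as a snapshot id.
        return len(self.records)

    def read(self, name, snapshot=None):
        end = len(self.records) if snapshot is None else snapshot
        # Scan backwards: the newest record before `end` wins.
        for rec_name, data in reversed(self.records[:end]):
            if rec_name == name:
                return data
        raise KeyError(name)


log = AppendOnlyLog()
snap = log.write("notes.txt", b"first draft")
log.write("notes.txt", b"second draft")
print(log.read("notes.txt"))        # current contents
print(log.read("notes.txt", snap))  # contents as of the snapshot
```

Because old records are never touched, reading "as of" an earlier log position costs nothing extra to set up, which is roughly why snapshots come cheaply in this kind of design.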

--

If you "get" pointers add me as a friend (116)!
actual info about the fs by cowens · 2005-10-04 03:16 · Score: 4, Informative

http://www.nilfs.org/
Re:Horrible headline by TheRaven64 · 2005-10-04 03:23 · Score: 5, Informative

The title was written by a numpty. This is a log-structured filesystem. These systems have been around for ages. NetBSD has LFS (originally from 4.4BSD), and I believe Minix also had some form of log-structured filesystem.
A log-structured filesystem doesn't modify existing files. Every time you write to the disk, you simply append some deltas. This gives very good write performance, but poor read performance (since almost all files will be fragmented, and the entire log for that file must be replayed to determine the current state of the file). To help alleviate this, most undergo a vacuuming process[1], whereby the log is replayed, and a set of contiguous files is written. This also frees space - something that is not normally done since deleting a file is done simply by writing something at the end of the log saying it was deleted. In addition to the good write performance, log-structured filesystems also have an intrinsic undo facility - you can always revert to an earlier disk state, up until the last time the drive was vacuumed.
The snapshot facility is not particularly impressive. It's a feature intrinsic to log-structured filesystems, and also available in other filesystems (such as UFS2 on FreeBSD and XFS on Linux). The performance advantage claims must be taken with a grain of salt - write performance for log-structured filesystems is always close to the theoretical maximum of the disk, but this is at the expense of some disk space, and read speed (although LFS did beat UFS in several tests on NetBSD).
[1] This is usually done in the background when there is little or no disk activity.
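
A rough sketch of the vacuuming step described above, under simplifying assumptions: the log is modelled as a Python list of `(name, data)` tuples, whole-file contents stand in for deltas, and `data=None` stands in for the "it was deleted" marker. The function name `vacuum` is hypothetical and not taken from any real filesystem.

```python
def vacuum(records):
    """Return a compacted log holding only the newest live version of each file.

    `records` is a list of (name, data) tuples in write order; data=None
    marks a deletion that was appended to the log.
    """
    latest = {}
    for name, data in records:
        latest.pop(name, None)  # re-insert so ordering follows the last write
        latest[name] = data
    # Drop deleted files entirely; this is where space is actually reclaimed.
    return [(name, data) for name, data in latest.items() if data is not None]


log = [("a", b"1"), ("b", b"x"), ("a", b"2"), ("b", None), ("c", b"y")]
compacted = vacuum(log)
print(compacted)
```

After vacuuming, the older version of `a` and every trace of `b` are gone, which also illustrates why the undo facility only reaches back to the last vacuum.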

--
I am TheRaven on Soylent News
Re:Bloat? by ivan256 · 2005-10-04 03:25 · Score: 5, Informative

I wrote an (unfortunately closed-source) filesystem that was remarkably similar to this once. Generally these types of filesystems are used when you're constantly writing new data. You're going to be eating the space anyway, but you want the reliability of synchronous writes with the performance of asynchronous cached writes. Reading from these filesystems is incredibly slow in comparison.

The version I wrote took advantage of the client's bursty IO pattern and used the slow periods to offload the data to an ext2 filesystem on a separate disk. Hopefully your system memory was large enough that the offload to the secondary filesystem happened without any disk reads. Once that was done, the older sections of log could be reused, but only once the disk filled up and wrapped back to the beginning, because you want to keep your writes sequential (essentially; there are other timing tricks you can play to get more speed).
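
The wrap-around scheme above can be sketched roughly as follows. This is only an illustration under assumptions, not the closed-source design: the log is a fixed-size Python list, a plain list named `secondary` stands in for the ext2 disk, and the `RingLog` class and its method names are invented here.

```python
class RingLog:
    """Toy fixed-size log whose write head only ever moves forward, then wraps."""

    def __init__(self, slots):
        self.slots = [None] * slots  # the log disk, modelled as a list
        self.head = 0                # next slot to write
        self.secondary = []          # stands in for the secondary filesystem

    def append(self, record):
        # A slot may be reused only after its old contents were offloaded.
        if self.slots[self.head] is not None:
            raise RuntimeError("log full: slot not yet offloaded")
        self.slots[self.head] = record
        self.head = (self.head + 1) % len(self.slots)

    def offload(self):
        """Run during idle periods: copy pending records out, oldest first."""
        n = len(self.slots)
        for i in range(n):
            idx = (self.head + i) % n
            if self.slots[idx] is not None:
                self.secondary.append(self.slots[idx])
                self.slots[idx] = None


ring = RingLog(3)
ring.append("a")
ring.append("b")
ring.offload()          # idle period: "a" and "b" move to the secondary store
for rec in ("c", "d", "e"):
    ring.append(rec)    # the head wraps around and reuses the freed slots
```

A fourth append without another offload would raise, which models the point that old log sections are only reusable once their data lives elsewhere.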

There's been lots of research done on this method of write structuring. Look for papers on the "TRAIL" project (also closed source), for example.
Re:There's no replacement for ext3fs yet for me... by m50d · 2005-10-04 03:38 · Score: 3, Informative

"Reiser3: How's the quota support, still have to patch kernel every time? Plus it doesn't have ACL."
It does have ACLs, and quota support is fine, at least in Gentoo kernels (can't check a vanilla one at the moment).

--
I am trolling
Re:New Improved? by Feyr · 2005-10-04 03:54 · Score: 4, Informative

the why is dependent on your application,

for common servers, or day-to-day use. it isn't

but notice how this was developed by a telecom company? a log-structured filesystem is perfect or even required, due to speed and integrity constraints (depending on the size of the network), when you're dealing with billing and monitoring data on a telecom network. you want something that's simple and extremely resistant to failures. a complete system crash (which never happens, short of nuking the box) should not result in any data loss, or only the bare minimum, and you should be able to recreate that data from somewhere else (eg, the other endpoint in a telephone network).

a log-structured filesystem allows this: the "head" is never over previous data in normal operation. you don't typically read the data back until the end of a cycle (whatever that cycle may be) or in a debugging condition. you simply append to the end, minimizing head movement and thus increasing mtbf (replacing a disk in those things is costly)

this is also extremely useful for logging to WORM media (write once, read many), for security logs mostly. you don't want a hacker to be able to remove them, no matter what they do
Re:Shutdown versus power off by 0xABADC0DA · 2005-10-04 05:29 · Score: 3, Informative

Close, but no cigar:

1. It goes into the OS filesystem cache. After 5 seconds the modified data gets flushed to the disk (sometimes set to 30 sec).
2. It is written to the hard drive. Here, it sits in the hard drive controller's on-board cache until the head arrives at the write point, which is a fraction of a second.
3. It is written to disk.

So it *can* happen that data is not written properly, but unlike the scary picture you paint it is extremely unlikely. Even if you just saved your data, just do a sync and you'll be fine turning the power off.
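
For a single file, a program can ask for the same flush itself instead of waiting for the periodic writeback. A minimal sketch in Python, assuming a POSIX-like system: `os.fsync` is the standard library call, while the helper name `durable_write` is made up for this example. Whether the drive's own on-board cache is also flushed depends on the hardware and its settings.

```python
import os


def durable_write(path, data):
    """Write `data` to `path` and ask the kernel to push it to the device."""
    with open(path, "wb") as f:
        f.write(data)         # lands in Python's userspace buffer
        f.flush()             # moves it into the OS filesystem cache (step 1)
        os.fsync(f.fileno())  # asks the kernel to write it out (steps 2 and 3)
```

Calling `durable_write("billing.log", b"record\n")` returns only after the kernel reports the flush as complete, so there is no 5 or 30 second window to worry about for that file.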