XFS merged in Linux 2.5

New file system by Gabrill · 2002-09-17 03:29 · Score: 4, Funny

The round file gets all my bills. The manila one gets all my pay stubs. It works out ok.

--
Always going forward, 'cause we can't find reverse.

Wow by GigsVT · 2002-09-17 03:30 · Score: 1

This is pretty unexpected, at least for me. I know they had cleaned out the VM changes and things of that sort recently, but this is really something that I did not expect to see so soon.

--
I've had enough abrasive sigs. Kittens are cute and fuzzy.

Re:Wow by Sturm · 2002-09-17 03:37 · Score: 2

Why? Gentoo has been using the XFS patches for a while now and it seems really stable. And I know this is more of a perception thing, but it seems much faster than ext3fs, for me, on the same hardware.
Re:Wow by GigsVT · 2002-09-17 03:40 · Score: 2, Informative

Well because it used to make extensive VM changes to the kernel, which was keeping it out for so long. The way I understand it, it was a direct port of the IRIX XFS mostly, and thus also had some IRIXisms that could cause problems on Linux. I read a recent post to lkml that indicated they had cleaned it of VM changes, but I really didn't expect a merge so soon.

--
I've had enough abrasive sigs. Kittens are cute and fuzzy.
Re:Wow by be-fan · 2002-09-17 04:10 · Score: 2

Actually, if you had been keeping up, some guy had broken XFS up into nice tiny patches, while also converting it to use the generic I/O routines. In the end, there were four lines of changes to non-XFS code. There is an OS news article about this, with a link to the relevant lkml thread.

--
A deep unwavering belief is a sure sign you're missing something...
Re:Wow by klieber · 2002-09-17 04:22 · Score: 1

Why? Gentoo has been using the XFS patches for a while now and it seems really stable.

Correction -- Gentoo was using the XFS patches for a while. They were pulled out in r9 of gentoo-sources due to corruption and stability problems. (might have been r8, now that I think of it)

--kurt

--
Gentoo Linux http://gentoo.org/
Re:Wow by GigsVT · 2002-09-17 05:09 · Score: 1

I saw that... That's what I mean about getting rid of VM changes. Some kernel people I talked to on IRC still seemed to have "reservations" to put it lightly.

--
I've had enough abrasive sigs. Kittens are cute and fuzzy.
Re:Wow by broody · 2002-09-17 06:17 · Score: 1

You're drifting into FUD with that one. It is not any more legitimate to blame the XFS developers for stability problems caused by VFS code they might not have even seen than when people blamed ReiserFS or Ext3 for the same issue. If you look closely r7 versus r8 or r9 are different beasties.

Many people are quite happy with the stability of XFS on Linux. That said, Daniel Robbins (surely a big name at Gentoo) is not one of them. His issue with XFS is focused on problems with XFS if your system reboots while completely overwriting files. On the whole, he thinks ReiserFS is more stable but obviously YMMV.

Wake me up when you've found the perfect file system.

--
~~ What's stopping you?
Re:Wow by dcstimm · 2002-09-17 06:23 · Score: 2, Informative

the reason they pulled it is because of gentoo ppc. Very unstable on ppc.

--
keanmarine.com
Re:Wow by mrjohnson · 2002-09-17 06:33 · Score: 1

Hey, I'm a big fan of XFS but I have had file system corruption using it. Postgres up and died while the system was hitting the disk pretty hard. All the processes accessing the data directory up and stalled.

It was pretty bad, xfs_repair worked fine but I had a lot of orphaned pgsql files in lost+found. I've since recompiled my kernel with the hella old egcs version they recommended (had to compile and patch that myself, too, since debian doesn't have it anymore).

It's been working fine with the new kernel, but any time that server hiccups I'm scared to death it's going to get corrupt again....
Re:Wow by broody · 2002-09-17 19:47 · Score: 1

I'm not saying any filesystem is perfect.

Luckily I have had no significant problems with ReiserFS or XFS. I had more time with the former as I was using SuSE for quite a while. After switching my Linux installs to Gentoo, I'm all XFS.

Nothing is a substitute for a backup.

--
~~ What's stopping you?

Comparison? by FyRE666 · 2002-09-17 03:32 · Score: 3, Interesting

Does anyone have a link to any comparisons of all these journaling filesystems, showing their strengths and weaknesses? Why shouldn't I just stick with ext3 for everything?

--
Code, Hardware, stuff like that.

Re:Comparison? by Wee · 2002-09-17 03:39 · Score: 3, Informative

Does anyone have a link to any comparisons of all these journaling filesystems, showing their strengths and weaknesses?
Google is always your friend.
-B

--
Ash and Hickory, straight-grained and true, make excellent bludgeons, dandy for the cudgeling of vegetarians.
Re:Comparison? by rindeee · 2002-09-17 03:40 · Score: 5, Informative

http://aurora.zemris.fer.hr/filesystems/
Re:Comparison? by einhverfr · 2002-09-17 03:41 · Score: 2

I have seen comparisons of ext3 and ReiserFS but these have not included other journaling filesystems.

Basically I still see ext3 as much more full-featured than ReiserFS (supports file attributes which can be useful in many places on the system) but ReiserFS is faster (esp. for small files), so if you have databases, or are using your filesystem as a hierarchical database, maybe ReiserFS is for you.

Now how does XFS compare to these two?

--

LedgerSMB: Open source Accounting/ERP
Re:Comparison? by Anonymous Coward · 2002-09-17 03:42 · Score: 0

Play my free Pacman clone on your java phone! [javascript-games.org]

Java != Javascript
Re:Comparison? by rindeee · 2002-09-17 03:43 · Score: 2, Informative

And this: http://oss.software.ibm.com/developer/opensource/jfs/project/pub/jfs040802.pdf
Re:Comparison? by a_timid_mouse · 2002-09-17 03:43 · Score: 1

In addition to journaling, I believe XFS also allows you to create logical volumes from multiple disks with disk striping.
Re:Comparison? by GigsVT · 2002-09-17 03:48 · Score: 1

I think that you are thinking of XLV and not XFS.

--
I've had enough abrasive sigs. Kittens are cute and fuzzy.
Re:Comparison? by MasterD · 2002-09-17 03:53 · Score: 1

XFS has write speeds comparable to *ext2*. Ext3 writes are almost twice as slow as ext2 because of the journalling overhead. Ext3 was an addon to ext2 whereas XFS had journalling built into the design from the beginning.
Re:Comparison? by auferstehung · 2002-09-17 04:00 · Score: 5, Informative

You could check out Daniel Robbins' "Advanced filesystem implementor's guide" over on IBM's developerworks. He covers reiserfs, ext3, and XFS and I believe there is a link to articles on JFS in the Resources section at the bottom of the page.

--
Logic is not Divine.
Re:Comparison? by a_timid_mouse · 2002-09-17 04:00 · Score: 1

Yeah, you're right. Thanks for clearing out the cobwebs. You CAN run XFS with LVM though, and I think that accomplishes the same thing. I don't have any experience in that area though.
Re:Comparison? by dmelomed · 2002-09-17 04:10 · Score: 1

When people say slower, they forget these can be dealt with a RAID controller with battery-backed write-back cache. These have been getting cheaper and cheaper. It's like mounting your FS async with all the performance advantages and no fear of losing data at the time of a crash.
Re:Comparison? by Anonymous Coward · 2002-09-17 04:41 · Score: 0

Is there a particular reason you people can't be bothered to put a proper link in?
http://aurora.zemris.fer.hr/filesystems/
Re:Comparison? by Anonymous Coward · 2002-09-17 06:12 · Score: 0

A technical overview/comparison of the various flavors of journaling filesystems:
http://linuxgazette.com/issue55/florido.html

It's especially interesting if you have no idea what journaling filesystems are all about. ;)
Re:Comparison? by dcstimm · 2002-09-17 06:29 · Score: 1

that was the guide that made me realize how smart drobbins is. And after reading most of his stuff I switched my distro to gentoo.

--
keanmarine.com
Re:Comparison? by legis · 2002-09-17 10:53 · Score: 1

There is a good series of articles at IBM located at:

http://www-106.ibm.com/developerworks/library/l-fs.html

The links to the other articles are in the resources section with the exception of part 7 which is located at:

http://www-106.ibm.com/developerworks/linux/library/l-fs7/
Re:Comparison? by Anonymous Coward · 2002-09-18 08:01 · Score: 0

If you have a choice between adding some free software to your system, and adding a new piece of hardware with its own separate memory, to get the same functionality, which one would you go for? Which one would any non-idiot go for?

Cool by mortis_aeturnus · 2002-09-17 03:32 · Score: 2, Interesting

From linux-2.6 on I won't have to repatch the kernel source with that sgi.com XFS patch every time a new kernel comes out. BTW, I still have trouble getting XFS to work on linux-2.4.19 because sgi won't update their stable XFS patch from 2.4.18.

Re:Cool by Anonymous+Conrad · 2002-09-17 03:41 · Score: 1

BTW, I still have trouble getting XFS to work on linux-2.4.19 because sgi won't update their stable XFS patch from 2.4.18.

... so why don't you update it yourself?
Re:Cool by jmauro · 2002-09-17 04:01 · Score: 1

Really, what are you talking about? Have you even been to SGI's XFS site? Stop saying you can't use XFS on 2.4.19, because I'm using it right now.
Re:Cool by ShawnX · 2002-09-17 04:17 · Score: 3, Informative

Try my patches at http://xfs.sh0n.net/2.4. They merge in XFS with 2.4.20-pre7 (current) and rmap =)

Shawn.

--
Everyone wants a Tux in their life.

Not just journaling by Anonymous Coward · 2002-09-17 03:34 · Score: 5, Interesting

As I understand it, XFS also offers things like extended attributes. However, I have been told that the Linux VFS does not offer any way to read or write the attribute information?

Is this correct? Will the VFS also be extended so that you can make use of extended attributes in XFS?

Re:Not just journaling by publius · 2002-09-17 03:57 · Score: 5, Interesting

I read them, write them and delete them all the time using the attr family of commands. 64K limitation on the current value size but that's not so bad, and in the future it will be the (I think) 512K that Irix has. When you begin to think of all the cool things you can do with that, it becomes very interesting...
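
For anyone who wants to try the same set/read/delete cycle from a program rather than the attr commands, here is a minimal sketch in Python. It assumes a Linux system and a filesystem mounted with user extended attribute support; the attribute name "comment" and the helper name xattr_roundtrip are made up for the illustration, and the function simply returns None where xattrs are unavailable.

```python
import errno
import os
import tempfile


def xattr_roundtrip(path: str, name: str, value: bytes):
    """Set, read back and remove one user xattr.

    Returns the bytes read back, or None if the platform or the
    filesystem does not support user extended attributes.
    """
    if not hasattr(os, "setxattr"):  # os.*xattr functions are Linux-only
        return None
    key = "user." + name  # unprivileged attributes live in the user. namespace
    try:
        os.setxattr(path, key, value)
        read_back = os.getxattr(path, key)
        os.removexattr(path, key)
        return read_back
    except OSError as exc:
        unsupported = (errno.ENOTSUP, errno.EOPNOTSUPP, errno.EPERM, errno.EACCES)
        if exc.errno in unsupported:
            return None
        raise


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as fh:
        print(xattr_roundtrip(fh.name, "comment", b"scanned 2002-09-17"))
```

Whether the call succeeds depends on the filesystem and mount options, which is why the sketch treats "not supported" as an ordinary outcome instead of an error.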
Re:Not just journaling by Anonymous Coward · 2002-09-17 04:02 · Score: 1, Interesting

Yes, attributes are certainly cool. Both BFS and AFS (AtheOS/Syllable) use attributes.

I'm wondering about Linux supporting them (at least nominally, with XFS) as at the moment, tools that understand attributes are a little thin on the ground. tar, for example, does not understand attributes. Nor do the usual GNU fileutils. Linux support could hopefully mean broader and standardised support in general, which would make our job a little easier.
Re:Not just journaling by IamTheRealMike · 2002-09-17 04:18 · Score: 5, Interesting

Is this correct? Will the VFS also be extended so that you can make use of extended attributes in XFS?
Cooler, if I read the tea leaves right. I believe some time ago now there was a thread on lkml about whether it'd be possible to have files that are also directories (and vice-versa). The reasoning behind this was simple: we want flexible filing system attributes, but not at the expense of API bloat. You want ACLs? That'll be another API then. Extended Attributes? Another API. What, you want hierarchical extended attributes too? Well you've just created another version of the filing system API haven't you.
The theory goes (and Hans Reiser, top guy, explains it much better than I can) that by altering one of the rules of the filing system, we can get lots more power and expressiveness without having to invent lots of new APIs. Let's say you want to find out the owner of file foo. You can just read /home/user/foo/owner. You can edit ACLs by doing similar operations. Now you can have something more powerful than extended attributes, but you can also manipulate that data using the standard command line tools too! Coupled with a more powerful version of locate, you can have very interesting searching and indexing facilities.
This has implications beyond just string attributes. Now throw in plugins, so for instance the FS layer interprets JPEGs and adds extra attributes. Now you can read the colour depth of an image by doing "cat photo.jpg/colour_depth" or whatever. You can get the raw, uncompressed version of the file by doing "cat photo.jpg/raw > photo.raw". Noticed something yet? You no longer need a new API for reading JPEG data, because you are reusing the filing system API.
But the FS is not a powerful enough concept, I hear you cry! Have no fear, for with new storage mechanisms comes new syntax too, to allow for BeFS style live queries. If you want more info, you should really read up on this stuff at Reiser's site.
That's why ReiserFS is so good at small files as well as large files. Have you ever wondered why that is? It's not just a quirk of its design, it was very deliberate. One day, Hans wants to see us store as much information as possible in a souped up version of the filing system, so reducing interfaces and increasing interconnectedness. Or something. It sounds cool anyway :) That's one thing that RFS has that the other *FSs don't - the ReiserFS team has vision.
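
To make the files-as-directories idea above a bit more concrete, here is a toy user-space sketch in Python. It is not how ReiserFS or any kernel does it; the function read_virtual and the pseudo-attribute names size, uid and mode are invented for the illustration. It only shows the naming trick: a path of the form file/attr is resolved to metadata about the file.

```python
import os
import stat
import tempfile

# Hypothetical pseudo-attributes; none of this is a real kernel interface.
VIRTUAL_ATTRS = {
    "size": lambda st: str(st.st_size),
    "uid": lambda st: str(st.st_uid),
    "mode": lambda st: oct(stat.S_IMODE(st.st_mode)),
}


def read_virtual(path: str) -> str:
    """Read a regular file, or a pseudo-attribute addressed as <file>/<attr>."""
    if os.path.isfile(path):
        with open(path, "r") as fh:
            return fh.read()
    parent, attr = os.path.split(path)
    # If the parent is a regular file, treat the last component as metadata.
    if os.path.isfile(parent) and attr in VIRTUAL_ATTRS:
        return VIRTUAL_ATTRS[attr](os.stat(parent))
    raise FileNotFoundError(path)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
        fh.write("hello")
    print(read_virtual(fh.name))            # file contents
    print(read_virtual(fh.name + "/size"))  # pseudo-attribute
    os.unlink(fh.name)
```

The point of the proposal is that the same open/read calls serve both cases, so ordinary tools would work on attributes without a second API.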
Re:Not just journaling by goga · 2002-09-17 04:33 · Score: 2, Interesting

This all sounds very Plan 9-ish. (Not that you can read files as directories in Plan 9.)
Re:Not just journaling by jbolden · 2002-09-17 04:56 · Score: 3, Informative

Apple is designing versions of the tools that support complex attributes for use with the HFS+ filesystem. While the specific issues are slightly different since their code is open sourced no reason it couldn't move over to Linux.
Re:Not just journaling by 1010011010 · 2002-09-17 04:57 · Score: 3, Interesting

How do I use these named streams for a directory? To re-use your example, can I:

$ cat $HOME/owner

and get my username? Or will it be looking for a file named "owner" in $HOME?

--
Napster-to-go says "Fill and refill your compatible MP3 player", which is a lie. It's not MP3. It's WMA with DRM.
Re:Not just journaling by IamTheRealMike · 2002-09-17 05:06 · Score: 2

I oversimplified things a bit. The current syntax has yet to be decided, but for most metadata attributes the current plan (subject to change without notice blah de blah) is to prefix metadata attributes with double dots, so it'd be

$ cat $HOME/..owner

Alternatively, standard UNIX attributes may be placed in a subsubdir, so :

$ cat $HOME/..metadata/owner

Nobody's entirely sure yet.
Re:Not just journaling by Anonymous Coward · 2002-09-17 05:11 · Score: 0

That sounds very cool. I may have to steal that ;)

Seriously, something like this could splat Microsoft's DB filesystem concept into the ground.
Re:Not just journaling by decep · 2002-09-17 05:16 · Score: 1

IBM just released 2 patches for the last version of JFS that adds extended attributes and ACLs.
Re:Not just journaling by Anonymous Coward · 2002-09-17 05:57 · Score: 0

Creating tons of APIs just to look at attributes seems stupid to me too, but there should also be a way to benefit from the more advanced features that other filesystems provide. An idea could be to add a "multistream" filesystem, where the additional streams could include an "attribute" stream or something, giving an almost unlimited attribute count. Other filesystems such as NTFS (sorry, I have no other examples) implement a similar approach. And a file's "named" or "nondefault" streams could (as suggested) be addressed as "files within the file".

I like the idea, and have actually thought a little about it before (although I gave up, as it almost implied I had to implement it; I would gladly try, but I'm not so good at programming and am already overworked with other things). This way the advanced features of several filesystems could be taken advantage of.
Re:Not just journaling by Anonymous Coward · 2002-09-17 06:36 · Score: 0

No, it is not correct.

Support for EAs and ACLs were added in 2.5.3.
Re:Not just journaling by oever · 2002-09-17 07:49 · Score: 2

At Micro$oft they've stated that in the future the filesystem will be a database. That sounds like a more generic version of the same idea. The difference is that all filetools will probably have to be adapted to be able to work in a database.

Maybe Reiser's idea is nice compromise.

However, if everything is a database, you can write very generic tools to access and query for data. Maybe the transition would be too big. I'm curious if something like a database instead of a filesystem will ever become mainstream reality.

--
DNA is the ultimate spaghetti code.
Re:Not just journaling by Micah · 2002-09-17 07:59 · Score: 2

Now throw in plugins, so for instance the FS layer interprets JPEGs and adds extra attributes. Now you can read the colour depth of an image by doing "cat photo.jpg/colour_depth" or whatever. You can get the raw, uncompressed version of the file by doing "cat photo.jpg/raw > photo.raw".

Yikes! And this belongs in the kernel???

And even if it does belong there, wouldn't it be best in the general FS layer than in a specific FS? It could work across ALL filesystems...
Re:Not just journaling by foobar104 · 2002-09-17 08:56 · Score: 2

Extended attributes aren't as cool as you think. They're stored in the inode, not the file itself, which is a double-edged sword. For example, go build yourself an XFS-based Samba server. Write a file out of Photoshop to the Samba server, something.psd. Add one or more extended attributes to the file using the attr tool on the server. Open the file in Photoshop via Samba, modify it, and save it. Don't "Save as...", just save it. Now examine your extended attributes.

What? They're gone! For great justice!

Photoshop, and lots of other desktop apps, are clever in the way that they save files. Rather than writing over an existing file, Photoshop (and others) will create a new file with a temporary name and write to it. Only after the file has been completely written to disk successfully will Photoshop (and others) unlink the original file and rename the new file to the old name. To the user, it looks like you're saving the new file over the old one, but in fact you're creating an entirely new file with-- get this-- a new inode. Bye-bye extended attributes.

So extended attributes are okay and all, but they have serious limitations as well.
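
The save-by-rename behaviour described above is easy to observe without Photoshop. The following Python sketch (function names are mine, and it says nothing about what Photoshop really does internally) compares an in-place overwrite with a write-to-temp-then-rename save and reports the inode number after each. Anything tied to the old inode, extended attributes included, stays behind with the unlinked file unless the application copies it over.

```python
import os
import tempfile


def inode(path: str) -> int:
    return os.stat(path).st_ino


def save_in_place(path: str, data: bytes) -> None:
    """Truncate and rewrite the existing file; the inode is kept."""
    with open(path, "wb") as fh:
        fh.write(data)


def save_via_rename(path: str, data: bytes) -> None:
    """Write a temp file in the same directory, then rename it over the original."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())  # make sure the data is on disk before the swap
        os.replace(tmp, path)      # atomic on POSIX; path now names a new inode
    except Exception:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "something.psd")
        save_in_place(target, b"v1")
        first = inode(target)
        save_in_place(target, b"v2")
        print("in place, same inode:", inode(target) == first)
        save_via_rename(target, b"v3")
        print("via rename, same inode:", inode(target) == first)
```

The rename approach is the safer one against crashes mid-save, which is exactly why so many applications use it; the cost is that per-inode metadata has to be copied explicitly.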
Re:Not just journaling by bogado · 2002-09-17 10:15 · Score: 2

No the kernel could have hooks so that userspace apps and/or libraries could do the processing.

--
[]'s Victor Bogado da Silva Lins
^[:wq
Re:Not just journaling by frozencesium · 2002-09-17 13:49 · Score: 1

it also incorporates guaranteed rate i/o features which, when combined with a decent speed drive, allow you to have a subvolume that has priority on i/o, and can throttle reads/writes to the other sections of the disk...perfect for streaming media servers. at least that's what we did when building a few such servers on some dual processor sgi origin 200's... i love xfs, been using it for a while, so it's nice to see that ability being added to the kernel tree instead of just a patch. -frozen

--
I'm not always the brightest pixel in the stream
Re:Not just journaling by hey · 2002-09-18 03:06 · Score: 1
This is neat as others have said. How about new system calls:
- set_extra(file, stuff_name, stuff, stuff_len)
- get_extra(file, stuff_name, stuff, stuff_size, &stuff_len)
Of course that doesn't handle stuff that's generated from the file like expanding a JPEG.
Re:Not just journaling by IamTheRealMike · 2002-09-23 07:04 · Score: 2

Actually Reiser's system is more general. Read his whitepaper to see in depth why this is. Basically imposing a relational database adds unnecessary structure which is bad (again, read his paper to see why). Reiser's system is more general as it will allow for set theoretic naming eventually, which is more powerful.

IBM jfs already in 2.4 :-) by Anonymous Coward · 2002-09-17 03:34 · Score: 0

read it today in the 2.4.20pre changelog

http://www.kernel.org/pub/linux/kernel/v2.4/testing/patch-2.4.20.log

the time moves by

XFS FAQ by semaj · 2002-09-17 03:35 · Score: 5, Informative

There's an XFS FAQ and a load more information about it on SGI's site - which points out that several large distributions have had XFS support for a while by default.

Still, it's noteworthy that Linus has finally accepted it into his tree...

--
Meep meep

Re:XFS FAQ by Anonymous Coward · 2002-09-17 03:44 · Score: 1, Interesting

One of the big advantages is that this means you can now have XFS alongside other major kernel patches, once 2.6 is out. A thread had come up a while back with someone wanting to run XFS with openMosix, this virtually guarantees that it will work in the future.

That's all, just XFS by bsharitt · 2002-09-17 03:38 · Score: 1

I'm still waiting on the original pre-HFS Mac file system to be incorporated.

Re:That's all, just XFS by AJWM · 2002-09-17 03:51 · Score: 2

In Tux's name, why?

(Actually, I'm assuming that was a joke. I do like Linux's HFS support, since I run a mixed platform household.)

--
-- Alastair
Re:That's all, just XFS by fUllstAr · 2002-09-17 03:55 · Score: 0

You probably mean HFS, the precursor of HFS Plus, right?

How could you boot a kernel on the original Mac 128K/512K anyway, the only ones using MFS?

--
THis is my signature bah: That's ridiculous, someone registered 'fullstar' so I had to choose 'fullstarplus'!!!
Re:That's all, just XFS by bsharitt · 2002-09-17 03:56 · Score: 1

Don't worry it was a joke.
Re:That's all, just XFS by Steffan · 2002-09-17 04:59 · Score: 1

> If you can't beat your computer at chess, try kickboxing.

That sounds funny now, but just wait...

Already supported by major distros by ztc · 2002-09-17 03:38 · Score: 1

Gentoo supports this out-of-the-box (is that an appropriate phrase for gentoo really, heh?.) as well as the other major distros thanks to the aforementioned sgi.com XFS 2.4 kernel patch. It is still great news to see this going into 2.5 -- 2.6 should be an excellent and well-evolved kernel.

Re:Already supported by major distros by uncleFester · 2002-09-17 04:57 · Score: 2

Gentoo supports this out-of-the-box...

No, Gentoo used to support this in the kernel... until 2.4.19-r7. It's been pulled out as of this update. When I asked about this on alt.os.linux.gentoo, I was pointed to this thread on the gentoo-user mailing list. In a nutshell, concerns of data loss when a system powerfails or bad interactions with preempt (which is also included in the Gentoo kernel) are the primary reasons. Luckily, the emerge --update did not toast my old kernel dist folder so i still have it... but you may want to wait.

This pertains to the current stable (1.2?) release.

Perhaps this move in the dev kernel will prompt someone in the Gentoo dist-building realm to re-add this to the kernel.

(i'd post a url to my questions in the ng but i x-no-archived the thread.. silly me)

(and this post is non-pro non-con xfs.. though I've worked with SGI systems, own an Indigo2 and think the fs is pretty solid).

--
-'fester
Re:Already supported by major distros by mmayo · 2002-09-17 10:12 · Score: 1

To be specific, you probably want to read:

http://lists.gentoo.org/pipermail/gentoo-user/2002-September/031224.html

This is where Daniel Robbins (Chief Architect, Gentoo Linux) says:

I don't know of anyone on our development team that has a high opinion of XFS. We were all really excited about it at first, but our opinion soured over time as quite a few developers got bitten by the data loss issue.

The filesystem wars continue! Personally, I've been using ext3 on Red Hat production servers and XFS on my Gentoo desktop and haven't had issue with either. Perhaps now that Linus has merged XFS into his kernel, the data loss issues some report with XFS will finally get resolved. The fact that XFS is extent based and filesystems can be grown (though not shrunk) is what attracts me to it. Great for extending a filesystem as you pop more drives into your RAID array..

mhhh .. use cvs, works cool by Anonymous Coward · 2002-09-17 03:38 · Score: 0

i am using their cvs kernel version, which is always an up to date (with org. pre patches and so on).
if you are not using some additional patches, this is the easy way of life ;-)

http://oss.sgi.com/projects/xfs/cvs_download.html

Excellent by Ctrl-Z · 2002-09-17 03:40 · Score: 2

This is great; more filesystem support is always good in my opinion. Now if we could just get some stable NTFS read/write support I would be set.

--
www.timcoleman.com is a total waste of your time. Never go there.

Re:Excellent by Anonymous+Conrad · 2002-09-17 03:45 · Score: 1

This is great; more filesystem support is always good in my opinion.

Why?

Haven't we already got one optimised for every eventuality?

Are all the filesystems actively maintained and tested?
Re:Excellent by Anonymous Coward · 2002-09-17 04:02 · Score: 0

Yes, but we don't have one optimized for every user!
Re:Excellent by Anonymous Coward · 2002-09-17 04:14 · Score: 0

Good NTFS support would be a definite plus, would be much easier to convert Windows boxes to linux, small-midrange fileservers and the such.

Hell NTFS a la Win2k/XP is actually a very mature filesystem, I wouldn't have a problem running it instead of reiser or ext3.

We also need good user-mode utils to convert from one to another, resize/split/merge partitions, an Open Source version of Partition Magic, so to speak.

Perhaps one exists, it's impossible to follow every lil source project that gets worked on. Anyone?
Re:Excellent by psamuels · 2002-09-17 04:33 · Score: 2, Informative

Now if we could just get some stable NTFS read/write support I would be set.

It's on the way. Read-only NTFS (rather poor in 2.4) has been rewritten and is much improved in 2.5, and a certain subset of read-write (writing new contents to an existing file) is reported to be stable. I haven't tried it. Full read-write may or may not make 2.6.0 but you can be sure it is in active development.

--
"How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
Re:Excellent by 0x0d0a · 2002-09-17 05:08 · Score: 2

Now if we could just get some stable NTFS read/write support...

That will only happen if Microsoft gets a court mandate to open their specifications. MS has far too much economic benefit in deliberately breaking compatibility to not do so. They've changed the ACL portion of the FS in such a way as to break the Linux NTFS driver in every single NT-line kernel release since Linux came out with an NTFS driver.

--
May we never see th
Re:Excellent by Make · 2002-09-17 05:17 · Score: 1

for a free remake of pqmagic, try GNU parted. actually, it doesn't support all the new filesystems, not even reiser (which brings its own tools for that), but it's a good start. works well for me when resizing FAT partitions.
Re:Excellent by psamuels · 2002-09-17 05:30 · Score: 1

Hell NTFS a la Win2k/XP is actually a very mature filesystem, I wouldn't have a problem running it instead of reiser or ext3.

On Linux? I would. The maturity of a filesystem implementation in NT 5 has no correlation to its maturity in Linux. For one thing, it is not based on the same codebase! (Imagine, Microsoft code in the Linux kernel? Actually, yes, but not in ntfs. One phantom mod point for the first poster to know where.) It is not written by the same people. It is not implemented to the same design docs. It is not even implemented to the same specs - the Linux implementation is entirely reverse-engineered.

The only thing in common between the NT 5 version of NTFS and the Linux version of NTFS is the on-disk layout. Which, by the way, isn't even very Unixy. Sure, it has hard links (unlike many legacy filesystems) but I don't think it has symlinks, named pipes, sockets, device specials, or u-g-o file permissions. I'm not sure whether it supports the mtime/atime/ctime tuple, but whatever timestamps it has are probably in 64-bit nanoseconds rather than 32-bit seconds. I think each file object only has a single owner (no "group"), and said owner is expressed as a SID object (128+ bits), rather than a simple 16- or 32-bit integer. Filenames are expressed in little-endian UCS-2 or some such, rather than ASCII or UTF-n. NTFS even supports optional (hey, at least they're optional!) "short filenames" to be DOS-software-compliant, just like VFAT.

Which all goes to say that if/when Linux gets mature NTFS support, the filesystem code will go through a rather complex translation process from on-disk layout to in-core data structures (and vice versa), which cannot be made as efficient as a filesystem designed with Unix semantics in mind. Using NTFS in Linux for anything other than migration, or dual-booting your legacy systems, will never make sense.
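
As a small illustration of the kind of translation meant here, the sketch below converts two NTFS-style values into what Unix code expects. It assumes, from my own understanding rather than from anything in this thread, that NTFS timestamps are 64-bit counts of 100-nanosecond intervals since 1601-01-01 UTC and that filenames are stored as little-endian 16-bit code units. The function names are made up.

```python
from datetime import datetime, timezone

# Seconds between 1601-01-01 (Windows timestamp epoch) and 1970-01-01 (Unix epoch).
EPOCH_DIFF_SECONDS = 11_644_473_600
TICKS_PER_SECOND = 10_000_000  # one tick is 100 nanoseconds


def filetime_to_unix(filetime: int) -> int:
    """Convert a 64-bit 100ns-tick timestamp to whole Unix seconds."""
    return filetime // TICKS_PER_SECOND - EPOCH_DIFF_SECONDS


def decode_name(raw: bytes) -> str:
    """Decode an on-disk little-endian 16-bit filename into a Python string."""
    return raw.decode("utf-16-le")


if __name__ == "__main__":
    ticks = (EPOCH_DIFF_SECONDS + 1_032_220_800) * TICKS_PER_SECOND
    seconds = filetime_to_unix(ticks)
    print(seconds, datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat())
    print(decode_name("something.psd".encode("utf-16-le")))
```

Each conversion is cheap on its own; the argument above is about doing this, plus owner and permission mapping, on every metadata access.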

We also need good user-mode utils to convert from one to another, resize/split/merge partitions

Agreed. Support in the various fdisk utils for MS Dynamic Disks would be a good start....

--
"How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README

Silly question by Mr_Silver · 2002-09-17 03:41 · Score: 5, Interesting

This is a silly question but ...

When I install Linux, and it comes to anything to do with filesystems, I just go with whatever default it gives me.

I suspect I'm not exactly alone.

So ... what compelling reason is there for me to use any other filesystem? Being more stable or better with data loss is nice, but considering I've only ever had this problem once, doesn't mean that i'll leap up and down going "oo oo! got to have blahFS!" any time soon.

To give you an example, FAT16 to FAT32 was the fact you could have larger partitions. FAT32 to NTFS was because of permissions and security.

But whatever we have now (can't remember, i barely look) to XFS? What *compelling* absolutely-must-have reason do I have to go change from whatever my installer suggests putting on for me?

Or should I just stick with what the installer suggests from now until eternity?

--
Avantslash - View Slashdot cleanly on your mobile phone.

Re:Silly question by kelv · 2002-09-17 03:48 · Score: 2, Interesting

As a desktop user you might be able to get away with any old filesystem......

However, if you have a server that has to have high performance and has data that you *really* care about then one of ReiserFS, XFS, EXT3, etc... becomes a *really* good idea.
Re:Silly question by MasterD · 2002-09-17 03:49 · Score: 5, Informative

XFS supports ACL's (or access control lists) which are much better than standard UNIX permissions.

XFS is an extent based filesystem which means that you don't end up wasting tons of space having to allocate a 4K block for every small file. And you don't need to jump through tons of indirect blocks to get large files.

XFS allocates inodes on the fly so it grows with what data you put on there. Once again, not wasting space up front. And it sticks the inode near the file itself so the head does not have to move far on the hard drive.

XFS supports extended attributes which can be used for all kinds of extensions later on.

XFS has been around since 1994 and is the most mature of the journalling filesystems.

And there are many other reasons that I cannot think of right now.
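
To illustrate the extent point, here is a toy Python sketch (not XFS's actual on-disk format) that turns a per-block list of disk addresses, the way a block-mapped filesystem records them, into (start, length) extents. A large contiguous file collapses to a single entry, which is why no chain of indirect blocks is needed to find its data.

```python
from typing import List, Tuple


def blocks_to_extents(blocks: List[int]) -> List[Tuple[int, int]]:
    """Collapse a list of block numbers into (start_block, length) extents."""
    extents: List[Tuple[int, int]] = []
    for block in blocks:
        if extents and extents[-1][0] + extents[-1][1] == block:
            # Block directly follows the last extent: just lengthen it.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            extents.append((block, 1))
    return extents


if __name__ == "__main__":
    fragmented = [100, 101, 102, 103, 500, 501]
    print(blocks_to_extents(fragmented))
    contiguous = list(range(1000, 2000))
    print(len(contiguous), "block pointers vs", len(blocks_to_extents(contiguous)), "extent")
```

The saving shrinks as a file gets more fragmented, since every discontinuity starts a new extent.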
Re:Silly question by fruey · 2002-09-17 03:50 · Score: 3, Informative

Performance. Different systems are going to take more or less overhead depending on the task. Some daemons might write a lot of data to logs, you want this to be done asynchronously, you may not need the data so badly, you don't need journalling perhaps. (so use ext2??)
Or you have a proxy, you don't care if suddenly your cached data is lost, it will soon be refilled, it's not important data, you want performance without too much security (reiserfs)?
In fact each filesystem has inherent limits on inodes, filenames, permissions, etc... so you go with any that has a minimum for each thing you need. Journalling you don't really need unless you want to be able to step backwards or repair your filesystem in more interesting ways...

--
Conversion Rate Optimisation French / English consultant
Re:Silly question by felicity · 2002-09-17 03:54 · Score: 2, Informative

XFS also allows you to grow the filesystem live (ie: mounted). This is great for those of us who use it in conjunction with a volume manager (I use LVM). lvextend to enlarge the volume, xfs_growfs to enlarge the filesystem. No downtime required. :)

It's also a 64-bit filesystem, so you could have extremely large files and filesystems, although my understanding is that the Linux VFS system can't handle the large sizes right now (1Tb max filesystem for instance). XFS is the standard filesystem for SGI's IRIX which doesn't have the restrictions. :)
Re:Silly question by blakestah · 2002-09-17 03:58 · Score: 3, Informative

1) Backup strategies. Versions of dump are available for ext2/ext3 and xfs, but not for ReiserFS (I don't know about JFS). (I don't mean to start a page cache/buffer cache debate).

2) Journalled file systems mean fast re-boots on power outages

3) Speed. This depends on your usage. A huge mail spool machine may use ReiserFS on the mail spool. For most people it is a wash.

4) Ext3 can be remounted as ext2, and really good file system checking tools exist for ext2/3.

Mostly, though, you CAN just stick with whatever the default suggests.
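
Point 2 is easier to see with a toy model. The Python sketch below is a deliberately simplified write-ahead journal; the class and method names are invented for illustration and it does not mirror any real filesystem's on-disk format. The idea it shows is that recovery only replays what is still in the journal, so the cost depends on the amount of pending work and not on the size of the filesystem.

```python
class ToyJournaledFS:
    """Toy model: metadata changes are logged first, applied later."""

    def __init__(self):
        self.metadata = {}   # stands in for the on-disk metadata
        self.journal = []    # logged changes not yet applied to metadata

    def log_change(self, name, size):
        # Record the intended change in the journal before touching metadata.
        self.journal.append((name, size))

    def checkpoint(self):
        # Apply every logged change to the metadata, then empty the journal.
        for name, size in self.journal:
            self.metadata[name] = size
        self.journal.clear()

    def recover(self):
        # After a crash, only the un-checkpointed entries need replaying.
        replayed = len(self.journal)
        self.checkpoint()
        return replayed


fs = ToyJournaledFS()
for i in range(100000):
    fs.log_change("file%d" % i, 4096)
fs.checkpoint()                      # 100000 files safely on "disk"
fs.log_change("mail.spool", 123)
fs.log_change("new.log", 0)
# pretend the power goes out here
print(fs.recover())                  # 2
```

A full fsck, by contrast, has to walk all of the metadata, which is why it scales with the size of the partition.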
Re:Silly question by Anonymous Coward · 2002-09-17 04:01 · Score: 0

Dump is broken and is no longer supported, even on ext2/ext3.
Re:Silly question by AJWM · 2002-09-17 04:03 · Score: 2

Flexibility, performance, optimization of whatever characteristics you want to optimize.

There may be no compelling reason for you to change from the defaults (which, presumably, were chosen as defaults because they'd satisfy most people). But for someone looking to optimize for a particular application, it's one more variable they can tweak (different filesystems each having their own strengths and weaknesses).

For example, someone doing desktop video editing (really big contiguous files, high sustained data rate needed, etc) might want a different filesystem than someone running a highly active database server (lots of small table changes scattered across the filesystem).

--
-- Alastair
Re:Silly question by blakestah · 2002-09-17 04:09 · Score: 2

See this post to the dump mailing list
Re:Silly question by mla_anderson · 2002-09-17 04:09 · Score: 1

I went to Ext3 on the desktop for one reason: my kids had the bad habit of hitting the power button on the computer because it made lights blink and noises.

This usually happened in the middle of the day and I would have to walk my wife through fsck over the phone. With Ext3 it just boots back up again...and now that I did it the kids lost interest in the power button

--
Sig is on vacation
Re:Silly question by Anonymous Coward · 2002-09-17 04:10 · Score: 0

> FAT32 to NTFS was because of permissions and
> security.

Not to mention the fact that as the file count increases, so does the size of the file allocation table. NTFS went to the linked method, IIRC.
Re:Silly question by Anonymous Coward · 2002-09-17 04:19 · Score: 0

you obviously don't run servers. you dual boot w2k and linux so you can be cool. if you ran servers, you'd know that you should use freebsd with soft updates.
Re:Silly question by rseuhs · 2002-09-17 04:20 · Score: 5, Insightful

XFS supports ACL's (or access control lists) which are much better than standard UNIX permissions.
Actually I think ACLs are the reason why everybody is running as Administrator in Windows. They are just too damn complicated.
The Unix-permissions are simple. You can understand the concept of user-group-all in a few minutes and there are only 2 commands to remember (chmod, chown).
Also, Unix-permissions have so far fit with everything I needed and in the rare case you really need something special, there is also sudo.
I think ACLs are only useful for a tiny minority, IMO. I certainly don't need it.
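
For anyone who hasn't looked at the user-group-all model closely, here is a minimal Python sketch of it. The helper names are hypothetical; only the `stat` module calls are real. It models the read check in the simplified way described above (owner class first, then group, then everyone else) and ignores root and other special cases.

```python
import stat


def describe(mode):
    """Render permission bits (e.g. 0o640) the way ls -l shows a regular file."""
    return stat.filemode(stat.S_IFREG | mode)


def can_read(mode, is_owner, in_group):
    """Simplified check: exactly one of the three classes applies."""
    if is_owner:
        return bool(mode & stat.S_IRUSR)
    if in_group:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


print(describe(0o640))                                   # -rw-r-----
print(can_read(0o640, is_owner=False, in_group=True))    # True
print(can_read(0o640, is_owner=False, in_group=False))   # False
```

Mode 0o640 is what `chmod 640 file` sets: read/write for the owner, read for the group, nothing for everyone else.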
Re:Silly question by iabervon · 2002-09-17 04:25 · Score: 2

If you don't know, the installer had better be suggesting something appropriate, or you're not using a good distribution.

There's no reason to switch from (ext3?) to XFS. But it's quite possible that the next time you install, if you're formatting a new drive, it will suggest XFS. Of course, converting an existing disk is enough of a pain that you probably don't want to do it.
Re:Silly question by Anonymous Coward · 2002-09-17 04:38 · Score: 0

With Ext3 it just boots back up again...and now that I did it the kids lost interest in the power button
I have found that harsh beatings work just as well as a journalling filesystem, in this respect.
Re:Silly question by Yokaze · 2002-09-17 04:59 · Score: 2

> Actually I think ACLs are the reason why everybody is running as Administrator in Windows. They are just too damn complicated.

The reason is, that they're accustomed to the DOS based Windows-Series.
For some people, the concept of a superuser and a normal user seems to be too complicated.

>The Unix-permissions are simple.

Great... Now how does a small group of students get read write rights on a set of files/directories?

>, there is also sudo.

It's just that you are switching into superuser-mode for every little thing a little out of box.
Since you're complaining about people running Windows as Administrator, you certainly are aware of the lack of style in this.
Not to mention, that it is out of question for every larger system (practically every system, which exists outside ones home).

--
"Between strong and weak, between rich and poor [...], it is freedom which oppresses and the law which sets free"
Re:Silly question by Derek+S · 2002-09-17 05:01 · Score: 1

ACLs are most useful to users who are not administrators. For instance, they allow an end-user to grant additional permissions to another department, or a user in another department. This can be accomplished with ugo permissions, but it requires users to submit a request to the IT department (thereby ensuring that it will never get done), and it clutters the group file with lots of random groups (especially bad because you can't nest groups).

This argument, of course, only applies to home directories and common fileshares. I do agree that traditional Unix permissions are much more maintainable on filesystems that house the operating system and applications. As usual, what's good for the sysadmin is often not what's good for the user.
Re:Silly question by Pretender · 2002-09-17 05:40 · Score: 2

I think ACLs are only useful for a tiny minority, IMO.
One thing they are useful for is if you are replacing NT file servers with Samba servers. If you don't use ACLs (either XFS or the EA/ACL patch with ext2/3) then your Windows users who connect to your Samba shares don't get all the fine-grained permission control to which they are accustomed; Samba fakes it. Combine this with winbind and you end up with almost a perfect drop-in replacement for your NT file servers, and you don't have to manage those users separately. Sah-weet.
Re:Silly question by psamuels · 2002-09-17 05:45 · Score: 1

XFS also allows you to grow the filesystem live (ie: mounted).

FWIW, so does IBM JFS. And ext2, with Andreas's ext2online patch. Perhaps ext2online can be ported to ext3 some day..

--
"How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
Re:Silly question by josh+crawley · 2002-09-17 05:56 · Score: 1

---"Great... Now how does a small group of students get read write rights on a set of files/directories?"

Actually, I've been waiting for that one a long time. I've seen no unix that allows you to give users a limited set of group creation (group permissions). If I had 25 users and I was the teacher, I'd split the class into 5's. How exactly do I allow users to create groups?

In Windows, all I have to do is make a folder, take full control, and give my members read/write on the dir. If I want certain persistent files, I take away their permissions on the file specifically.

And Yes, I am a Linux/Free/SCO user. It just seems though, that WindowsNT ACLs are superb, and (RWXR-XR--) type suck
Re:Silly question by Jeremy+Allison+-+Sam · 2002-09-17 06:36 · Score: 5, Interesting

POSIX ACLs aren't much more complex than
standard UNIX permissions and allow you to do
the 2 common cases :

1). Group finance has access + user Jill
2). Group finance has access but not user fred.

But then again I wrote the Samba POSIX ACL
code so I'm biased :-).
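
The two cases above can be sketched as a small Python model of the POSIX ACL access check. This is an illustration under assumptions: the dictionary layout and the function name are invented here, and the check is a simplified reading of the order described in acl(5) (owner, then named users, then groups, then other, with the mask limiting named users and groups). It is not Samba's or the kernel's code.

```python
def acl_allows(acl, user, groups, wanted):
    """Return True if `user` (member of `groups`) gets all `wanted` permissions."""
    wanted = set(wanted)
    mask = set(acl["mask"])
    if user == acl["owner"]:
        return wanted <= set(acl["owner_perms"])
    if user in acl["users"]:
        # A named user entry wins over any group entry, limited by the mask.
        return wanted <= (set(acl["users"][user]) & mask)
    # Owning group plus named groups; any one matching entry may grant access.
    group_entries = [(acl["group"], acl["group_perms"])] + list(acl["groups"].items())
    matching = [set(perms) & mask for name, perms in group_entries if name in groups]
    if matching:
        return any(wanted <= perms for perms in matching)
    return wanted <= set(acl["other"])


base = {"owner": "root", "owner_perms": "rw", "group": "finance",
        "group_perms": "rw", "groups": {}, "mask": "rw", "other": ""}

# Case 1: group finance has access + user jill (roughly: setfacl -m u:jill:rw file)
case1 = dict(base, users={"jill": "rw"})
print(acl_allows(case1, "jill", {"staff"}, "rw"))    # True

# Case 2: group finance has access but not user fred (roughly: setfacl -m u:fred:--- file)
case2 = dict(base, users={"fred": ""})
print(acl_allows(case2, "fred", {"finance"}, "r"))   # False
print(acl_allows(case2, "bob", {"finance"}, "r"))    # True
```

The second case works because the named user entry for fred is consulted before any group entry, so his finance membership never comes into play.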

Windows ACLs are a complete *nightmare* in
comparison. I still don't understand why Sun
added an incompatible variant of Windows ACLs
to NFSv4 (ie. it's close, but not the same as
the real Windows ACLs). The problem is they based
the spec. on the Microsoft documentation of how
the ACLs work. Big mistake.... :-)

Regards,

Jeremy Allison,
Samba Team.
Re:Silly question by Idaho · 2002-09-17 06:51 · Score: 2

But whatever we have now (can't remember, i barely look) to XFS? What *compelling* absolutely-must-have reason do I have to go change from whatever my installer suggests putting on for me?

None.

But there are a few reasons if you really care about your system or run a server:

1. Faster, especially when handling large dirs
2. More reliable than ext2, arguably also better than ext3 (never really compared them myself)

--
Every expression is true, for a given value of 'true'
Re:Silly question by kchayer · 2002-09-17 07:16 · Score: 2

Actually I think ACLs are the reason why everybody is running as Administrator in Windows. They are just too damn complicated.
Properly-implemented ACLs are awesome--anyone who has ever managed Novell's filesystem can attest to that. Giving access to some users while not others, working with groups, etc, is a dream. Not only that, but you don't see what you don't have access to. I'm not necessarily in favor of that all around, but that was pretty slick and can make life easier for users who don't need to see more than what they use.
Of course, the big reason people run as Administrator in Windows is because Uncle Joe's Marmot Chasing Brick Breaker game will say "You must be logged in with administrative privileges to install this game." Norton Utilities? I understand that. Some shareware game? WHY?!?

--

"I say consider this day seized!" -Hobbes
"Tomorrow we'll seize the day and throttle it!" -Calvin
Re:Silly question by Anonymous Coward · 2002-09-17 07:35 · Score: 0

So ... what compelling reason is there for me to use any other filesystem?

You don't have to watch your computer shit itself fscking your 40 gig partition for 20 minutes if you use XFS instead of ext2.

That's the biggest reason. (My power goes out, my voodoo3 overheats when I play quake3 for too many hours at once, etc.) Those fsck's suck balls. XFS is nearly instantaneously repaired (I'm talking half a second).
Re:Silly question by Anonymous Coward · 2002-09-17 15:39 · Score: 0

Are you trying to tell us that you can't add these users to a special group that has these permissions and implement this solution? That you can't script an almost automated solution to this problem?
Sounds like a personal problem friend.
Re:Silly question by Yokaze · 2002-09-17 21:02 · Score: 2

From "man setfacl(1)":

Granting an additional user read access
setfacl -m u:lisa:r file

where lisa is an arbitrary existing username.
Which is of course, as others noted, terribly more complicated than the current chmod command.
To achieve a behaviour similar to the "inherit permissions" feature, you have to make the changes on the default acl (since they are inherited by newly created files) (according to man ACL(5) "OBJECT CREATION AND DEFAULT ACLs")

The procedure you were asking for could look this way (as a bash-script)
echo "Enter users in group (separated by space, terminated by newline):"
read
for USER in $REPLY; do setfacl -m d:u:$USER:rwx .; done
The added "d:" is the short form of the "default:" option, which doesn't seem to be POSIX conformant.

I have to admit that I'm writing this only from theoretical knowledge, since I've no practical experience with ACLs. So any practical insight and corrections would be welcomed.

--
"Between strong and weak, between rich and poor [...], it is freedom which oppresses and the law which sets free"
Re:Silly question by Libor+Vanek · 2002-09-17 21:30 · Score: 1

1, there is xfsdump - but as Linus posted once - dump is doomed SW and could sometimes get broken under Linux!!!!

Red Hat has XFS... by NewbieSpaz · 2002-09-17 03:41 · Score: 1, Insightful

For any of you Mandrake fans out there who like to bash Red Hat, and mention that Mandrake has had XFS file system included, while Red Hat does not, you would be wrong. While Red Hat does not officially support it, if at the installer's 'boot:' prompt you type in 'linux xfs', it works great. I've used it on a few systems with no problems.

--
------
Random, useless fact: I type in startx entirely with my left hand.

You missed out the flamewar on the mailing list! by Anonymous Coward · 2002-09-17 03:42 · Score: 2, Informative

For those of you who don't subscribe to the Linux kernel development mailing list, it was absolutely not a case of XFS just being accepted, there was a HUGE flamewar about it, which only ended a few days ago.

Mailing list archive

Just search in the page for XFS and you'll find the thread.

Questions... by pubjames · 2002-09-17 03:43 · Score: 3, Interesting

When is Linux 2.6 likely to be released? I know that there is no fixed date, but what are the criteria?

My second question... Does it really matter when the 'official' release comes out, when distribution makers "roll-their-own" anyway?

Sorry if these sound like dumb questions to some of you, but I'd be interested to find out.

Re:Questions... by bsharitt · 2002-09-17 03:53 · Score: 3, Funny

Most distributions should have 2.6 a couple months after it is released, and Debian will have it by 2012.
Re:Questions... by Anonymous Coward · 2002-09-17 04:11 · Score: 0

Really, there is no way to predict a 2.6 release date - if a problem is discovered, it could be delayed by months.

The best thing to do is to read the summaries of the discussion on kernel-dev, (or the whole thing if you have the time to read 300 mails a day), and see what people are talking about.

Of course, near to the 2.6 release date, 2.5 will be pretty stable anyway, so you *could* use a late 2.5.x kernel, and not have major problems, if you're lucky. It depends on how critical your system is.
Re:Questions... by mr_stark · 2002-09-17 04:12 · Score: 1

Linus has decreed that there will be a feature freeze at the end of October (last para). The stable kernel is usually released a couple of months after the feature freeze (bugs permitting).

--
I can't think of anything witty right now
Re:Questions... by iabervon · 2002-09-17 04:15 · Score: 2

Halloween is the deadline for development. After that, there will be a while tracking down bugs, and then probably a release in January or thereabouts. Linus is planning on turning things over to someone who's a better release manager (Marcelo, I think), which means that the release is likely not to drag on, and likely to actually be stable when it happens.

I suspect that a number of distributions will include 2.6 pretty quickly this time, because it'll be handled by someone who is good at stability. Also, the distribution makers are actually pretty close to the 'official' process, and they're really in the best position to judge stability on a wide variety of systems. By the time 2.6 is declared stable, most of the distribution makers will be comfortable with it, both in the official version and with their patches.
Re:Questions... by AJWM · 2002-09-17 04:17 · Score: 2

IIRC, feature freeze date for 2.5 is October 31. Figure a few more months of shakeouts and bugfixes after that, we might see 2.6 sometime in first quarter of 2003.

There is a list around of the desired features for 2.6 that was put together at the Linux Kernel Summit. A very hasty web search turns up this list, which doesn't seem to mention things already merged like the block-IO stuff.

--
-- Alastair
Re:Questions... by kubrick · 2002-09-17 04:32 · Score: 2

what are the criteria?

"When it's ready." :)

Seriously, they'll release when the new features and changes they've made are stable and tested enough... and the release of a v2.6 is important, as it means it will be more widely used, more bugs found, etc. Most distribution makers wouldn't ship a newly 'stabilised' kernel, e.g. 2.6.0, but would wait until it had matured a little...

--
deus does not exist but if he does
Re:Questions... by psamuels · 2002-09-17 04:49 · Score: 3, Informative

The stable kernel is usually released a couple of months after the feature freeze (bugs permitting).

+1, Funny. I think you mean after the code freeze, which usually happens a month later, well, two, three, ok, six months later. You also forgot to mention that Linus usually has multiple freezes, and the one on 31 Oct is only the first. With each successive freeze he puts on a more threatening tone, crying woe unto them who would dare tempt him to thaw the kernel again. Eventually the first code freeze happens, then maybe one or two more of those....

Even odds we get a 2.6.0 by June.

--
"How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
Re:Questions... by onesandzeros · 2002-09-17 06:25 · Score: 1

Maybe I shouldn't bury this so far down in a thread, but I don't think it's a bad thing that 2.6 might be a long way off (and I know that you weren't necessarily implying it either). Although it seems very fashionable for people to complain about 2.4, I think it's pretty good. My current uptime is 10 days, good considering the idiotic things I do to this box. Personally, I'd like to see all those features listed at http://www.kernelnewbies.org/status/latest.html get into 2.6. And, I wouldn't mind seeing the freeze dates and ultimate release date pushed back far enough to make these features stable realities.
Re:Questions... by Arandir · 2002-09-17 06:47 · Score: 1

Does it really matter when the 'official' release comes out, when distribution makers "roll-their-own" anyway?

I'm still wondering why distros feel the need to roll their own to begin with. Bug fixes are one thing, but backports in the default kernel borders on irresponsibility. Provide alternate kernels with all the bleeding edge stuff if you wish, but when a distro says it has linux-x.y.z, then I expect to get linux-x.y.z.

--
A Government Is a Body of People, Usually Notably Ungoverned

My understanding by 0x0d0a · 2002-09-17 03:44 · Score: 3, Informative

...is that the breakdown goes something like this:

ext3:
* can be told to journal everything, including data (not just metadata) -- most theoretical reliability.
* is backwards compatible with ext2

xfs:
* tweaked for streaming large files to/from disk -- probably best at sequential reads/writes.

reiserfs:
* best performance with many, many files in a single directory.
* Can save space on very small files with -tail option

jfs:
* really don't know. :-)

--
May we never see th

Re:My understanding by 4of12 · 2002-09-17 03:51 · Score: 4, Interesting
xfs:
* tweaked for streaming large files to/from disk
-- probably best at sequential reads/writes.

Hm...would that imply that XFS would be say a really good candidate FS for building video streaming devices?

Seems like it might fit well from the perspective of:
1. high speed read write (good enough for 1080i?)
2. quick reboots due to journaling (essential for consumer electronics devices)
3. don't have a cow if there are a few bit errors in the stream
--
"Provided by the management for your protection."
Re:My understanding by +killraven · 2002-09-17 04:01 · Score: 1

Hm...would that imply that XFS would be say a really good candidate FS for building video streaming devices?

Yes. Streaming video has always been one of SGI's strong points. SGI have streaming video solutions that handle several Gb/s IO, although for those kinds of IO's you obviously need the hardware and server software as well as the filesystem. But you can rest assured that the fs won't be your bottleneck.
Re:My understanding by SquadBoy · 2002-09-17 04:02 · Score: 1

Ya think just maybe. Considering that it is from SGI and all. Yes yes as a matter of fact doing huge graphics is one of the things it was built to do. :)

--

Cypherpunks: Civil Liberty Through Complex Mathematics. Those who live by the sword die by the arrow.
Re:My understanding by sql*kitten · 2002-09-17 04:07 · Score: 2

Hm...would that imply that XFS would be say a really good candidate FS for building video streaming devices?

Well, yes. Which is one of the things SGI designed it for in the first place. Have you only just realized?
Re:My understanding by dmelomed · 2002-09-17 04:14 · Score: 2, Informative

Just to note: ReiserFS is also inodeless. This means you can't run out of them, as far as I can imagine.
Re:My understanding by Booker · 2002-09-17 04:25 · Score: 2

Sure, maybe use it in a Echostar DishPVR 721? :)
Re:My understanding by Anonymous Coward · 2002-09-17 04:49 · Score: 0

It's very hard to imagine modern fs without inodes. But if it can allocate inodes on-demand, that's great.
Re:My understanding by axxackall · 2002-09-17 07:53 · Score: 1

let me rephrase what you said and what I was reading somewhere else.
XFS and ReiserFS are not general filesystems - they should be used only for very specific applications.
EXT3 is reliable, fast enough and compatible with EXT2, but it doesn't give a transaction API to applications.
JFS supports ACID transactions, but it's very new and still not very reliable.
Personally, I use EXT3 and wait for JFS. Unless/until someone ports/contributes the code of the ACID-type FS from IBM OS/400 :)

--

Less is more !
Re:My understanding by jgarzik · 2002-09-17 08:30 · Score: 3, Informative

If reiserfs was inode-less, it would not work with Linux.
Even NTFS has inodes, they simply call them "MFT records."
Re:My understanding by Tet · 2002-09-17 08:38 · Score: 2

xfs:
* tweaked for streaming large files to/from disk -- probably best at sequential reads/writes.
Is this true? I thought the reason XFS did so well for streaming video on IRIX was GRIO, and that hasn't been ported to Linux... at least, not yet.

--
"The invisible and the non-existent look very much alike." -- Delos B. McKown
Re:My understanding by Anonymous Coward · 2002-09-17 08:40 · Score: 0

filesystems are not drive controllers. if you are streaming video, you probably only want one file per stream.
Re:My understanding by foobar104 · 2002-09-17 08:51 · Score: 2

1. high speed read write (good enough for 1080i?)

Considerably better than that, my friend. Depending on how you pack your pixels, 1080i requires between about 180 and about 260 MB/s. I can't find the source right now, but a few years ago SGI announced that they had done over 3 GB/s using a big XFS filesystem on IRIX.

With XFS, the filesystem is definitely not your bottleneck. Using just 8 fibre channel drives, I regularly saturate a gigabit FC loop with data (on the order of 98 MB/s, not counting overhead).
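
The 180 to 260 MB/s range is easy to reproduce with a bit of arithmetic. The Python sketch below assumes uncompressed 1920x1080 at 30 frames per second (60 interlaced fields), packed at 3 or 4 bytes per pixel, with MB meaning 10^6 bytes. Those packing choices are my assumptions, since the post does not say which formats it has in mind, and the function name is made up.

```python
def video_bytes_per_second(width, height, frames_per_second, bytes_per_pixel):
    """Raw data rate of an uncompressed video stream."""
    return width * height * frames_per_second * bytes_per_pixel


for bytes_per_pixel in (3, 4):
    rate = video_bytes_per_second(1920, 1080, 30, bytes_per_pixel)
    print(bytes_per_pixel, round(rate / 1e6, 1))
# prints:
# 3 186.6
# 4 248.8
```

Both figures sit inside the quoted range, and either one is well above the roughly 98 MB/s of a single gigabit FC loop.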
Re:My understanding by dmelomed · 2002-09-17 10:55 · Score: 1

AFAIK in ReiserFS inodes are not used the way they are in traditional filesystems. You certainly need to present the inode layer to the OS, but they use balanced trees for block allocation. AFAIK you do not end up with a fixed number of "inodes" after ReiserFS is created.
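
If you want to see whether a given filesystem reports a fixed inode budget, a quick Python check looks something like this. `os.statvfs` is a real call on Unix-like systems; the helper name is made up, and what a filesystem without a fixed inode table reports in these fields varies, so treat the numbers as a hint rather than proof.

```python
import os


def inode_usage(path="/"):
    """Return (total inodes, free inodes) as reported by statvfs for `path`."""
    st = os.statvfs(path)
    return st.f_files, st.f_ffree


total, free = inode_usage("/")
print("inodes: %d total, %d free" % (total, free))
```

On a filesystem with a fixed inode table the total is set at mkfs time, and running out of free inodes means no new files even when there is plenty of free space.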
Re:My understanding by Michael+Wardle · 2002-09-17 11:44 · Score: 1

XFS uses extents rather than blocks, meaning that contiguous data is treated as one logical unit rather than a sequence of separate blocks. This is said to improve performance for sequential access.

As extents are a fundamental component of XFS, the Linux version of XFS also uses extents.
Re:My understanding by Tet · 2002-09-17 20:41 · Score: 2

XFS uses extents rather than blocks, meaning that contiguous data is treated as one logical unit rather than a sequence of separate blocks. This is said to improve performance for sequential access.
Yes, extents will increase sequential read performance slightly, but it's GRIO that gives *guaranteed* performance, which is the desirable feature for streaming. Oh, and extent based filesystems still use blocks the same as any other filesystem. The block allocation policy is the only difference.

--
"The invisible and the non-existent look very much alike." -- Delos B. McKown

Yes! by zentec · 2002-09-17 03:44 · Score: 3, Informative

Despite being a little more resource intensive than ext3, XFS has to be one of the better file systems available. I've used it (obviously) on SGI's and it's been outstanding, and opted to use it before ext3, JFS and Reiserfs (although I believe Reiserfs is just as nifty).

Having it accepted into the kernel makes upgrades a world easier, and hopefully I'll be able to move away from SGI's modified Red Hat installation. Although, I doubt Red Hat will support it out of the box.

The other issue that needs fixing with XFS is the lack of an emergency boot disk. XFS enabled kernels are huge, and that creates a slight problem when booting from floppy.

Re:New file system-attribute. by Anonymous Coward · 2002-09-17 03:48 · Score: 3, Funny

"The round file gets all my bills. The manila one gets all my pay stubs. It works out ok. " ...and the IRS gets everything else. Time to use that 'hidden' attribute.

here's an interesting read by someonehasmyname · 2002-09-17 03:54 · Score: 4, Informative

this pdf shows how journaling file systems compare to non-journaling systems like ffs or freebsd's soft updates.

--
Common sense is not so common.

Re:here's an interesting read by archen · 2002-09-17 09:50 · Score: 1

Interesting that they used a SCSI harddrive for testing. I noticed reading the FreeBSD documentation that they have problems with IDE drives because they tend to lie about what they're doing. Which brings up two questions. Do SCSI hard drives have this problem, and does Linux have similar problems with IDE drives?

2.6 kernel goodies by 0x0d0a · 2002-09-17 03:55 · Score: 4, Interesting

2.6 has got me more excited than recent minor releases. Some of the things that look cool:

* ALSA support. ALSA is a pain to keep patching your kernel with every redownload. ALSA is a Good Thing, if a pain in the butt to configure. My guess is that there will be decent front ends on top of the thing when distros start shipping 2.6.
* Batch priority/boosted effect of nice levels. I've always felt that "nicing" something didn't have enough effect -- nicing something by one level is almost unnoticeable. 2.6 boosts this change. It also introduces batch priority, where a process gets *no* CPU time if there is *any* non-batch process in the runnable queue. Very sexy.
* Low, low latency. Just as 2.4 emphasized good multiproc support, 2.6 is emphasizing low latency. Preemptive kernel, lots of disabled-interrupt time being reduced (especially the godawful framebuffer console), etc, etc. This is top-notch for both I/O performance and multimedia. Linux kernel 2.6 is supposed to beat any current release of Windows in audio latency when released.

The only thing that I really wish Linux had was a prioritized disk scheduler. Linux can prioritize network traffic. It can prioritize processes. It just can't do the same with disk I/O. This is a shame, since I want my MP3 player not to skip when reading MP3s/paging, followed by X getting next highest priority when paging (so that the UI doesn't freeze up for long when paging something back in), and Linux just doesn't yet have the functionality. Currently, you can have a nice 20 process that's busy untarring a large tarball...and all your paged out processes will be blocked, waiting for this stupid tarball to finish.

--
May we never see th

Re:2.6 kernel goodies by paulbd · 2002-09-17 04:11 · Score: 3, Informative

the skipping in your mp3 player has nothing to do with disk i/o. it has to do with scheduling latency. that is, unless your mp3 player has been poorly designed, which many of them have been.

also, 2.5/2.6 is still missing the better patches for low latency (from andrew morton), and so its performance is still not as good as it could be.

2.6 doesn't beat windows at audio latency when using WDM drivers for windows. it (along with 2.2 and 2.4) beat windows with MME drivers. the WDM audio driver model is very fast, and windows has always done a better job of handling scheduling latency than linux (other than with andrew's patches). in 2.4 there are still places in a mainstream kernel that will stall the entire box for up to 1/10 second.
Re:2.6 kernel goodies by LordNimon · 2002-09-17 04:13 · Score: 1

It also introduces batch priority, where a process gets *no* CPU time if there is *any* non-batch process in the runnable queue.
Even if the thread is starved? That's pretty radical. I can't imagine a use for that.

--
And the men who hold high places must be the ones who start
To mold a new reality... closer to the heart
Re:2.6 kernel goodies by ansible · 2002-09-17 04:21 · Score: 2

I don't run stuff like SETI@Home, but lots of people do. Processing blocks probably shouldn't have much priority when you're doing stuff on the desktop.
Re:2.6 kernel goodies by 0x0d0a · 2002-09-17 04:27 · Score: 3, Interesting

The skipping in your mp3 player has nothing to do with disk i/o. It has to do with scheduling latency.

Not true. I've done quite a bit of poking around this issue. I have plenty of spare CPU time, and I'm not using a sound server or similar. The problem comes when reading an mp3 from disk (and no, this is not a "DMA/unmasked interrupts is not on" issue) and other *heavy* sequential disk i/o is being done by another piece of software (because of the amount of data, tar xzvf is frequently the culprit). Linux heavily weights disk scheduling towards overall performance, not fairness. Besides, this isn't mp3-specific -- other software does it too. Try cat /dev/zero > foo and then trying to ls a directory. Extremely long delay. Heck, try doing said operation when playing an mp3 and you'll see the skipping I'm talking about. Seriously, try it -- it takes about ten seconds to try.
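
For anyone who would rather measure than eyeball it, here is a rough Python sketch of the same experiment: one thread writes zeros sequentially while the main thread times directory listings. The function name and parameters are made up, the write is capped so it cannot fill the disk, and a listing that is already cached may never touch the disk at all, so this is only a starting point for poking at the problem.

```python
import os
import tempfile
import threading
import time


def max_listdir_latency(directory, write_mb=64, samples=20):
    """Worst os.listdir() time (seconds) seen while a thread writes write_mb MiB."""
    stop = threading.Event()

    def writer():
        chunk = b"\0" * (1 << 20)            # 1 MiB of zeros, like /dev/zero
        with open(os.path.join(directory, "foo"), "wb") as out:
            for _ in range(write_mb):
                if stop.is_set():
                    break
                out.write(chunk)
            out.flush()
            os.fsync(out.fileno())           # force the data out to disk

    thread = threading.Thread(target=writer)
    thread.start()
    worst = 0.0
    for _ in range(samples):
        start = time.perf_counter()
        os.listdir(directory)
        worst = max(worst, time.perf_counter() - start)
        time.sleep(0.01)
    stop.set()
    thread.join()
    return worst


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print("worst listing latency: %.4f s" % max_listdir_latency(scratch))
```

Pointing it at a directory on the disk you care about, with a larger write_mb, gets closer to the cat /dev/zero > foo case.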

I remember seeing benchmarks of various Windows audio latencies and Linux latencies, and at least the low-latency people had Linux at least a couple of ms below Windows. I wasn't aware that only some of these patches were going in, though, so that could be the difference between what we're talking about.

--
May we never see th
Re:2.6 kernel goodies by Lucky+Kevin · 2002-09-17 05:13 · Score: 2, Informative

* ALSA support. ALSA is a pain to keep patching your kernel with every redownload. ALSA is a Good Thing, if a pain in the butt to configure. My guess is that there will be decent front ends on top of the thing when distros start shipping 2.6.
From the ALSA site:
"2002-02-13 ALSA has been integrated to the official Linux 2.5 tree! The initial merge is in patch-2.5.5-pre1."
Yippiee! Great sound, here we come!

--
Kevin
"It's not the cough that carries you off, it's the coffin they carry you off in" O. Nash
Re:2.6 kernel goodies by slashdot_commentator · 2002-09-17 05:25 · Score: 2

The only thing that I really wish Linux had was a prioritized disk scheduler. Linux can prioritize network traffic. It can prioritize processes. It just can't do the same with disk I/O.

First things first, one step at a time. Recently, it appears that the linux kernel development group is having problems upgrading the IDE driver.

--
There is no America. There is no democracy. There is only IBM and AT&T and DuPont, Dow, General Electric, and Exxon
Re:2.6 kernel goodies by lmfr · 2002-09-17 06:25 · Score: 1

Check elvtune, try a very low read latency, a moderately high write latency, and play around with -b.
Haven't tried it myself, though.
Re:2.6 kernel goodies by Adnans · 2002-09-17 06:49 · Score: 3, Interesting

The problem comes when reading an mp3 from disk (and no, this is not a "DMA/unmasked interrupts" issue) while other *heavy* sequential disk i/o is being done by another piece of software (because of the amount of data, tar xzvf is frequently the culprit).

The skipping is caused by scheduling latency, as Paul suggests. I have written an mp3 player for Linux (see URL) and it only really skips when the audio output thread is not scheduled in time to satisfy the soundcard's needs. I.e. the Linux scheduler needs to make sure that whenever the audio thread wants to fill the soundcard buffers it gets the highest priority to do so. For example, if you are using a soundcard buffer that is split into 2 fragments of 1024 bytes each, the audio thread needs to be scheduled every 6ms, or every 3ms for 512 byte fragments (44.1 kHz stereo, 16-bit output). Even when your soundcard buffer is 50 or 100ms deep you can very easily cause skipping if your audio thread is not scheduled for 100ms or longer. And this is pretty normal on a vanilla kernel for non-realtime scheduled processes. Think about it: your "cat /dev/zero > foo" has the same priority as your audio thread, so they have equal rights to the CPU. However, the audio thread has much stricter scheduling needs, since you will get audio skips whenever it is scheduled too late (i.e. the soundcard buffers get depleted).

In short, the soundcard will be starved of ready to play PCM data long before the decoder will be starved of MP3 encoded data (from disk). In the end it doesn't really matter because your music still skips, but it is important to identify exactly why it's skipping.
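
The 6ms and 3ms figures above can be checked with a little arithmetic. This is only a sketch: it assumes 44100 samples per second, 2 channels and 2 bytes per sample, and the function name is made up for illustration.

```python
def fragment_period_ms(fragment_bytes, sample_rate=44100, channels=2, bytes_per_sample=2):
    """Milliseconds of PCM audio held by one soundcard buffer fragment."""
    # Bytes of PCM data the soundcard consumes per second of playback.
    bytes_per_second = sample_rate * channels * bytes_per_sample
    return 1000.0 * fragment_bytes / bytes_per_second


for size in (1024, 512):
    # The audio thread must be scheduled at least this often to refill a fragment.
    print(f"{size}-byte fragment: {fragment_period_ms(size):.1f} ms")
```

That gives roughly 5.8 ms for a 1024-byte fragment and 2.9 ms for a 512-byte one, which is where the rounded 6ms and 3ms come from.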

-adnans

--
"In short: just say NO TO DRUGS, and maybe you won't end up like the Hurd people." --Linus Torvalds
Re:2.6 kernel goodies by Anonymous Coward · 2002-09-17 08:08 · Score: 0

Currently, you can have a nice 20 process that's busy untarring a large tarball...and all your paged out processes will be blocked, waiting for this stupid tarball to finish.

Get more RAM, RTFM, install slackware.
Re:2.6 kernel goodies by cozziewozzie · 2002-09-17 11:30 · Score: 1

also, 2.5/2.6 is still missing the better patches for low latency (from andrew morton), and so its performance is still not as good as it could be.

I remember reading an interview with Andrew Morton -- read it here -- in which he goes to great lengths about low latency and Robert Love's kernel preemption patches.

To cut the story short, he feels that kernel preemption is the proper way forward because his low latency patch is based on manually inserting points into loops which can be interrupted. This brings more noticeable improvements, but is more manual work. Andrew says that a combination of the two approaches is possible, with a lock-break mechanism implemented into later versions of the kernel-preempt patch, which does essentially the same thing as the low-latency patch.

Now I'm certainly no expert on this :-) but it seems that low latency is, in a way, incorporated into the latest kernel preemption code (cause it's no longer a patch). FWIW.

That's what menuconfig's for... by vidnet · 2002-09-17 03:58 · Score: 2, Funny

Ya just know someone out there wants to have every journaling file system on one drive just 'cuz.

Ya. And people want to have every ethernet card in one box just 'cuz, so there are a bunch of different drivers for ethernet interfaces.

My experience with XFS by chrysalis · 2002-09-17 03:58 · Score: 5, Interesting

I've been running Gentoo Linux for some time with XFS. Here's my experience with this filesystem:

- It's extremely reliable. Filesystems never got corrupted, even after a lot of ugly reboots.

- Recoveries after a crash are really fast. Almost immediate, better than ext3 and reiserfs.

- Every needed tool is available to resize filesystems, check filesystems, analyze filesystems and backup/restore filesystems.

- _BUT_ there's something strange. Basically during disk I/O, the whole system is unresponsive. While I'm compiling something, KDE becomes slow, playing videos is not smooth at all, etc. Just as if it didn't scale at all for concurrent disk access. So I finally switched back to ReiserFS just because of this. Maybe the 2.5.x series of kernel behaves differently.

--
{{.sig}}

Re:My experience with XFS by Anonymous Coward · 2002-09-17 04:03 · Score: 0

Recoveries after a crash are really fast. Almost immediate, better than ext3 and reiserfs.
But, but... my friends told me that Linux doesn't crash! :-(
Re:My experience with XFS by red_dragon · 2002-09-17 05:06 · Score: 3, Informative

Just wondering, are you using the custom kernel from Gentoo? If so, have you compiled your kernel with either/both of the low latency patch and/or the preemptible kernel patch? What are your experiences with either of those two options when running XFS? I'd expect the use of either of those two to improve a system's responsiveness to user interaction when doing a lot of disk I/O, but if those don't help when using XFS, I wonder what kind of black magic is going on inside that code.

--
In Soviet Russia, Jesus asks: "What Would You Do?"
Re:My experience with XFS by halfelven · 2002-09-17 05:07 · Score: 1

What's your kernel? Are you using vanilla kernel with XFS patch?
I'm using the Red Hat kernel (lots of performance and stability patches) and XFS, and i've never seen the problem you describe.
Re:My experience with XFS by slashdot_commentator · 2002-09-17 05:16 · Score: 2

Your observations were anticipatable. XFS was originally designed for real-time (high speed) data streaming, namely capturing and processing video (which requires A LOT of disk space). That bias in design does not lend itself to concurrent disk access performance. Interestingly, your move back to reiserfs works well with reiserfs's strengths. I use XFS, and can't say I've experienced your problems, but I haven't tried compiling and watching video at the same time.

Having said that, I can't say whether your experiences are specifically due to XFS's design, or other factors; such as XFS's implementation under linux, or your tasks requiring a lot of RAM, or CPU (which applies to compiling, playing videos, and XFS). Your problems with XFS could be resolved with a faster CPU or two CPUs, or a lot more RAM.

--
There is no America. There is no democracy. There is only IBM and AT&T and DuPont, Dow, General Electric, and Exxon
Re:My experience with XFS by josh+crawley · 2002-09-17 05:40 · Score: 5, Informative

---"- Recoveries after a crash are really fast. Almost immediate, better than ext3 and reiserfs."

Hmmm.. I'd assume that ext3 wouldn't be as good.. A fix on a fix usually sucks. And then I've heard about Reiser's file truncation problems. I use Reiser and no big problems.

---"- _BUT_ there's something strange. Basically during disk I/O, the whole system is unresponsive. While I'm compiling something, KDE becomes slow, playing videos is not smooth at all, etc. Just as if it didn't scale at all for concurrent disk access. So I finally switched back to ReiserFS just because of this. Maybe the 2.5.x series of kernel behaves differently."

I've had the same problems on 2.2.X when I didn't tweak my HD's to dma66 32 bit. Try doing a:

hdparm /dev/(drive linux is on)
hdparm -tT /dev/(drive linux is on)

If you don't like those settings, drop into single user mode, with / read only, and run this command:

hdparm -X66 -d1 -u1 -m16 -c3 /dev/hda

Now manually do a fsck on that partition. If you have errors, it's a bad mode. But if it works, then redo the -tT option (it's a benchmark).

Be aware that 2.4 does most of this for you, but sometimes can give too little of a setting (so your performance sucks). Then again, you could have an unsupported IDE device.

All the best..
Re:My experience with XFS by Flagran · 2002-09-17 05:43 · Score: 1

I'm having the same problem, also under Gentoo. I've found that the problem is a little more bearable with the low-latency patch and _without_ the preempt patch, but it still jerks my system around pretty hard.

--
Make love, not sigs
Re:My experience with XFS by Turmio · 2002-09-17 05:51 · Score: 2

Great points, except one small correction must be made. Not every tool needed to resize an XFS filesystem exists. There's no way to reduce the size of an XFS volume. I needed this feature last May when I played around with XFS on top of LVM. It was my fault in the first place, better make a working plan beforehand next time, but still you couldn't do it. With ReiserFS, it's easy, though time-consuming, to reduce a volume.
Re:My experience with XFS by nathanh · 2002-09-17 15:21 · Score: 2

Hmmm.. I'd assume that ext3 wouldn't be as good.. A fix on a fix usually sucks.

You assume incorrectly. Yes, ext3 shares the same on-disk structure as ext2, but LK members (eg, Andrew Morton) say this wasn't a compromise.
"The ext2-compatibility seems to be a bit of a political albatross for ext3, really - people appear to be of the opinion that the ext3 design was somehow compromised by the compatibility requirement. This isn't so - ext3 is a block-level journalled filesystem."
[Andrew Morton, http://www.uwsg.iu.edu/hypermail/linux/kernel/0109.3/0000.html]

If you have to ask... by alienmole · 2002-09-17 04:00 · Score: 2

This is not intended to be a facetious answer, but if you have to ask, then you probably don't need anything other than the default filesystem provided by your distribution.

Linux is used in an incredible variety of environments, from embedded systems without disks to seriously large servers and parallel supercomputers. As you might imagine, the default filesystem isn't always ideal. But, if you're just running an ordinary single-user workstation, and aren't experiencing any noticeable performance problems related to your disk access, then there's no reason to worry about your filesystem.

So "stick with what the installer suggests from now until..." you run into a reason to do otherwise, makes sense.

Damn how depressing is this by Anonymous Coward · 2002-09-17 04:01 · Score: 0

Ya just know someone out there wants to have every journaling file system on one drive just 'cuz.

That's the sound of thousands of English teachers weeping in despair.

NTFS vs Fat 32 by Steveftoth · 2002-09-17 04:04 · Score: 1

How could you not love the fact that you never have to run scandisk again once you are running with NTFS? You can install win2000 with Fat 32 and it will work fine. But if your system ever crashes, which does happen, you will have to wait while scandisk is run.

I hate scandisk

what I would like to see by Squarewav · 2002-09-17 04:05 · Score: 1

something similar to the Mac's HFS+, support for extended file attributes, and the ability to move files and have the file system point to the new location when a program looks for it, I know this would add a lot of overhead to the file system, but it would make it so much easier to customize the file system. Linux needs to break away from many things that slow progress, and one of these is the "magic numbers" method of file typing, an exe tab, a user tab, and a read-only tab, just is not powerful enough

xfs by Anonymous Coward · 2002-09-17 04:06 · Score: 0

hallelujah... i was waiting for this decision for 1 year and more now..... thank you...

Ummm... duh? by Slartibartfast · 2002-09-17 04:08 · Score: 0

1: one of the primary reasons people like XFS is because of ACL support.
2: Do -you- want your filesystem cluttered up with lookups that you don't even know exist? If you want them, create them: it's what "ln -s" does. Otherwise, it's bloat and overhead.

Now that your entire post has been invalidated, might I recommend you go and read what XFS does? Personally, I prefer ReiserFS for raw journaling. But XFS *ROCKS* for other stuff like ACLs.

$.02...

Re:Ummm... duh? by Squarewav · 2002-09-17 04:29 · Score: 1

having never heard of ACL I did a little look up of it, it appears to be a way of extending attributes so that you can have multiple users per file and adjust the attributes for each user, what I want is mime typing so that the file system knows what type of file it is, for example being able to set the attributes so that a jpg file is recognized as a jpg and typing ./image.jpg will open up the proper image viewer, second I don't want to use links soft or otherwise, for example if I want to put all my .so files in one place or sorted by lib, I don't want to have links all over the place pointing to it, if I wanted a file to be there I would not have moved the file in the first place, ya I know I could change the path, but that's something that should be transparent esp. if you move just one file
Re:Ummm... duh? by spitzak · 2002-09-17 04:58 · Score: 2

The "type ./image.jpg runs the image viewer", though a good idea, has nothing to do with the method by which it determines that image.jpg should run the image viewer program. I would agree that shells need this added. Without changing the shells I highly recommend that KDE/Gnome get together and make an "open" command line program so that everybody else can "double click files to open" without having to parse their data, and also so they agree on the method.

You are arguing for a method. From my experience the absolute best, without question, method of determining file type is "magic bytes". This may seem strange, but magic bytes have the amazing capability of being preserved by virtually every interface anybody uses to copy files. That makes them infinitely better than any attribute. And don't give me any crap about magic bytes colliding or not identifying text files, I challenge you to find *any* file format that a real user would "double click to open" that cannot be identified by magic bytes.
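
For anyone who hasn't seen it done, a magic-bytes lookup can be as small as the sketch below. The signatures listed are the well-known leading bytes of those formats; the table is deliberately tiny, and the function name and type labels are just picked for illustration, not taken from any existing tool.

```python
# A few well-known leading-byte signatures. A real table would be much longer.
MAGIC = [
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"\x7fELF", "application/x-executable"),
]


def sniff_type(path):
    """Guess a file's type from its first bytes, ignoring its name entirely."""
    with open(path, "rb") as f:
        head = f.read(16)  # longest signature above is 8 bytes
    for magic, label in MAGIC:
        if head.startswith(magic):
            return label
    return None  # unknown: the caller decides what to fall back to
```

Since the answer depends only on the file's contents, it survives any copy that preserves the data, whatever happens to the name or attributes along the way.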

The problem with magic bytes is filesystems that return the filename vastly faster than they return the data from the file. This makes it inefficient to look it up. Though probably no big deal for double-clicking, people seem to be addicted to the "icon" in the file viewer, though I have never seen anybody rely on this (try rearranging the small icons randomly on Windows and see who notices) and that requires opening every file. However "thumbnail views" seem to be catching on and these require opening the file anyway. I believe ReiserFS is addressing this and making it fast to open files. There is also a problem if the file viewer does not have read access to the files, though it probably could not read the attributes either in that case.

If you refuse to use magic bytes you must use attributes. Unfortunately you are seriously constrained by requiring an attribute that will copy by most file copying mechanisms and will be written by existing programs. It also helps if it is obvious to the user how to fix the attribute. Unfortunately I think the only workable solution is what Windows did, which was to use the file extension. My only experience with attributes is Mac OS 9 and OS X and I have always been annoyed with them, as a .jpg file will often launch an unexpected program (such as a Classic app).

If a file system supports attributes, I strongly recommend that they be used as a "cache". This should be the obvious way to use them for "thumbnail views", but there is no reason to not do the same scheme for "types", if the type attribute is missing you then spend the time to use magic bytes to determine the type. File copying programs can strip all the attributes and the effect will be invisible to the user except for a slight slowdown the first time the file is used.

I also want to say that the Mac HFS system that preserves files is equivalent to the older Unix "hard" links that existed in the very first versions in 1970. I think these are rather an embarrassing part of Unix history and can actually be proven to be a mistake, and led to all kinds of horrible file system maintenance problems. They tried to fix it by disallowing hard links to directories, which coincidentally eliminated about 95% of the usefulness of them. "soft" links like ln -s make are much more useful, predictable, and the user and programs can control them in obvious ways.
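
To make the hard versus soft distinction concrete, here is a small sketch of what happens to each kind of link when the original name is moved. It assumes a filesystem that supports both link types (any ordinary Linux one does); the file names are made up, and it says nothing about how HFS does its tracking.

```python
import os
import tempfile


def link_demo():
    """Return (hard_link_resolves, soft_link_resolves) after the original is moved."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "libfoo.so")
        hard = os.path.join(d, "hard_link")
        soft = os.path.join(d, "soft_link")
        with open(original, "w") as f:
            f.write("payload")
        os.link(original, hard)     # second name for the same inode
        os.symlink(original, soft)  # small file that stores the original's path
        os.rename(original, os.path.join(d, "moved.so"))
        # os.path.exists follows symlinks, so a dangling one reports False.
        return os.path.exists(hard), os.path.exists(soft)


print(link_demo())
```

The hard link keeps working because it names the data itself, while the soft link dangles because it only remembers the old path. Which of those is the "predictable" behaviour is exactly what the argument above is about.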

Faster reboots by wytcld · 2002-09-17 04:08 · Score: 2

2) Journalled file systems mean fast re-boots on power outages

They mean faster reboots period because they never need to be checked on boot - so you don't get that annoying "Ahem, you've rebooted too many times, I'm going to check your hard drive while your client, who's looking over your shoulder, wonders why you re-assured him you'd only have his production server down for half a minute to install the new kernel, and I'm spending 5 minutes scanning his drives."

Of course you can turn off those checks on ext2 too, but that would be stupid.

--
"with their freedom lost all virtue lose" - Milton

Re:Faster reboots by Anonymous Coward · 2002-09-17 04:30 · Score: 0

your client ... wonders why ... I'm spending 5 minutes scanning his drives."
Ahem
root@server# touch /fastboot && reboot
Re:Faster reboots by blakestah · 2002-09-17 04:34 · Score: 2

They mean faster reboots period because they never need to be checked on boot - so you don't get that annoying "Ahem, you've rebooted too many times, I'm going to check your hard drive while your client, who's looking over your shoulder, wonders why you re-assured him you'd only have his production server down for half a minute to install the new kernel, and I'm spending 5 minutes scanning his drives."

Journalling does protect against software-caused inconsistencies. It does not protect against hardware problems. Periodically, it is a VERY good idea to unmount and fsck while checking for bad blocks.
Re:Faster reboots by psamuels · 2002-09-17 04:42 · Score: 1

They mean faster reboots period because they never need to be checked on boot - so you don't get that annoying "Ahem, you've rebooted too many times, I'm going to check your hard drive while your client, who's looking over your shoulder, wonders why you re-assured him you'd only have his production server down for half a minute to install the new kernel, and I'm spending 5 minutes scanning his drives."

Yeah that check can hit at inconvenient times.

Of course you can turn off those checks on ext2 too, but that would be stupid.

Just as stupid for ext2, ext3, xfs, reiserfs, or jffs2. The check is not just for fear of bugs in the filesystem code. Any sort of hardware or software bugs could trigger inconsistencies. If a filesystem doesn't offer the option to check your disks every N mounts or every N days, that doesn't mean your system is therefore safe from such things. You are at the mercy of every line of code you choose to compile into your kernel, every byte of firmware on your motherboard / SCSI card / drive, and every stray cosmic ray that wanders through your office. Doing a sanity check on the layout of your filesystem once in awhile is not just a good idea, it's, well, ok, it's just a good idea.

--
"How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
Re:Faster reboots by Raptor+CK · 2002-09-17 04:53 · Score: 2

An occasional fsck on a production system is quite important, I agree.

This is what scheduled downtime is for. I understand that it's a so-called "helpful measure" to automate the process, but at times, it's downright annoying. If the admin isn't bright enough to even schedule maintenance periods, then he ought to be told to clean out his desk.

Again, I completely agree with you, but I think that any system that runs periodic maintenance for the admin is really just making things a little *too* convenient.

--
Raptor
"Procrastination is great. It gives me a lot more time to do things that I'm never going to do."

Re:Steve Spurrier by Buck2 · 2002-09-17 04:08 · Score: 1

I think you have a good point there. The problem is that Tux has no offerings to contain McNabb.

It's unfortunate really, and, quite frankly, I don't see this problem being resolved by the kernel hackers anytime soon.

At least Linux is open source, so any of those coding linebackers who care to step up the proverbial plate (to mix sports) have an equal shot at helping the coach out.

--

As my father lik@(munch munch)... ....

Soft Updates in Linux? by Dan+Ost · 2002-09-17 04:09 · Score: 2, Interesting

I've been reading about the differences between
using journals and using soft updates and have
decided that soft updates is the cleaner approach.

Can anyone explain to me why the Linux community
is so enthralled with the concept of journaling
file systems while the BSD community has quietly
but unanimously embraced soft updates?

--

*sigh* back to work...

Re:Soft Updates in Linux? by Shadowlore · 2002-09-17 05:14 · Score: 1

Maybe they disagree?

--
My Suburban burns less gasoline than your Prius.
Re:Soft Updates in Linux? by guacamole · 2002-09-17 16:55 · Score: 2

Note that in addition to the Linux community, the commercial unix operating systems use journaling file systems too (including some variants that are based on UFS), as does Microsoft's NTFS. If journaling was such an obviously wrong concept, it wouldn't be in such widespread use by now. Moreover, soft updates do not free you from having to run fsck after an unclean reboot. Using soft updates simply guarantees that it will be possible to return the file system to a consistent state, but you still need to run fsck to get there. Only recently FreeBSD added (or is still adding?) background fsck support which will actually free you from having to wait for fsck to complete during boot.

Red Hat DOES NOT have XFS... by Booker · 2002-09-17 04:10 · Score: 4, Informative

This isn't correct... if it were correct, I would not have spent so much time working on a
custom Red Hat installer for XFS. :)

There is some XFS-aware code in the Red Hat Linux installer, but there is no kernel support or userspace tools available, so what you propose simply can't work.

However, SuSE, Mandrake, Gentoo, Slackware, and Debian (to some extent) do have XFS support.

But where is e2compr by Kynde · 2002-09-17 04:12 · Score: 3, Insightful

There are systems where we simply don't and won't have enough disk space and where speed is not of the essence. We have them now, and we will continue to have them in the future.

Being a linux developer for embedded production boxes and given the current increasing interest over linux in embedded along with embedded boxes typically running _WITHOUT_ hard disks (mostly just flash chips of some sort, due to their better life-time), I cannot help wondering why the kernel mailing list shows little or no interest towards ext2 (or ext3) compression.

JFFS and JFFS2 don't come into question in most cases as they tear through the fs layers and cannot be used with IDE flash chips for example.

Alcatel even released it two weeks ago for 2.4.17... loads of people, like me, must have ported it to 2.4.19 by now. But to get ext2 compression to 2.5.XX, forget it... but why?

This is a little like the lack of interest in underclocking, even though once you've overclocked your main computer to the max, you will start looking for a quieter option, if not for the desktop computer, then for the closet firewall. Even if you don't have the interest now, you will, once you shack in with a gal.

--
1 Earth is warming, 2 It's us, 3 it's royally bad, 4 we need to take action NOW

Re:But where is e2compr by byran+lei · 2002-09-17 05:18 · Score: 0

>There are systems where we simply don't and won't have enough disk
>space and where speed is not of the essence. We have them now, and we
>will continue to have them in the future.
>
>
As someone who has used compression filesystems with Amiga floppy and hard drives, I can tell you exactly why there isn't much interest in them in the Linux community. They are basically a pain-in-the-ass, that's why.
Re:But where is e2compr by cant_get_a_good_nick · 2002-09-17 06:38 · Score: 2

you will, once you shack in with a gal.

You may be asking a bit much of the typical Slashdot reader...
Re:But where is e2compr by Kynde · 2002-09-17 07:51 · Score: 2

There are systems where we simply don't and won't have enough disk
space and where speed is not of the essence. We have them now, and we
will continue to have them in the future.

As someone who has used compression filesystems with Amiga floppy and hard drives, I can tell you exactly why there isn't much interest in them in the Linux community. They are basically a pain-in-the-ass, that's why.

As someone who has used ext2 compression on linux systems, I can tell you things have changed since Amiga times and its compressed filesystems.

e2compress is really sweet, easy to mark singular files or directories to be compressed, 4 different compression algorithms can all be used on same ext2 partitions alongside with uncompressed files/directories. It really is sweet, especially if you're in need of it, like with a 64Mb flash chipped embedded pc.

--
1 Earth is warming, 2 It's us, 3 it's royally bad, 4 we need to take action NOW

Rescue CD by nuggz · 2002-09-17 04:12 · Score: 2

This is why I think there should be more useful rescue CDs.

CD burners are quite widespread, a quick rescue image could be quite small.

And yes I know not everyone has a burner, I don't either.

Re:Rescue CD by bfree · 2002-09-18 08:00 · Score: 2
Even better would be a "standard" tool to create a rescue CD from/for a running system! So you install or significantly change your system (like install a new kernel) and run "mkrescuecd > rescue.iso" and then burn it. Better still if mkrescuecd has support for a few flags like
- --net to add your network setup to the CD
- --gnome to add your gnome desktop
- --package packagename to add a specific package, its config files and all dependencies
With a bit of work you could ensure that your rescue CD has everything you need to get on with your work on it so that as long as your data is intact (perhaps on another hard disk or a network server) you can survive quite catastrophic problems. Anyone know of anything like this or any projects looking into it? If not perhaps I'll start (and have pondered doing it for a while) to develop my own system for my debian boxes as a proof of concept to get the ball rolling.
--
Never underestimate the dark side of the Source
Re:Rescue CD by WNight · 2002-09-19 08:42 · Score: 2

Search for Virtual Linux, a Mandrake based, CD distro that runs entirely off of the CD.

It supports all the major filesystems, has pretty well every utility that Mandrake installs come with, and fits on a CD.

If you want a business-card sized distro, check out PLAC (Portable Linux Assessment CD, or something). But it won't be as full-featured of course.

The drawback of these is that they aren't customized to your system the way a rescue disk is. But perhaps they could look for an optional rescue floppy and read a bunch of settings off of it.

who cares by Anonymous Coward · 2002-09-17 04:12 · Score: 0

i'd much rather see framebuffer support for my maximum impact video card. Linux would make this indigo2 a lot more useful.

My personal experience by schon · 2002-09-17 04:13 · Score: 2

I used to run both ext3 and ReiserFS on my home machine.. my experience is that ext3 sucks..

Every month or so, I had to sit through the following:

"Warning: drive has been mounted more than 30 times, check forced" on the ext3 partition

I thought the idea for journaling was to AVOID fsck's on boot?

Re:My personal experience by koekepeer · 2002-09-17 04:16 · Score: 1

this has nothing to do with ext3

i don't remember how to unset this, but there's a way to stop the system from checking the mount-count (or whatever it's called) by simply editing some settings file.

you could also opt for not rebooting so often ;-)
Re:My personal experience by Anonymous Coward · 2002-09-17 04:20 · Score: 0

tune2fs -i 0 -c 0 /dev/hdxx
Re:My personal experience by Hobophile · 2002-09-17 04:24 · Score: 1

I believe you use 'tune2fs' to turn it off. If you format a floppy with ext2 it'll give you more details.
Re:My personal experience by kubrick · 2002-09-17 04:27 · Score: 3, Informative

# man tune2fs

(you can turn fscks off, change the number of mounts or make it time-dependent, etc.)

--
deus does not exist but if he does
Re:My personal experience by psamuels · 2002-09-17 04:27 · Score: 3, Informative

Every month or so, I had to sit through the following:

"Warning: drive has been mounted more than 30 times, check forced" on the ext3 partition

This is a safety feature. Filesystem corruption can be caused by hardware funnies as well as software bugs. Your memory could be flaky, your hard drive could be on its way out, your IDE cable could be too long, your SCSI chain could be improperly terminated, your motherboard might be iffy, your CPU could be running too hot. There might be software bugs in the generic kernel, the block / scsi drivers, the ext3 code, or even some random driver that has nothing to do with filesystems or memory management.

Because of this, ext2 and ext3 have tunable parameters for how often to force an fsck, overriding the fact that the fs is supposed to be in a known clean state. Apparently reiserfs does not have this safety feature - or does it? (I don't know.)

If this annoys you, turn it off. 'man tune2fs', or specifically,

tune2fs -c0 -i0 /dev/your/filesystem
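
To spell out what those two knobs control, here is a rough sketch of the decision made at boot. It is a simplified model for illustration, not the actual e2fsck code, and the function and parameter names are made up.

```python
def check_forced(mount_count, max_mount_count, seconds_since_check, check_interval):
    """Decide whether a clean filesystem should get a full check anyway.

    A limit of 0 is treated as "disabled", which is the effect of -c0 and -i0.
    """
    if max_mount_count > 0 and mount_count >= max_mount_count:
        return True  # mounted too many times since the last check
    if check_interval > 0 and seconds_since_check >= check_interval:
        return True  # too much time has passed since the last check
    return False


DAY = 24 * 60 * 60
print(check_forced(30, 30, 2 * DAY, 180 * DAY))  # mount limit reached
print(check_forced(30, 0, 400 * DAY, 0))         # both limits disabled
```

With both limits at 0 the journal alone decides whether the filesystem is considered clean, so you give up the periodic sanity check described above.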

HTH..

--
"How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
Re:My personal experience by Anonymous Coward · 2002-09-17 05:32 · Score: 0

tunefs is irrelevant here. Whenever my machine locks up (stupid samba goes apeshit every so often and I have to reboot to get access to the locked drive... yes, locked drive, not just samba, which refuses to get killed... go figure) and I have to reset, I have one or two corrupt volumes. They're all ext3. But if fsck finds problems, guess what, it's back to a 30 minute disk check, with all kinds of fs issues found. I thought ext3 was supposed to avoid all that. So much for wonders of ext3.
Re:My personal experience by Leto2 · 2002-09-17 06:43 · Score: 3, Funny

And why do you reboot every day?

--
<grub> Reading /. at -1 is like driving through Cracktown in a convertible that is stuck in 1st
Re:My personal experience by maw · 2002-09-17 09:25 · Score: 1

.\" Take this out and a Unix Demon will dog your steps from now until .\" the time_t's wrap around. You can tune a file system, but you can't tune a fish.
(From the BSD tunefs.8 manpage.)

--
You're a suburbanite.
Re:My personal experience by alienw · 2002-09-17 12:18 · Score: 1

Samba? Sounds to me like you have bad memory and/or motherboard. Use that anti-static strap next time.
Re:My personal experience by Anonymous Coward · 2002-09-17 12:20 · Score: 0

"And why do you reboot every day?"
i have three words for you: LAPTOP.

But Linux doesn't crash... by Anonymous Coward · 2002-09-17 04:16 · Score: 0

No OS withstands power cord disconnection well.

I'm sometimes a klutz and my laptop uses ext3.

And well there are experimental kernels ;)

XFS and power drops don't mix. by MedicineMan · 2002-09-17 04:17 · Score: 1

One of the things both mentioned on the gentoo-user email lists and in the XFS FAQ is that if the power drops on a system using XFS, only the *metadata* is journaled, not the data.
This can get bad fast. With a full 30 seconds between disk updates, having the power cut means binary NULLs in any file updated since sync. (http://oss.sgi.com/projects/xfs/faq.html#nulls)
Short version: If you're using XFS, make sure you are careful to have clean power available, unless you don't mind file corruption. If no clean power is handy, stick with the usual journaling filesystems.
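
If an application can't wait for the periodic flush, it can ask for its data to be pushed out itself. A minimal sketch follows, with a made-up helper name; it narrows the window for one file's contents but does not cover drive write caches or the directory entry, so treat it as an illustration rather than a guarantee.

```python
import os


def durable_write(path, data):
    """Write bytes to path and ask the OS to push them to disk before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # empty Python's userspace buffer into the kernel
        os.fsync(f.fileno())  # ask the kernel to write the file's data to the device
```

Something like durable_write("settings.conf", b"key=value\n") costs a disk round trip per call, which is why programs don't normally do it for every write.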

--
Now my charms are all o'erthrown, and what strength I have's mine own... - Shakespeare, "The Tempest"

An interesting thing about XFS... by Scooby+Snacks · 2002-09-17 04:40 · Score: 3, Informative

I hear that it's the only Linux filesystem that is endian-safe. IOW, you can move it from a system of one endian type to a system of the other type and it will still work. No other filesystem for Linux currently is able to make that claim.

I find that very cool, for some reason. I guess one practical application is if you have a box that is the only one of that type (either big-endian or little-endian) that dies and you need to recover the data.

--
Runnin' around, robbin' banks all whacked on the Scooby Snacks...

Re:An interesting thing about XFS... by axboe · 2002-09-17 05:48 · Score: 1

What a load of crap. All the general file systems in Linux are endian-safe, and can thus be moved between little and big endian machines without any problems. Anything else would be just plain stupid.
Re:An interesting thing about XFS... by Wesley+Felter · 2002-09-17 05:56 · Score: 2

AFAIK, ext2 has been endian-safe for a while.
Re:An interesting thing about XFS... by flight666 · 2002-09-17 16:31 · Score: 2, Informative

Bzzt. Wrong answer, thanks for playing.
Ext2 has been endian safe since kernel v1.{lownum}
Ext3 has _always_ been endian safe.
reiserfs became endian-safe about 6 months ago.
Don't know, but I would suspect the same for JFS, etc.
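For anyone wondering what "endian-safe" amounts to in practice, here is a minimal sketch. The idea is that the on-disk format fixes one byte order and every driver converts on read and write, whatever the host CPU does. The 8-byte "superblock" layout and the magic number below are made up for illustration and are not the real ext2 or XFS layout.

```python
import struct

# Hypothetical on-disk header: 32-bit magic + 32-bit block count.
# The magic value is invented for this example.
MAGIC = 0xEF53ABCD


def pack_superblock(block_count: int) -> bytes:
    # "<" forces little-endian on disk regardless of the host CPU.
    return struct.pack("<II", MAGIC, block_count)


def unpack_superblock(raw: bytes) -> int:
    # Reading with the same fixed byte order works on any host.
    magic, block_count = struct.unpack("<II", raw)
    if magic != MAGIC:
        raise ValueError("bad magic: not our filesystem or wrong byte order")
    return block_count


def unpack_as_host(raw: bytes, host_order: str) -> tuple:
    # What a naive driver would see if it read the fields in the host's
    # own byte order ("<" little-endian host, ">" big-endian host).
    return struct.unpack(host_order + "II", raw)


raw = pack_superblock(4096)
print(unpack_superblock(raw))        # correct on every host
print(unpack_as_host(raw, ">"))      # garbage on a big-endian host
```

A driver that always reads with the fixed order is portable; one that reads in host order only works on machines that happen to match the disk.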

here's a comparison by halfelven · 2002-09-17 04:46 · Score: 1

Ext3:
- compatible with Ext2
- can journal everything (data included)

XFS:
- very large volumes and files
- very good performance when writing/reading at high speeds, and/or to/from large files, and/or with concurrent access
- POSIX ACLs and extended attributes

ReiserFS:
- very fast with lots of small files

Avoid ACLs when you can (was Re:Silly question) by zoccav · 2002-09-17 04:47 · Score: 1

XFS supports ACL's (or access control lists) which are much better than standard UNIX permissions.

My experience tells me that with user/group/other protection attributes on a file you can solve practically 95% of the situations you encounter in an ACL protection schema.

The remaining 5% of the cases are likely to include a) a truly twisted protections schema implementation or b) file access protection as part of a larger badly designed application that relies on file system protection where it should have relied on a professional authentication/authorization/audit system.

ACLs are also harder to administer.

BeFS by jonr · 2002-09-17 04:52 · Score: 2

I just wish to get BeFS back. It was the best FS I've ever seen. Journaled, live queries, and FAST! Palm, it's useless to you, open it up! :)

Re:BeFS by Adnans · 2002-09-17 05:50 · Score: 2

XFS smokes BeFS. Hell, even the open source OpenBFS BeFS implementation is heaps faster than the original beast :). The other funny thing is that BeFS was actually inspired by XFS.

-adnans

--
"In short: just say NO TO DRUGS, and maybe you won't end up like the Hurd people." --Linus Torvalds

why use XFS by halfelven · 2002-09-17 04:58 · Score: 1

- very high disk I/O performance, especially when reading/writing from/to large files
- extended attributes
- POSIX ACLs
- mature and stable

A critique of journalling filesystems by 0x0d0a · 2002-09-17 04:59 · Score: 2

This comes secondhand, and is not a personal opinion of my own, but I think it's worth mentioning here.

I had an operating systems professor that did some filesystem design work (and DBMS design work, which at a low level and especially back in the day, was pretty similar). He was pretty negative on the mass demand for journalling filesystems.

See, what people really want is filesystems that don't get corrupt. It's also kind of nice if the recovery procedure at mount is pretty fast. So they want a filesystem that is always consistent -- it's never in a state where if the power is lost, the computer will try to mount the thing and say "hmm, this isn't a proper filesystem."

So if you want to add a file, you can't just add an entry to the table of files, then create the file metadata, then complete the filesystem operation, because you could lose power and end up with only the entry, but no file metadata...so you have a pointer to garbage on the disk.

You need some sort of atomic updating. You want to say "at this point the change I'm making to the FS is not active, at this point it is, and nowhere in between is the FS invalid".

Journalling is one method of achieving atomic updates -- always write in the forwards direction on the hard drive, building a journal of all actions as you go, and just using the latest journal entry when you're reading. Journalling tends to have pretty sexy write performance, because it always writes forward and doesn't have to seek at all. It also usually has fairly lousy read performance, since you have to be sure that you're using the most up to date journal entries.

To avoid some of the slower read performance, most "journalling" filesystems on Linux only journal metadata -- the lists of files in directories, permissions, times modified and so on, because the data is what you're really worried about accessing quickly, and if the data in a file gets corrupted when you lose power, you only lose that file -- not the whole filesystem.

It's possible to use other techniques -- I believe that BSD's FFS uses a non-journalling approach to ensuring a consistent filesystem at all points in time. Despite claims both ways, I don't believe that FFS is radically faster or slower than any of Linux's journalling filesystems.

And what's my personal preference? Well, I use ext3, because I already had an ext2 filesystem, and it's awfully easy to upgrade. ext3 used to have pretty bad performance, but now it's generally on par with ReiserFS (which was ahead for a bit), except for Reiser's strongest points (like a single directory with, say, five or six thousand files in it). That being said, I suspect that most people just use ext3 in its "metadata journalling mode", which means that it doesn't have many advantages over reiserfs.

Ext3 builds heavily on ext2, which is a pretty mature filesystem. I've had one roommate that screwed up his reiserfs filesystem a while back. I believe the bug that caused that was fixed, but it made me a bit leery of reiser at the time.

The other misgiving I have about reiser is that I'm uncomfortable with the direction that the developers are going -- very heavyweight filesystem drivers, with plugins and all sorts of stuff. I'm not sure that I want my filesystem drivers to be so complex.

On the other hand, if you have lots of very small files (not empty, just a hundred bytes or so), Reiser does a great job of keeping them from eating up more disk space than they should (normally, you have to throw 4K or so at a file, unless you've changed the block size of your FS).
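To put rough numbers on the small-file overhead, here is a back-of-the-envelope sketch. It assumes a filesystem that hands out whole blocks only (4K by default, as in the paragraph above) and ignores inode and directory overhead. The function name is made up for the example.

```python
def allocated_bytes(file_size: int, block_size: int = 4096) -> int:
    """Bytes consumed when space is handed out in whole blocks only."""
    if file_size == 0:
        return 0
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


# Ten thousand files of about a hundred bytes each.
small_files = [100] * 10_000
payload = sum(small_files)
used = sum(allocated_bytes(size) for size in small_files)

print(f"payload:   {payload} bytes")
print(f"allocated: {used} bytes")
print(f"overhead:  {used / payload:.1f}x")
```

With these assumptions, one megabyte of actual data ends up occupying roughly forty megabytes of disk, which is the waste that packing small files together is meant to avoid.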

XFS, as far as I can tell, wasn't really designed so much to be a general-purpose filesystem as a streaming video filesystem.

And, as I've said earlier, I don't know a thing about JFS.

Other interesting tidbits:

* ext2 is still a pretty well-designed, fast filesystem.

* All of the mentioned Linux filesystems beat the snot out of MS's FAT-16 and FAT-32 in performance and *particularly* fragmentation. The popular act of defragmenting your hard drive on Windows stems solely from the fact that FAT was not well designed for anything but the very smallest of filesystems, like a disk.

* I've heard stories that NTFS (MS's new filesystem) is still worse off from a fragmentation point of view than Linux's FSes. That's second hand, so it could be wrong.

* I know for a fact that real-world performance on NTFS (at least in the NT 4 era) is significantly slower than on ext2. I have a strong suspicion that a fair bit of that stems from the ACL security system MS uses in their filesystems. In terms of performance, ACLs are not a good choice.

--
May we never see th

Re:A critique of journalling filesystems by dmelomed · 2002-09-18 03:21 · Score: 1

"* I know for a fact that real-world performance on NTFS (at least in the NT 4 era) is significantly slower than on ext2. I have a strong suspicion that a fair bit of that stems from the ACL security system MS uses in their filesystems. In terms of performance, ACLs are not a good choice."

This may also be because people expect the 'async' performance out of other FS'. I doubt NTFS is using asynchronous updates by default. On the other hand, W2K professional uses write cache on disks by default.

it depends on the desktop by halfelven · 2002-09-17 05:02 · Score: 1

If you're like me, and you're doing lots of video processing stuff, then the ability to very quickly process files that are usually > 1GB is very neat. That's one reason to use XFS.

ok by Anonymous Coward · 2002-09-17 05:10 · Score: 0

we just finally solved the vm crisis merged the code into one coherent mess and look whats happen...someone decides to add the code now to mess with the journal file system. ...

Journalling filesytems... by or_smth · 2002-09-17 05:10 · Score: 1

I know I am really going to get nailed for this (I know, I'm just a windows kiddie) but I've been curious about this for a while. What exactly is 'journalling'? I've been hearing plenty of buzz about ext3 being a journalling system, but I still don't have any clue what the hell journalling is. Any links, responses, etc. would be appreciated...

Re:Journalling filesytems... by fishlet · 2002-09-17 06:48 · Score: 1

Well, I'm not a kernel and FS guru either... but I'll offer you my limited understanding of it.

A journaling filesystem is a system that keeps a transaction log of what it's about to do before it actually does it. Don't ask me how it does it but it sticks it somewhere on the disk... then actually performs the real changes to the disk. The reasoning is that if your system crashes hard, the next time you reboot your computer can look at that journal (of what it was gonna do) and then complete doing it. I don't think this saves you from losing your file in the middle of a 50MB wav save... but at least the integrity of the file system itself doesn't get left in a mangled state. If anyone else has a better explanation... please feel free to correct me.

--

Blender And Linux Fan
Re:Journalling filesytems... by Anonymous Coward · 2002-09-17 06:53 · Score: 0

Briefly: journalling filesystems keep a running journal, like a database's log file, of all pending (not committed to disk) filesystem transactions. If the host system crashes, at the next reboot the filesystem only needs to replay the uncommitted transactions from the journal to be back up to speed - no long fsck needed. Very, *very* nice when that filesystem is 50+ Gb in size.

Note that all current Linux journalling filesystems except ext3 only journal the filesystem metadata, so after a crash the filesystem & its directory structures will be OK, but you can lose data in any file(s) whose data was not sync'ed to disk prior to the crash. Ext3 journals both file and filesystem data by default.

Google searches will have more detailed info.
Re:Journalling filesytems... by psamuels · 2002-09-17 07:30 · Score: 5, Informative
What exactly is 'journalling'?

Here's the basic theory. Think about what happens when you make a change on a filesystem - say you add a file to a directory. The system has to:
- add a filename entry to the directory itself
- allocate the initial blocks for the file, from the pool of free space in your filesystem
- create the inode, which is a block of information about the file. The inode includes file modification times, owner, permissions, file type (regular file? directory? etc), and the location of its actual data blocks
- if there are too many data blocks, allocate one or more "indirect blocks", which are extensions to the inode so it can hold more data blocks - inodes usually have a fixed size. Initialise these with the correct block numbers as well.
- actually write the file contents to the data blocks you have allocated
If you don't do these things in the correct order, there will be times when the on-disk structure is not consistent. For example, you may have modified the directory to include an entry for the new file, but the entry points at an inode which hasn't been filled in yet. Or the inode may be filled in, but the free space pool hasn't been updated to correspond with the data block allocations in the inode. Throw in other modifications like deleting files or making them larger or smaller, and it gets pretty complicated. If the machine happens to crash at such a time - or the power goes out and you don't have a UPS - the disk will be in an inconsistent state. This has two major consequences:
1. the filesystem checker, or fsck (the equivalent Windows utility is scandisk) will have to run next time you boot, and go over the whole structure of your filesystem, which can take minutes or even hours on a large enough disk (80 GB takes a long time unless your disks are very fast). Nobody wants to sit around for 15 minutes waiting for the server to finish rebooting.
2. depending on exactly what was written to disk in what order, the fsck utility may not even be able to restore your filesystem to a consistent state at all, or it may lose important files or directories in the process of doing so.
Journalling prevents both problems (barring bugs in your OS or hardware, of course) by writing transactions to your filesystem. Instead of making changes directly to your directories, inodes, free block maps, etc, the filesystem batches up such changes by spooling them to a separate area on disk, the journal. Then, when it has written enough such changes to account for an entire, self-consistent transaction, it puts a marker in the journal indicating "transaction complete" and starts copying these changes to their usual locations on disk. Meanwhile, the next transaction can be spooled onto the end of the journal area, and it will get its own "transaction complete" marker when it is done. A journal can hold a lot of transactions - only limited by the journal size, which is usually configurable. When a transaction has been fully copied out of the journal to its final locations, it is re-labeled "journal free space" in the journal.

How does this help? Imagine that the machine goes down while a transaction is still incomplete in the journal. Next time you boot, the OS "replays" the journal: it looks for all the completed transactions and commits each part of a transaction to its correct permanent location. It ignores journal free space, and any incomplete transactions - essentially rewinding the filesystem state to the end of the last completed transaction. There is never any danger of "partially updated" filesystem state, since each transaction starts and ends with a known-consistent state.

(Ah, but what happens if the OS goes down again while replaying a journal? No big deal: next time it boots, it just replays the same journal again, which produces the same result as it would have done the first time.)
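The spool, commit marker, and replay cycle described above can be sketched as a toy model. This is only an illustration under simplifying assumptions: the "disk" is a dict, the journal is a list, and the class and record names are invented here. No real filesystem is structured this way.

```python
class ToyJournal:
    """Toy write-ahead journal: changes are logged first, applied on replay."""

    def __init__(self):
        self.disk = {}       # permanent locations: name -> contents
        self.journal = []    # append-only list of journal records
        self._next_txid = 0

    def log(self, changes: dict, commit: bool = True) -> int:
        """Spool one transaction. commit=False simulates a crash before
        the "transaction complete" marker reached the journal."""
        txid = self._next_txid
        self._next_txid += 1
        for key, value in changes.items():
            self.journal.append(("write", txid, key, value))
        if commit:
            self.journal.append(("commit", txid))
        return txid

    def replay(self) -> None:
        """Copy every write that belongs to a completed transaction to its
        permanent location; writes from incomplete ones are ignored."""
        committed = {rec[1] for rec in self.journal if rec[0] == "commit"}
        for rec in self.journal:
            if rec[0] == "write" and rec[1] in committed:
                _, _, key, value = rec
                self.disk[key] = value


fs = ToyJournal()
# Transaction 0: create a.txt (directory entry plus inode), fully committed.
fs.log({"dir:/home": ["a.txt"], "inode:7": {"size": 3}})
# Transaction 1: add b.txt, but the power fails before the commit marker.
fs.log({"dir:/home": ["a.txt", "b.txt"]}, commit=False)

fs.replay()                  # first boot after the crash
after_first = dict(fs.disk)
fs.replay()                  # crashed during replay? just replay again
print(fs.disk["dir:/home"])
print(fs.disk == after_first)
```

The half-written second transaction never reaches the "disk", and replaying twice leaves the same state as replaying once, which is the property the parenthetical above relies on.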

Some simplifications, obviously, but that's the basic idea. Did it help?

The different levels of journalling have to do with whether all filesystem data is journalled or only some of it. You usually only journal metadata, which is the filesystem structure: directories, inodes, free block maps, etc. That's because copying all your file contents twice (first into the journal, then into its permanent location in the filesystem) is quite slow. The main purpose of a journal is not to guarantee pristine file contents in the event of partially written files, but to ensure a consistent view of the filesystem as a whole - so you can avoid that long fsck and avoid ever ending up with a partially or fully scrambled filesystem (modulo hardware failure, of course).

HTH..
--
"How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
Re:Journalling filesytems... by nusuth · 2002-09-17 09:00 · Score: 2

(I know, I'm just a windows kiddie)
Actually NTFS is a journalling file system; a very good one in specification and a fairly good one in implementation.

--
Gentlemen, you can't fight in here, this is the War Room!

There was no flamewar by Shadowlore · 2002-09-17 05:11 · Score: 1

You call that a flamewar? Puh-leaze!

There was no flamewar. Disagreement about various things, yes, but certainly no flamewar.

Nor was the thread itself huge by lkml standards.

--
My Suburban burns less gasoline than your Prius.

Re:There was no flamewar by Anonymous Coward · 2002-09-17 05:16 · Score: 0

Heh, you obviously weren't CC'ed in on a lot of the comments that were not CC'ed to the list, then.

that's not exactly accurate by halfelven · 2002-09-17 05:15 · Score: 1

The problem you describe happens with all filesystems that do not have ordered writes: ReiserFS and JFS are also affected.
Ext3 has this "ordered mode", where metadata is commited to disk only after data was commited, therefore there's no chance to get NULLs no matter what.
A while ago, XFS had this pathological behaviour when metadata was commited after data, so the NULLs were quite a problem after power blackouts. But this was fixed since a few versions now, and there's no real difference between XFS and other journaled filesystems nowadays.

Anyway, if you care that much for your data, then you're better off using Ext3 with full journalling turned on.
Otherwise, i just use XFS everywhere, because of performance boost (ok, so i do use ReiserFS for proxy caches).

Comparison? by Anonymous Coward · 2002-09-17 05:18 · Score: 0

Could somebody post a comparison between Reiser, XFS and Ext3?

The short, simple answer by Anonymous Coward · 2002-09-17 05:18 · Score: 0

Linux sucks.

Tinker Power! by fm6 · 2002-09-17 05:24 · Score: 2

So ... what compelling reason is there for me to use any other filesystem? Being more stable or better with data loss is nice, but considering I've only ever had this problem once, doesn't mean that i'll leap up and down going "oo oo! got to have blahFS!" any time soon.

Well, gee, if you don't care about the technology, why not just run Windows? Linux is for pioneers.

Anyway, the big success story for Linux is servers -- and journalling file systems make a lot of sense for servers, because they're more bulletproof. I once worked in a place with a lot of Solaris servers using a non-journaling FS. Now we had fancy UPSs so the servers could go down gracefully. But they were no help when an overloaded power main caught fire (middle of summer), sending out a gigantic surge that took out the UPSs before the power went away. It was days before all the file system repair and restore was complete.

About a year later, I was working at a place with a lot of IRIX servers. Had a power failure there too. No surge this time -- but no UPSs either. So how long before the servers were back up? About ten minutes after the power came back. XFS, like other journalling file systems, doesn't get all inconsistent when it's interrupted.

Re:Tinker Power! by Anonymous Coward · 2002-09-17 06:22 · Score: 0

> Well, gee, if you don't care about the technology, why not just run Windows? Linux is for pioneers.

Nice l33t troll!

Why is kernel-image so big? by Thagg · 2002-09-17 05:32 · Score: 3, Interesting

I recently installed Linux-XFS on one of my computers here, as I was having problems with the kjournald process under ext3 taking extremely unreasonable amounts of time -- and I had had wonderful experiences with XFS on our SGIs -- it's always been solid and fast. Various reviewers of ext3 had complained about the existence of kjournald -- disputing the need for a user-code daemon.

Several places it is mentioned, though, that the kernel image of XFS is very large, so much that you can't really fit it onto a floppy (although people over-format their floppies to get 1.8 MB or so onto them, and then the kernel might just barely fit.)

I can't understand why any filesystem should be so big -- it seems that the code to run the filesystem is almost as big as the rest of Linux put together. How can this be? Is it really all code? What could that code possibly be doing?

I studied XFS fairly extensively after I had to repair a disk that had 1 of its 23 heads fail. From the remaining 22/23rd of the disk I managed to recover almost every file and directory, by writing my own XFS filesystem interpretation code. The on-disk organization of the filesystem is fairly simple and straightforward, I can't imagine where the hundreds of K of code is going.

I won't be shocked if the answer does lie in that kjournald daemon -- that XFS is bigger than ext3 because ext3 puts most of the bloat into a user-mode daemon instead of the kernel.

thad

--
I love Mondays. On a Monday, anything is possible.

Re:Why is kernel-image so big? by psamuels · 2002-09-17 07:47 · Score: 2, Insightful

I studied XFS fairly extensively after I had to repair a disk that had 1 of its 23 heads fail. From the remaining 22/23rd of the disk I managed to recover almost every file and directory, by writing my own XFS filesystem interpretation code. The on-disk organization of the filesystem is fairly simple and straightforward, I can't imagine where the hundreds of K of code is going.

I've got two guesses. One, XFS has a lot of advanced features. For example, you can actually reserve disk bandwidth - assuming the disk goes a certain speed, XFS is prepared to guarantee your multimedia app that it can get a certain amount of data in or out per second. You can even have sub-partitions within your XFS partition, separated out so that only one sub-partition has guaranteed I/O rates. (I don't remember the details. It's all about multimedia, at any rate. (No pun intended.)) Then there are the more mundane features like ACLs and extended attributes.

Second, since XFS came from IRIX, I'm guessing the SGI engineers did what many people do when porting code from one system to another: reconstruct the original environment. I have no grounds for saying this - it's just a guess - but I can well imagine the SGI engineers deciding to port a lot of IRIX kernel functionality to the Linux kernel rather than adapt XFS to the native Linux way of doing things, which is a lot less featureful / bloated (pick your adjective).

I won't be shocked if the answer does lie in that kjournald daemon -- that XFS is bigger than ext3 because ext3 puts most of the bloat into a user-mode daemon instead of the kernel.

Nope, kjournald is part of the kernel - it's not a separate user-space daemon. You see it in your 'ps' output because it is a kernel thread. As such, it is scheduled like a userspace process, even though it isn't running a userspace program and never drops into user mode. Any time the Linux kernel spawns a new kernel thread, it will show up in 'ps' output that way. They usually try to name threads starting with 'k' so as to tip you off. Unfortunately this is probably not very effective for those of you who use certain non-GNOME desktop environments....

--
"How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README
Re:Why is kernel-image so big? by Bob[Bob] · 2002-09-17 11:36 · Score: 2, Informative

You're basically correct about how SGI did the port... they created an IRIX to Linux VFS mapping layer, as described in the papers on this page: XFS Talks and Papers.
Re:Why is kernel-image so big? by rtscts · 2002-09-17 15:42 · Score: 1

Ouch. What kind of performance hit would that cause? (no I didn't read the link :)

It's been done: Plan9 is the name by F2F · 2002-09-17 05:37 · Score: 2

oh no, it's the plan9 from bell-labs operating system all over again :)

reiser is just implementing what others have done long time ago:

http://plan9.bell-labs.com/sys/doc/9.html

Re:It's been done: Plan9 is the name by Ian+Bicking · 2002-09-17 06:46 · Score: 2

Forgive me if I'm wrong (or better, correct me)... but I didn't think Plan9 did this sort of thing. At least, not the file metadata, or plugin interfaces to files. It just placed a number of kernel interfaces into the filesystem. There's no (significant) reference to "metadata" in that document, and in one instance where they are talking about permissions, they are talking about using chmod on a process, but not the more novel echo newGroupOwner > /processes/processID/group
I think what Reiser is talking about could be truly novel -- I'm sure someone has thought of it before, but I don't know that anyone's made it happen in a real OS. (Though I wouldn't be surprised to see it in an experimental OS)
Re:It's been done: Plan9 is the name by IamTheRealMike · 2002-09-17 07:11 · Score: 2

No no, Plan9 has similar methods but different goals. Plan9 moved various APIs into the filing system. The aim was similar in a general sense, to increase the power of the filing system and so to increase interconnectedness of components.
Plan9 moved for instance the windowing API into the filing system. What ReiserFS is doing is to increase the power of the filing system, turning it into effect a very powerful semi-structured database. The aim is to improve storage of data, not to move APIs around and into different forms.
I've looked high and low, and the closest I think you'll find to this was BeFS, which was really little more than a slick implementation of extended attributes on a normal filing system, certainly not the entire rethink that Hans is advocating.
Microsoft are trying something similar afaik with Longhorn, but that's based on SQL Server last I heard. That's bad for reasons explained in the white paper (imposing unneccesary structure).

Bitkeeper by Anonymous Coward · 2002-09-17 05:42 · Score: 0

Why the hell are Linus et al. using Bitkeeper which has one of the most ridiculous and annoying licenses known to man? I mean, if you read it, it's actually worse than Mickeysoft's!

Christ, it's enough to make me want to go to FreeBSD so that I can feel pure.

duh! just like any other journalled file system by RelliK · 2002-09-17 05:44 · Score: 2

One of the things both mentioned on the gentoo-user email lists and in the XFS FAQ is that if the power drops on a system using XFS, only the *metadata* is journaled, not the data.

Yes, just like NTFS, ReiserFS, JFS, and ext3 (by default). That's kind of the whole point of journaled file systems: in the event of a crash, power failure, etc. you are guaranteed to get a consistent file system, though some data may be lost. Basically, you may lose a few files but you never lose the whole file system. ext3 is the only file system I know of that gives you an _option_ to journal data as well, but that makes it really slow.

--
___
If you think big enough, you'll never have to do it.

Re:duh! just like any other journalled file system by Anonymous Coward · 2002-09-17 06:49 · Score: 0

> Yes, just like NTFS, ReiserFS, JFS, and ext3 (by default).

Incorrect for ext3, it uses the ordered data mode by default, and that journals both file and filesystem data.

It's why I use it instead of the other journalling filesystems.

Bummer! by rocjoe71 · 2002-09-17 05:47 · Score: 2

I lost my entire OGG collection after a powercut because it was on an XFS partition.

I survived powercuts and brownouts just fine when everything was on ReiserFS...

--
Height: 38U, Weight: 0 Newtons, Eyes: #0000FF, OS: Gray Matter 1.0 (Alpha)

Re:Bummer! by Anonymous Coward · 2002-09-17 09:31 · Score: 0

Presumably your entire OGG collection was written to the disk in less than an hour! (before it had a chance to write things)

Just remember to do a sync of the disk, and you are fine.

Also, it sounds like you get a lot of power outages. You might want to think about getting something like a UPS.
Re:Bummer! by rocjoe71 · 2002-09-17 09:59 · Score: 1

Actually, no.
It had been on the new partition for about a week IIRC. I did an fsck or whatever the XFS equivalent was and for all its scanning it couldn't locate the superblocks and blah, blah blah. If/when I go back to XFS I'll have a go at sync-ing the disk, is that like doing a COMMIT TRAN in SQL?
Unfortunate as this was recovery is not such a problem as I have backups on an arcane media format called "Audio CD"!
...I'm so 1986.

--
Height: 38U, Weight: 0 Newtons, Eyes: #0000FF, OS: Gray Matter 1.0 (Alpha)

Related question by Quixote · 2002-09-17 06:01 · Score: 3, Interesting

XFS has a file size limit of 32TB (or so, I think), with a _filesystem_ limit in the EBs. But, I've heard that the Linux VFS layer has a max file size limit of 1TB. Is it possible to create files > 1TB on a Linux+XFS box ? Unfortunately, I don't have the resources to try it out just yet... :-)

Re:Related question by jim3e8 · 2002-09-17 08:12 · Score: 1

Sure you do. All you have to do (if you're programmatically inclined) is open a file for writing, seek to somewhere greater than 1TB into the file, write a single byte, then close it. The resulting file will not actually take up > 1TB of space, because the filesystem doesn't allocate any space for a "hole" like you've just created, until you write into it. However, ls will show the full size. Keep on increasing the limit until you find the maximum file size.
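A minimal sketch of that experiment follows. One detail matters: a seek on its own does not change the file size, so the sketch writes a single byte at the far end to make the size stick. The demo size is kept at 64 MiB so it is harmless on a filesystem without sparse-file support; raising `HOLE` past 1TB to probe the limit is left to the reader. The function names are made up for the example.

```python
import os
import tempfile


def make_sparse_file(path: str, size: int) -> None:
    """Create a file whose apparent size is `size` bytes, mostly a hole."""
    with open(path, "wb") as f:
        f.seek(size - 1)   # jump past the end; nothing is allocated yet
        f.write(b"\0")     # one real byte at the very end fixes the size


def sizes(path: str):
    """Return (apparent size, allocated bytes or None if unknown)."""
    st = os.stat(path)
    blocks = getattr(st, "st_blocks", None)  # not available on every OS
    allocated = blocks * 512 if blocks is not None else None
    return st.st_size, allocated


HOLE = 64 * 1024 * 1024  # demo size; increase to test large-file limits

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sparse.bin")
    make_sparse_file(path, HOLE)
    apparent, allocated = sizes(path)
    print("apparent size:", apparent)
    print("allocated    :", allocated)
```

On a filesystem that supports holes, the allocated figure should come out far smaller than the apparent size. If the size is beyond what the kernel or filesystem accepts, the seek or the write is expected to fail with an `OSError`.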
Re:Related question by foobar104 · 2002-09-17 09:06 · Score: 3, Informative

Just FYI, XFS on IRIX can support files up to 9 million terabytes (9 EB) and filesystems up to 18 million terabytes (18 EB).

It's more complex under Linux. Here's the Linux-specific answer to this question from the FAQ:
Q: Does XFS support large files (bigger then 2GB)?

Yes, XFS supports files larger then 2GB. The large file support (LFS) is largely dependent on the C library of your computer. Glibc 2.2 and higher has full LFS support. If your C lib does not support it you will get errors that the valued is too large for the defined data type.

Userland software needs to be compiled against the LFS compliant C lib in order to work. You will be able to create 2GB+ files on non LFS systems but the tools will not be able to stat them.

Distributions based on Glibc 2.2.x and higher will function normally. Note that some userspace programs like tcsh do not correctly behave even if they are compiled against glibc 2.2.x

You may need to contact your vendor/developer if this is the case.

Here is a snippet of email conversation with Steve Lord on the topic of the maximum filesize of XFS under linux.

I would challenge any filesystem running on Linux on an ia32, and using the page cache to get past the practical limit of 16 Tbytes using buffered I/O. At this point you run out of space to address pages in the cache since the core kernel code uses a 32 bit number as the index number of a page in the cache.

As for XFS itself, this is a constant definition from the code:

#define XFS_MAX_FILE_OFFSET ((long long)((1ULL<<63)-1ULL))

So 2^63 bytes is theoretically possible.

All of this is ignoring the current limitation of 2 Tbytes of address space for block devices (including logical volumes). The only way to get a file bigger than this of course is to have large holes in it. And to get past 16 Tbytes you have to used direct I/O.

Which would mean a theoretical 8388608TB file size. Large enough?
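The figures quoted above can be checked with a few lines of arithmetic. The 4 KiB page size and 512-byte sector size are assumptions (the usual values on ia32), not something stated in the email, and reading the 2 Tbyte limit as "32-bit sector numbers" is likewise my own interpretation.

```python
TIB = 2**40  # bytes in one binary terabyte

# XFS_MAX_FILE_OFFSET from the quoted source: (1 << 63) - 1
max_offset = (1 << 63) - 1
print("XFS offset limit :", (max_offset + 1) // TIB, "TB")

# Page cache: a 32-bit page index times the page size.
PAGE_SIZE = 4096  # assumed ia32 page size
page_cache_limit = 2**32 * PAGE_SIZE
print("page cache limit :", page_cache_limit // TIB, "TB")

# Block devices: assuming 32-bit sector numbers and 512-byte sectors.
SECTOR_SIZE = 512  # assumed
block_device_limit = 2**32 * SECTOR_SIZE
print("block device limit:", block_device_limit // TIB, "TB")
```

Under those assumptions the three numbers come out to 8388608, 16, and 2 terabytes, matching the limits mentioned in the email.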

Complimentary XFL Joke by Anonymous Coward · 2002-09-17 06:01 · Score: 0

So is this the X-treme File System? Sure hope it lasts longer than the XFL did. The XFL was a great league, if only it lasted longer. I was really hoping to see who would win the million dollar game this year.

What ever happened to Tux2 by Anonymous Coward · 2002-09-17 06:03 · Score: 0

What ever happened to the Tux2 file system project? The concept behind Tux2 is far superior. I am disappointed it has kind of peter'd out. XFS was my runner-up for file system projects to root for. So I give them the "high-five" ... but what ever happened to Tux2?

Re:What ever happened to Tux2 by Jeremy+Allison+-+Sam · 2002-09-17 06:42 · Score: 2

Software patents. The author is frightened of a patent held by Network Appliance that seems to cover Tux2 and so has discontinued work on it.

Never mind he has prior art, they have more money and lawyers. Welcome to software development in the USA in the 21st Century....

Jeremy Allison,
Samba Team.

Boot partition by mark_space2001 · 2002-09-17 06:04 · Score: 2, Insightful

> XFS enabled kernels are huge, and that creates a slight problem when booting from floppy.

I think the trick to this is to have a /boot partition, and a /root partition, and make them both ext2. Then you can boot from a floppy, and then boot the larger image on the boot partition. That was the reason given for having those partitions in the Linux Standard Base documents, anyway.

But I'm an engineer, not an IT person, so I could be mistaken as I've never attempted to do it myself.

Transactions? by ndecker · 2002-09-17 06:38 · Score: 2, Interesting

Is there any FS/API that allows ACID style transactions for applications on filesystems?

This would allow cool stuff like guaranteed data consistency or rollback.

Imagine

$ begin_trans
$ rm -rf /
$ rollback_trans

Re:Transactions? by Anonymous Coward · 2002-09-17 07:26 · Score: 0

Not from the kernel, but Sleepycat DB2/3
has a transaction manager as a user library that
supports this. You could use it for your own applications too.
Re:Transactions? by psamuels · 2002-09-17 09:09 · Score: 1

Is there any FS/API that allows ACID style transactions for applications on filesystems?

Not to my knowledge, but you can get quite a bit of mileage out of the fact that mkdir() and rename() are guaranteed atomic. (rename() over an existing file is particularly useful.) Sprinkle in fsync() and file locking as needed.

I know this isn't what you really want - atomic updates of multiple files at once....
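
For a single file, the rename()-over-an-existing-file trick mentioned above can be sketched roughly like this. The helper name `atomic_write` and the choice to put the temporary file in the same directory are illustrative assumptions, and the atomicity only holds on POSIX systems where `os.rename` maps to rename().

```python
import os
import tempfile

def atomic_write(path, data):
    """Replace the file at `path` with `data` (bytes).

    Readers see either the complete old content or the complete new
    content, never a half-written file. Hypothetical helper for
    illustration; relies on POSIX rename() semantics.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays
    # within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # push the data to disk before renaming
        os.rename(tmp_path, path)    # atomically replaces an existing file
    except BaseException:
        # On any failure, remove the temp file and leave `path` untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A call such as `atomic_write("settings.conf", b"key=value\n")` would then either fully succeed or leave the old file in place. For durability across a crash one would probably also fsync the containing directory; that step is left out here to keep the sketch short.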

--
"How can you claim that you are anti-crack, while still writing a window manager?" — Metacity README

ext2fs corrupts data by Anonymous Coward · 2002-09-17 07:34 · Score: 0

It is a little known fact that ext2fs corrupts data on power failure because it does not do synchronous writes nor journaling. For all these years, linux systems have been in danger of losing data, while BSD systems have been rock solid with the world-class FFS filesystem.

Linux disk priorities by 0x0d0a · 2002-09-17 07:39 · Score: 2

The skipping is caused by scheduling latency, as Paul suggests.

I can stand behind my claim that it isn't. I've had multiple processes running loops at the same priority, plus flipping around in X (which I keep at nice -10) to test specifically this, and even with my CPU at 100% load, things stay okay.

*Only* heavy, long term disk activity causes this. I even tried jacking the mp3 player up to nice -20...that's not it.

I had a friend run the same test on his SuSE box with Reiser and 2.4.18 -- same problem, at least as vulnerable. It isn't the filesystem, it's the disk scheduling.

This also isn't short term "blips" -- even if you have other processes eating CPU time, the mp3 player should be getting CPU time frequently. I have #define HZ 1000, and if there is one other process running, I should be seeing 500 context switches a second. Take into account a few extremely low daemons, and you still are getting a good 100 context switches a second. That's way more than enough. I'm talking about a total mp3 dropout for *many seconds*.

In short, the soundcard will be starved of ready to play PCM data long before the decoder will be starved of MP3 encoded data (from disk).

I agree. However, in my case, CPU is nowhere near a scarce resource, but the disk is completely saturated.

I'll make you a challenge. Denice your mp3 player. Run five bash scripts doing infinite loops (while true; do :; done), not niced. At least on my system, this produces no dropouts at all (though six just start to do so). Then run *one* cat process, niced to 20. It will cause dropouts (at least with the mp3 player I was testing with, mpg321, which doesn't precache the whole thing in RAM).

Furthermore, the dropouts I get with the bash scripts, CPU latency issues, are significantly different. They are very short, and sound like static -- a buffer or two wasn't filled. The dropouts I see with the nice 20 cat are far longer, lasting a second, then two, then three.

The problem is that Linux doesn't have any sense of disk scheduling priorities attached to processes. I mean, that's not egregious -- AFAIK, neither does Windows, and I haven't heard of any general purpose OS doing this (though I haven't looked around). But it would be nice, and though some workarounds exist (such as the elvtune that someone else pointed out), I have to specify that latency be emphasized for the whole disk -- what I really want to do is say that "this one process needs high disk priority".

--
May we never see th

Re:Linux disk priorities by 0x0d0a · 2002-09-17 07:42 · Score: 2

Whoops -- a little free with terminology. The mp3 player should be "getting 100 context switches to itself". "Seeing 100 context switches" sounds like there's only 100 per second on the whole system. :-)

--
May we never see th
Re:Linux disk priorities by Anonymous Coward · 2002-09-17 14:53 · Score: 0

I have noticed that it is very hard to get mp3 playing to skip on FreeBSD (with X11Amp [and its newer version] it is almost impossible to do: they use real-time scheduling). I suspect FreeBSD has some form of I/O scheduling going on.

IBM OS/400 by axxackall · 2002-09-17 07:47 · Score: 1

I have not worked with it myself, but a friend told me that he used to work with its FS as if it were an SQL DB, in the manner you described. I wonder if there are any plans to port OS/400's FS to Linux.

--

Less is more !

THANKS! by 0x0d0a · 2002-09-17 08:20 · Score: 2

Still not as technically elegant as I would have liked, but *damn*! Changing the read elevator max latency from 8192 to 32 makes a tremendous difference. No more breakups with the workloads I mentioned!

--
May we never see th

priority io in filesystems by xtp · 2002-09-17 09:38 · Score: 1

I implemented this once as a special patch in Irix/XFS. It's not sufficient to do it just in the file system. The priority must be passed down to the devices. With Fibre Channel devices supporting a good-sized output queue, it is important to utilize the priority queuing supported by the disk channel vlsi chip. Unless there is device and driver support for priority queues, reordering the disk block queue at higher levels in the OS is usually not effective.

This kind of priority is best-suited for real-time processes that have real-time needs for disk access. The problem of keeping a big tarball unpacking operation from sucking up lots of space may be better solved by adding space constraints rather than having prioritized i/o.

Re:priority io in filesystems by 0x0d0a · 2002-09-17 10:02 · Score: 2

This kind of priority is best suited for real-time processes that have real-time needs for disk access.

Yeah. Like my mp3 player. :-)

It's not sufficient to do it just in the file system...With Fibre Channel devices supporting a good-sized output queue, it is important to utilize the priority queuing supported by the disk channel vlsi chip.

This isn't Serial ATA or anything. Plain old vanilla ATA. No command queuing. :-)

--
May we never see th

Unix Permissions BE GONE! by Anonymous Coward · 2002-09-17 10:07 · Score: 0

Finally, a filesystem where one doesn't just have the old user/group/other for granting permissions. XFS supports POSIX ACLs (and EAs), meaning one can now grant arbitrary users and groups the appropriate permissions. GREAT!

Re:Unix Permissions BE GONE! by alyandon · 2002-09-19 09:19 · Score: 1

I find the *nix permission system adequate for most purposes. However, I will gleefully embrace having ACLs at my disposal (should I choose to use XFS) due to the fine-grained control they can provide.

ACLs are simple, and necessary by Nailer · 2002-09-17 10:09 · Score: 2

Actually I think ACLs are the reason why everybody is running as Administrator in Windows. They are just too damn complicated.

They're not. A short ACL is no more complicated than RWX permissions and would achieve the same purpose. However, when you need real fine-grained access control, they're there.

Since RWX permissions don't offer any kind of granular security, admins must hack other access control methods on top of them in software to get the security they need. Which is a lot more complicated.

Besides, your routers and proxy servers are already using them, as are most industrial strength Unixes. ACLs are needed to ensure granular permissions, and are necessary for anyone who runs a file server.

Imagine a bunch of HR documents on a server. These documents are used for things like, for example, firing people.

HR need to access and modify the files
Management need to access, but not modify the files
Nobody else gets any access

It's a common scenario, and one rwx simply can't do unless you hack other access control schemes on top. Which is much more complicated than a three line ACL.
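
To give a feel for how short such an ACL is, here is a toy Python model of the three entries in the HR example. The group names `hr` and `management`, the dictionary layout and the `allowed` helper are all made up for this sketch; it is a simplification, not the full POSIX ACL evaluation algorithm or any real tool's syntax.

```python
# Toy model of the three-line ACL from the HR example.
# Entry names and layout are hypothetical, chosen only for illustration.
ACL = {
    "group:hr": "rw-",          # HR may read and modify
    "group:management": "r--",  # management may read only
    "other": "---",             # everyone else gets nothing
}

def allowed(user_groups, action, acl=ACL):
    """Return True if a user in `user_groups` may perform `action`.

    `action` is "r" or "w". If any of the user's groups has an entry,
    only those entries are consulted; otherwise the "other" entry applies.
    """
    matching = [acl[f"group:{g}"] for g in user_groups if f"group:{g}" in acl]
    if not matching:
        matching = [acl["other"]]
    return any(action in perms for perms in matching)

print(allowed({"hr"}, "w"))           # HR modifying a file
print(allowed({"management"}, "w"))   # management trying to modify
print(allowed({"engineering"}, "r"))  # anyone else trying to read
```

The point of the sketch is only that the whole policy fits in three entries; with plain rwx bits the file has a single owning group, so HR and management cannot be given different rights without extra machinery.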

One guy's experience with Reiser, XFS, and Ext3 by PhotoGuy · 2002-09-17 10:22 · Score: 2

I've used all three of Reiser, XFS, and Ext3fs on a couple of different machines.

I've had *all three* result in corrupted file systems for me, not due to hardware failures, but improper shutdown (typically on laptops that don't resume from a restore, or otherwise die suddenly).

I would say the severity of the corruption in each case, for me, from greatest to least, would be XFS, Reiser, and Ext3.

From what I've read, XFS's design is by *far* the most advanced, and the potential reliability and performance are far greater than the other two; the white papers on it are truly impressive. Hopefully, the corruption I experienced was due to using an earlier version, and that those glitches have been resolved.

The ext3 corruption was in some ways the most frightening, because no problem was ever actually reported, until a fsck I did one day reported a whole bunch of problems.

I think that once people try out the mature XFS, its benefits will become more and more obvious, and hopefully someday it will be the default filesystem in major distributions.

--
Love many, trust a few, do harm to none.

Re:One guy's experience with Reiser, XFS, and Ext3 by parabyte · 2002-09-17 11:40 · Score: 1

I had exactly the opposite experience; after switching to XFS, we had no more problems with corrupted filesystems, on about 30 different development machines.

I will never use ReiserFS again because of the trouble I had.

p.

--
Without order, nothing can exist. Without chaos, nothing can be created.
Re:One guy's experience with Reiser, XFS, and Ext3 by PhotoGuy · 2002-09-17 12:41 · Score: 2

I think a lot of this could be timing. I used a pretty early XFS. I suspect they rapidly worked out the early porting bugs, giving the traditional XFS stability.

--
Love many, trust a few, do harm to none.
Re:One guy's experience with Reiser, XFS, and Ext3 by Anonymous Coward · 2002-09-19 09:16 · Score: 0

I've yet to suffer major corruption using the ext3 filesystem due to sudden freezes/shutdowns. Then again, I can say the same thing about NTFS.

I'm probably lucky I guess.

Dude, you just answered your own question by Nailer · 2002-09-17 10:40 · Score: 2

SGI's modified Red Hat installation...

The other issue that needs fixing with XFS is the lack of an emergency boot disk

Pop in the SGI-modified Red Hat, boot off the CD, and type `linux rescue'.

It's about friggin time! by jmeff · 2002-09-17 13:22 · Score: 1

XFS was only the 2nd journaling file system available and usable on Linux, after ReiserFS, and before ext3 and JFS. Quite odd that cmdrtaco would get the impression now that there are too many journaling file systems in the kernel; where have you been, I ask? It's been there for a couple of years, and running without a problem with over 300 days of uptime.

XFS is more advanced than ReiserFS, JFS, and ext3 in terms of fs feature support on linux, the only thing missing I believe is shrinking the fs online.

More on inodes (was Re:My understanding) by jgarzik · 2002-09-17 15:27 · Score: 3, Informative

AFAIK in ReiserFS inodes are not used the way they are in traditional filesystems. You certainly need to present the inode layer to the OS, but they use balanced trees for block allocation. AFAIK you do not end up with a fixed number of "inodes" after ReiserFS is created.

You're mixing filesystem features up. To clear things up a bit,

Individual inode records need not be of a fixed size.
The inode table (total number of inodes) need not be a fixed size, and it can even be moved around, and spread across, various physical locations on the disk.
The inode table can either have a special-cased storage method (ext2/3), or simply be stored using the filesystem's own block allocation methods -- in effect treating the inode table as a "normal file" (jfs, ntfs, several others). This second method has the property of being very flexible: just as it is trivial to extend the length of a normal file [i.e. append], it is trivial to add new inodes to an inode table that the filesystem treats internally as a "normal file."

There are wild and varied ways to store inodes. But ReiserFS definitely has them. :)
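
The difference between a fixed-size inode table and one kept in a "normal file" can be sketched in a few lines of Python. Everything here is a made-up toy: the record layout (`INODE_FORMAT`), the class names, and the in-memory `BytesIO` standing in for a file are assumptions for illustration, not how ext2/3, jfs or ntfs actually lay out their data.

```python
import io
import struct

# Hypothetical inode record: 32-bit mode followed by 64-bit file size.
INODE_FORMAT = "<IQ"
INODE_SIZE = struct.calcsize(INODE_FORMAT)

class FixedInodeTable:
    """Table preallocated at 'mkfs' time; it can run out of inodes."""

    def __init__(self, count):
        self.table = bytearray(count * INODE_SIZE)
        self.used = 0

    def allocate(self, mode, size):
        if (self.used + 1) * INODE_SIZE > len(self.table):
            raise OSError("out of inodes")
        struct.pack_into(INODE_FORMAT, self.table,
                         self.used * INODE_SIZE, mode, size)
        self.used += 1
        return self.used - 1   # inode number

class FileBackedInodeTable:
    """Table kept in a growable 'normal file'; new inodes are appended."""

    def __init__(self):
        self.file = io.BytesIO()   # stand-in for a file on the filesystem

    def allocate(self, mode, size):
        self.file.seek(0, io.SEEK_END)
        number = self.file.tell() // INODE_SIZE
        self.file.write(struct.pack(INODE_FORMAT, mode, size))
        return number          # inode number
```

With `FixedInodeTable(2)` the third `allocate` call raises `OSError`, while `FileBackedInodeTable` keeps handing out inode numbers for as long as the backing file can grow, which is the flexibility described in the last point above.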

Regards,

Jeff

271 comments