State Of The Filesystem

very Reiser/plan9 specific... by grey1 · 2003-07-14 23:35 · Score: 4, Insightful

...and not very general. Interesting for its comments on what's being tried out in R-FS & Plan9 but certainly doesn't manage to be a general summary of what's going on.

How about the changes coming in 2.6 (like xfs support built in)?

The article makes some good points, but for me it could have done with a rewrite to make it more general, to separate the analysis of filesystem implementation problems from the technical detail, and to include more examples from other file systems.

--
"we demand rigidly defined areas of doubt and uncertainty!"

Re:Transferring Files by groomed · 2003-07-14 23:36 · Score: 4, Insightful

Generally you lose the data, unless you wrap it in another format to encapsulate all the information. This is what Macheads did on Classic MacOS: they .hqx'd or .bin'd their files before transferring them to another system. It's not ideal. The alternative, flat streams-of-bytes, is not ideal either (and not true: even in Unix, files have some metadata that doesn't translate very well).

Hopefully in the future our filesystems and transfer protocols will evolve to have some reasonably broad common ground where metadata is concerned (a development similar to the diminishing need to accommodate DOS 8+3 filenames).

Re:Transferring Files by axxackall · 2003-07-14 23:37 · Score: 4, Insightful

Metadata is very application-specific, and most filesystems are agnostic about it. Typically it must be handled by another layer on top of the FS.

Often that layer is a database. I suggest you try ZODB, the database in Zope; it's very good at handling files as documents, with a lot of unified metadata about them.

Another good example to study is Subversion, which is a revisioning/versioning metadata-management layer on top of a regular FS.

You may research and find some software implementing a layer (on top of a regular FS) specially designed to handle MP3 playlists. But again, that would be a layer on top of the FS, not a filesystem by itself.
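To make the "layer on top of the FS" idea concrete, here is a minimal sketch that keeps metadata in a side database keyed by file path. It uses SQLite purely for illustration (it is not how ZODB or Subversion work internally), and the table name, column names and sample values are all made up for the example.

```python
import sqlite3

def open_store(db_path=":memory:"):
    """Create (or open) a side database that maps file paths to metadata."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS meta ("
        " path TEXT NOT NULL, attr TEXT NOT NULL, val TEXT,"
        " PRIMARY KEY (path, attr))"
    )
    return conn

def set_meta(conn, path, attr, val):
    # INSERT OR REPLACE keeps exactly one value per (path, attr) pair.
    conn.execute(
        "INSERT OR REPLACE INTO meta (path, attr, val) VALUES (?, ?, ?)",
        (path, attr, val),
    )
    conn.commit()

def get_meta(conn, path):
    """Return all metadata recorded for one path as a dict."""
    rows = conn.execute(
        "SELECT attr, val FROM meta WHERE path = ? ORDER BY attr", (path,)
    )
    return dict(rows.fetchall())
```

The filesystem itself never sees any of this; if a file is renamed or copied behind the layer's back, the metadata stays attached to the old path, which is exactly the weakness of handling it outside the FS.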

--

Less is more !

Re:Why? by cperciva · 2003-07-14 23:41 · Score: 4, Insightful

You're missing the point. chmod would still exist as a userland program; it is the kernel call which would be removed.

To the user, there would be no change; to the userland programmer, there would be no change; to the C library developer, there would be a change (to implement chmod in terms of the existing filesystem operations); and to the kernel developer there would be a change (mostly in the direction of reduced complexity due to a smaller number of necessary functions).
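A rough sketch of what the library side might look like: chmod built from nothing but open/write/close on an attribute file. The `<file>/..mode` naming is an assumption borrowed from the `..uid` and `..owner` examples elsewhere in this discussion, and since stock filesystems don't expose attributes this way, the demo models a "file" as an ordinary directory.

```python
import os
import tempfile

def attr_path(path, name):
    # Hypothetical naming scheme: <file>/..<attribute>
    return os.path.join(path, ".." + name)

def chmod(path, mode):
    """Userland chmod: no dedicated kernel call, just a write to a file."""
    with open(attr_path(path, "mode"), "w") as f:
        f.write(format(mode, "o"))

def stat_mode(path):
    """Read the mode back by reading the same attribute file."""
    with open(attr_path(path, "mode")) as f:
        return int(f.read(), 8)

def demo():
    with tempfile.TemporaryDirectory() as root:
        obj = os.path.join(root, "foo")
        os.mkdir(obj)  # stand-in for a file that exposes its attributes
        chmod(obj, 0o644)
        return stat_mode(obj)
```

Callers of `chmod()` see the same interface as before; only the body changes.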

--
Tarsnap: Online backups for the truly paranoid

Security? by Inode+Jones · 2003-07-14 23:41 · Score: 4, Insightful

Before adopting any of these ideas, one must consider the security implications of doing so.

If we assume that the filesystem is decoupled from the access control layer in the kernel, then one must ensure that any operation that potentially affects security is adequately controlled.

For example, on systems with _POSIX_CHOWN_RESTRICTED, the following ought to be illegal:

cp foo/..uid bar/..uid

This can be accomplished by making the ..uid files mode 444. Without _POSIX_CHOWN_RESTRICTED, the ..uid file would be mode 644. However, we have now moved a systemwide security feature into the filesystem. If multiple filesystems are configured into one kernel, then they ought to be consistent; otherwise the security model will be flawed.
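As a toy model of the point above (not any real kernel interface): the restricted-chown policy becomes nothing more than the mode bits on a hypothetical ..uid file, so every mounted filesystem has to agree on those bits. The function names and mount points are invented for illustration.

```python
def uid_attr_mode(chown_restricted):
    """Mode of the hypothetical ..uid file under each policy."""
    return 0o444 if chown_restricted else 0o644

def owner_may_give_away(mode):
    # The owner can rewrite ..uid only if the owner-write bit is set.
    return bool(mode & 0o200)

def policies_consistent(mounted):
    """True if every mounted filesystem exposes ..uid with the same mode.

    `mounted` maps a mount point to the ..uid mode that filesystem uses.
    """
    return len(set(mounted.values())) <= 1
```

With one filesystem at 444 and another at 644, `policies_consistent` returns False, which is the flawed-model case: a file could be given away simply by living on the wrong mount.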

As for things such as allowing access to an environment, doesn't that break encapsulation? It means for a certain filename, the filesystem must grovel through a user-space process to find the environment. Also, if a change in some external environment immediately affects some partially-related processes (e.g. daemons started from that shell), then a whole new raft of security holes will come up based on a process' environment or filesystem layout changing unexpectedly.

Cool ideas, but let's be careful lest we make a steaming pile of Swiss cheese.

Re:Why? by IamTheRealMike · 2003-07-14 23:43 · Score: 4, Insightful

I think people are distracted by the examples he gave, which make the point clear but are perhaps not representative of what this would be used for in real life.

GConf was a better example. ATM using GConf is, well, not hard, but you have a lot of extra machinery involved, new APIs to learn and so on. Basically all that machinery does is control the backends and give change notification (it does stuff like schema validation as well).

It'd be *much* easier to use GConf if in order to read a value, you didn't have to load up the GConf libs (which in turn depend on CORBA), or parse XML files. At the moment that's really the only way to do it, but in most environments/languages it's far easier to manipulate files and directories than it is to talk to a CORBA server or bind APIs into them.

You also get an increase in efficiency. Parsing XML is kind of kludgy - XML is not a particularly efficient format to store stuff in. It's a good compromise between humans and machines, but both of us have to do lots more work to meet in the middle. The reason it's used, rather than lots of small files, is that otherwise GConf would be too slow. In fact, they are already talking about removing yet more of the files/directories to speed things up, and sticking them all in the same XML file.

Being able to have a configuration system that truly leveraged the filing system would make a lot of stuff easier, more reliable, and faster (because you can take advantage of filing systems that are really really tuned to take advantage of advanced data structures).
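To illustrate, a minimal sketch of a configuration store that is just files and directories: one small file per key, with the key path mapped onto the directory tree. This is not GConf's actual layout or API; the function names and the sample key are assumptions, and there is no schema validation or change notification here.

```python
import os
import tempfile

def set_key(root, key, value):
    """Store one setting per file; '/' in the key becomes a directory level."""
    path = os.path.join(root, *key.split("/"))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        f.write(value)

def get_key(root, key, default=None):
    """Reading a value is a plain open/read; no library, server or parser."""
    path = os.path.join(root, *key.split("/"))
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return default

def demo():
    with tempfile.TemporaryDirectory() as root:
        set_key(root, "desktop/background/color", "#336699")
        return (
            get_key(root, "desktop/background/color"),
            get_key(root, "desktop/missing", "n/a"),
        )
```

Any language that can open a file can read these settings, which is the attraction; whether it is fast enough depends entirely on how well the filesystem handles huge numbers of tiny files.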

It won't really impact the way you do things like set file attributes today. Most of the changes would be under the hood. But used well, everything would become easier for the developers, and so more advanced and slicker for the user.

databases need transactions? by ajs · 2003-07-14 23:43 · Score: 4, Insightful

I'm really getting tired of the ever-creeping assertion that transactions are required for [x]. At first x was ACID-compliant relational databases, and such was true because ACID was defined as such. However, then I started to see assertions that relational databases had to be ACID-compliant (mostly from the anti-MySQL camps who were ignoring the long history of highly valuable, non-ACID relational databases).

Now, in this article, I see the assertion that databases in general require transactions, and thus cannot be supported by a filesystem.

Worse, the logic is self-refuting, as the article previously states that a filesystem is a database, just a limited one. As it happens, POSIX-type filesystems are quite powerful, and let's not kid ourselves into thinking that they have not served us well for 20-30 years! Yes, changes are coming and I'm frankly quite impressed by Hans Reiser's accomplishment in finally coming up with a balanced-tree-based filesystem. Many have tried and failed where he succeeded.

That's because his was a great step forward, not because the old UNIX filesystems weren't also. Let's stop trying to re-define terms so that we can explain why the last 20 years were the dark-ages. They simply were not.

Re:databases need transactions? by afidel · 2003-07-14 23:58 · Score: 4, Insightful

Transactions are required for reliable databases and filesystems. If you don't mind occasional corruption then you can throw out transactions; otherwise you need them and you need to eat the cost (memory, access speed, CPU overhead, whatever). Since PCs are generally faster than most people need at the moment, making them more reliable seems like a worthwhile goal.
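For a sense of what the cost looks like when the filesystem gives you no transactions, here is one common way applications approximate an all-or-nothing update on a POSIX filesystem: write a temporary file, sync it, then rename it over the original. It is a sketch, not a full transaction (one file only, no isolation), and the function name is made up.

```python
import os
import tempfile

def atomic_write(path, data):
    """Replace the contents of path with data (bytes), all or nothing."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory so the rename
    # stays within one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the new bytes to disk before renaming
        os.replace(tmp, path)     # readers see the old file or the new one
    except BaseException:
        # On failure the old file is untouched; just drop the temp file.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

The extra write, the fsync and the rename are the overhead being traded for reliability; a stricter version would also fsync the containing directory.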

--
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.

Sorry, but by Morth · 2003-07-14 23:55 · Score: 5, Insightful

This article seems to just be the author brainstorming or feeling excited about reiserfs. It's hardly a "summary of developments in the filesystem". Now if he was asking about opinions on his article it'd be fine, but he's not, so I'll just discard this as another non-news.

Re:no, no, no by GammaTau · 2003-07-15 00:04 · Score: 4, Insightful

if that crock, that bag-on-the-side, that mess is what we have to look forward to, I think i'll switch to BSD.

I mean, accessing owner data by travelling into a directory then backwards out of it again like: vi /directory/..owner is a big ugly crock.

Are you so sure that you would hate it? After reading the PDF, I was thinking of two things:

1) He suggests that files ought to be used for everything
2) The old wisdom: "Everything in Unix is a file"

I don't think it's any more radical than treating network sockets as files. Sure, it might feel a little weird at first, but once you got used to it, the simplicity would outweigh the clumsiness of existing implementations.

It's also very easy to put together a shell script that imitates the existing implementations and install it as /bin/chown or whatever you wish to replace.

Re:good read, but less relevant by sultanoslack · 2003-07-15 00:44 · Score: 4, Insightful

This is said by someone who obviously hasn't done any real world application profiling. It's quite the opposite -- CPU is relatively rarely a limiting factor in desktop applications, dealing with the HDD very often is.

This is very often why adding more memory to a system makes it seem more responsive -- larger disk buffers, less need for disk based virtual memory.

Basically hard disks are very often *the* limitation; CPUs are fast.

Some of these ideas are VERY short sighted by irw · 2003-07-15 01:04 · Score: 5, Insightful

I wish people with clever ideas to redesign POSIX namespaces would spend ten years in system administration first so they realise what's involved with managing REAL WORKING SYSTEMS.

Some of the ideas might well lead in useful directions, but some (at least as described in the paper) are plain silly. viz:

1) With overlaid mounts:

suppose my home dir is mounted read-write over a read-only system root, and I do not have a "/bin/prog" in my home dir. Consider:

cp /bin/prog /bin/prog

First time, it copies the system /bin/prog into my home fs - Counter-intuitive to the path semantics. If I run this a second time it copies my copy of /bin/prog over itself - Inconsistent.

2) Attributes in the namespace

We have a rather carefully written setuid chown/chgrp/chmod replacement which can be run by users in an "admin" group, and allows devolution of 1st-line support tasks to nominated users. It won't touch files whose uid/gid is below 100, so they can only touch non-system files.

If an attribute such as the file uid is exposed as file/..uid and cp is supposed to handle what chown does, the above breaks big-time. We now need a custom cp replacement. Either that or we have to add an ACL for the admin group to every file we want them to manage, which is a great deal of effort and would likely end up inconsistent.

Contrary to the paper, setuid and PARTICULARLY setgid are NOT going to go away in the real world any time soon, as far as files are concerned. Ports less than 1024 are a different matter and I agree with the document.

3) Consider the number of file descriptors involved if /etc/passwd becomes a hierarchy of files. Just logging in one user will involve multiple open()-read()-close() operations. Whilst these might be efficiently implementable at fs-level, it is still very inefficient in user space, or will at least require a dramatic rethink of unix tools.
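To put a number on point 3, a small sketch that models a per-user hierarchy and counts the open() calls needed to assemble one record. The layout (one directory per user, one file per field) and the field list are assumptions for illustration; the paper's actual layout may differ.

```python
import os
import tempfile

FIELDS = ("uid", "gid", "home", "shell")

def write_user(root, name, record):
    """Store one user as a directory with one small file per field."""
    user_dir = os.path.join(root, name)
    os.makedirs(user_dir, exist_ok=True)
    for field in FIELDS:
        with open(os.path.join(user_dir, field), "w") as f:
            f.write(record[field])

def read_user(root, name):
    """Return (record, number of open() calls needed to build it)."""
    record, opens = {}, 0
    for field in FIELDS:
        with open(os.path.join(root, name, field)) as f:
            opens += 1
            record[field] = f.read()
    return record, opens

def demo():
    with tempfile.TemporaryDirectory() as root:
        write_user(root, "alice", {
            "uid": "1000", "gid": "1000",
            "home": "/home/alice", "shell": "/bin/sh",
        })
        return read_user(root, "alice")
```

Even this cut-down record costs one open-read-close cycle per field, against a single open for the flat file.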

Re:Transferring Files by greenrd · 2003-07-15 02:15 · Score: 4, Insightful

If a filesystem introduces new types of metadata, they don't magically get supported by tar.

Yes they do, if they're seen by tar as ordinary files. That was one of the main points of the article, which not many people here seem to have read (as per usual).
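A quick way to see this, using Python's tarfile module as a stand-in for tar: if attributes show up as plain files, an archiver that knows nothing about them still picks them up. The `..owner` and `..mode` names are assumptions taken from this discussion, and since stock filesystems don't expose attributes this way, the "file" foo is modelled as an ordinary directory.

```python
import io
import os
import tarfile
import tempfile

def archive_names(directory):
    """Tar a directory in memory and return the member names, sorted."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        tar.add(directory, arcname="foo")
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r") as tar:
        return sorted(tar.getnames())

def demo():
    with tempfile.TemporaryDirectory() as root:
        obj = os.path.join(root, "foo")
        os.mkdir(obj)  # stand-in for a file that exposes its attributes
        for name, text in (("data", "hello"), ("..owner", "alice"), ("..mode", "644")):
            with open(os.path.join(obj, name), "w") as f:
                f.write(text)
        return archive_names(obj)
```

The archiver needed no changes; whether restoring such an archive should be allowed to set the owner is the security question raised elsewhere in the thread.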

--
Female Prison Rape in NY

Re:Transferring Files by zatz · 2003-07-15 02:31 · Score: 5, Insightful

A filesystem is nothing like a relational database. I wish people would stop making this comparison, because it's completely misleading and unhelpful. A filesystem is not a set of user-defined tables, each of which contains an unordered set of rows. Queries and joins are not possible. Constraints and null values are not supported. Files within a directory have an inherent order. Files are variable-length and byte-addressable. Duplicate "rows" are not permitted. The principal relationship modeled is hierarchy... ever heard of a hierarchical database?

--

Java: the COBOL of the new millenium.
