State Of The Filesystem

Transferring Files by jhunsake · 2003-07-14 23:18 · Score: 5, Interesting

I've always wondered how these filesystems with metadata handle transferring files between different systems. It would suck to have all your MP3 info in filesystem metadata and then lose it all when you transferred to a system without fs metadata. Anyone have any insight?

Re:Transferring Files by ThogScully · 2003-07-14 23:24 · Score: 5, Interesting

I would think that would be prevented by the filesystem implementation. A filesystem's features are only as good as the implementation. So if you have fs metadata and are transferring it to another filesystem, whatever tool you are using to move the files, whether it be at a BASH prompt or drag and drop in Konqueror, should figure some way of properly translating data from one system to another without losing anything. Moreover, the tool should be able to do it without knowing any particulars about the actual types of filesystems in use. It's all in the implentation.
-N

--
I've nothing to say here...
Re:Transferring Files by sleeper0 · 2003-07-14 23:25 · Score: 3, Interesting

i am skeptical as well, the only reasonable answer seems to be that it would get lost. Especially if you are transfering between platforms. Why would you want to use a metadata system that you can't count on anyone else using? It seems to me that current packaged in file metadata systems are much more practical for transfer, multi-platform use, streaming, etc.

Other things mentioned in the article like the /etc/passwd manipulation just seem like gimmicks. How many files like that would you by hand support before you ended up deciding it was a bad idea to start it in the first place.

If these are the things they are coming up with to add to filesystems now, they must be pretty much figured out & done.
Re:Transferring Files by peatbakke · 2003-07-14 23:56 · Score: 4, Interesting

There are a few of ways to solve the problem, but only one of them ensures compatibility across many different platforms:

The first is the worst: having a separate metadata file accompany each of your files as they move between file systems. This is a logistical nightmare, and requires soooo many tools to be rebuilt that it's not even worth considering.

The second is the "bundle" concept, as popularly demonstrated in OS X. "iTunes.app" is actually a directory, not a big binary file. The directory is split-up into different sub directories, containing multiple binary formats, languages, etc. To transfer iTunes to another computer, just tar/zip the "iTunes.app" directory, and viola. Easy. Of course, the app will only run on platforms that recognize the bundle format, and can execute the binary. This rules out backwards compatability, and requires that all major system vendors agree on the same bundle format -- and that's not gonna happen.

The third way to do it is to keep metadata within a file (like MP3 tags), and then export them into the filesystem space. For example, an MP3 plugin would pull the data out of "my.mp3" and turn it into "my.mp3/title" "my.mp3/genre" etc. The nice thing about this is that it's platform independent, and backwards compatable.

I'd be keen to see option #3 implimented in a modern file system -- do any current FS's have an API that could support it?
Re:Transferring Files by Surak · 2003-07-15 00:18 · Score: 5, Interesting

Well, remember Reiser4, for instance, is a TOTALLY DIFFERENT system than the standard POSIX filesystems (Reiser3, Ext2/3, UFS, etc.)

The idea is that data is stored as in a database, and small files can be handled efficiently.

Think of it this way: What *is* a filesystem? Really, it's a relational database. We don't tend to think of it as one, because it isn't implemented as one, but that's what the data it represents is. If you think of each file and directory and associated metadata as a record, with directories being pointer records to other records that show relationships.

Anyways, code needs to be implemented to handle this.

We use metadata NOW. A files permissions, owner/group info, date and time stamp, and even the filesize are metadata. We transfer those between different systems right now. With the correct transport method, this metadata doesn't get lost at all. A tarball can sit on any system no problem. Inside the tarball is all the POSIX metadata. That gets transferred from system to system.

Of course, some OSes (read: Windows) don't handle Unix-style permissions correctly, so these are translated into rough-equivalent ACLs with systems such as Samba.

--
My journal has hot /. gossip.
Re:Transferring Files by ThogScully · 2003-07-15 00:28 · Score: 4, Interesting

If the other filesystem has no way to store metadata, then perhaps someone will write a sort of add-on for that filesystem that attaches an XML file to something.mp3 that is called something.mp3.meta that will contain the metadata and because that person is running the "patched" filesystem, nothing will be lost.

As for SSH, that's just one of the tools I spoke of - the tools will have to implement the features of the filesystems - it's not up to the filesystems. If a filesystem addon like I specified above becomes common enough, then perhaps it too will become supported in various tools.

Ultimately, the point remains that the tools will support newer filesystem features and do their best to operate as translators from one filesystem to another. If one filesystem really just can't support the data on another, then something will have to be lost, but if that's the case, then perhaps it's time to upgrade the older filesystem.
-N

--
I've nothing to say here...
Re:Transferring Files by dash2 · 2003-07-15 01:02 · Score: 3, Interesting

True, but as he pointed out for chmod, chown etc., most of these command line user tools could still be implemented, sometimes even as shell scripts.

Obviously, no clever new idea will go anywhere (except on e.g. embedded systems) unless it is compatible with the huge base of existing Unix software, command line tools included.
Re:Transferring Files by jafuser · 2003-07-15 03:29 · Score: 4, Interesting

With the system that was proposed, you would probably need some kind of file packager system (with a rather extensive and flexible library) to assemble a file and it's meta-data into a standard "network" file type before you ship it off to a computer which does not have this kind of filesystem.

Probably something like:

$> mkfile ~/songs/OompaLoopmas

Would take the fs-organized file and meta-data:
~/songs/OompaLoopmas
~/songs/OompaLoo pmas/title
~/songs/OompaLoopmas/artist
~/songs/O ompaLoopmas/genre
~/songs/OompaLoopmas/year
~/so ngs/OompaLoopmas/album
~/songs/OompaLoopmas/track
~/songs/OompaLoopmas/..uid
~/songs/OompaLoopmas /..gid
~/songs/OompaLoopmas/..rwx
~/songs/OompaL oopmas/..type (contains "audio/mpeg")

and make the following basic "networkable" file, with no meta-data other than standard file-system meta-data:

~/songs/OompaLoopmas.mp3
~/songs/OompaLoopmas.m p3/..uid
~/songs/OompaLoopmas.mp3/..gid
~/songs/ OompaLoopmas.mp3/..rwx
~/songs/OompaLoopmas.mp3/. .type (contains "application/octet-stream")

Applications which use network sockets would be aware that before sending a file across the network, it should probably convert it into a network file first.

Perhaps there would be a way to fopen the file with a flag indicating it needs to be run through the library which converts the file into it's original "network form" and feed that stream on the file handle returned from fopen().

It would be much more taxing on the filesystem to change the original file to reflect the meta-data every time something in that meta-data is changed. If one byte of data needs to be added to the front of a long file, that could require the whole file to be re-written. These kinds of changes will probably occur a lot more frequently than the times where a file needs to be made "ready" to go out on the network.

For that reason, I think it would probably be more efficient if the conversion process is made at the time the data is copied to or from the network. A system of libraries to do this conversion would probably be more elegant, and would be much more well integrated with the pupropse of the file system described in the article.

As a bonus, the packager could perhaps also be intelligent enough to do file conversions through some additional modular libraries (if they are available).

$> mkfile -t ogg ~/songs/OompaLoopmas

Obviously you wouldn't be able to convert all files to all types.

$> mkfile -t jpg ~/songs/OompaLoopmas
Error: ~/songs/OompaLoopas contains 'audio/mpeg' and cannot be converted to 'image/jpeg'.

I like most of the ideas proposed in the article, and I hope they get a chance to make it into the mainstream, though I'm sure it will take quite a bit of time before that can happen.

The only thing I'm not quite comfortable about is the mount layers idea, as that seems it could make room for a lot of confusion if it's use is not kept to a minimum.

--
Please consider making an automatic monthly recurring donation to the EFF

Incompatibilities with another system by panurge · 2003-07-14 23:37 · Score: 3, Interesting

Windows support for metadata has always sucked, recognised by every Mac user who moved to a PC and discovered that you had to tell the system what a file did by appending a clumsy tla to the end, and passing gently over the inconsistencies of the support for long and short filenames.

If Linux and related systems move to filesystems with really powerful metadata support, presumably the lockin would be much stronger. Moving a directory from Linux to a Windows system may be possible but the programming to do it will become increasingly painful and the risk of data loss will rise. And with mainframe integrity, why would you want to, Mr. customer?

Apart from the CS issues, is this an attempt to use the embrace, extend weapons of Microsoft against it by turning the Linux filesystem into a full mainframe system, effectively squeezing out Windows servers by a convergence between big tin and small boxes? I guess this is pretty pie in the sky but I'd like to think so.

--
Panurge has posted for the last time. Thanks for the positive moderations.

Reasons to change to another FS by N_gaAdy · 2003-07-14 23:46 · Score: 4, Interesting

I agree that we need a revolution in how filesystems work inside an operating system, but it seems that the arguments placed in this paper had alot of holes.

For one thing, the need for changing a filesystem should not really be solely concerned on space or metadata. I think security, speed of data retrieval, and self correcting error engines should be centered on the new systems.

The reason for the speed of data retrieval as being more important than data size is because hardrives are getting much bigger than they are faster. In five years, we may have 20 terabyte drives, but the access speeds will still be horrible.

Security and error correction are obvious points that should be implemented on a systemwide level. When these features are system wide, then management becomes much easier for all system users.

Data, even metadata, belongs in files, not fs by Fastball · 2003-07-14 23:53 · Score: 3, Interesting

I agree that metadata in the filesystem is a risky proposition. Just on general principle, I prefer my data inside the file and not left with the filesystem. The MP3 metadata example, to me, is like Windows file extensions on HGH. I remember John Dvorak wrote a piece about Windows file extensions a long while back, and he argued that file types, etc. should be inside the file. A header of sorts. I tended to agree then, and I see filesystem metadata as a bad trend.

No need for LDAP? by jackalope · 2003-07-15 00:10 · Score: 5, Interesting

It seems that the author presumed that the only use of LDAP is to provide passwords for user authentication. While that is a common use of LDAP it is not the only use.

It would seem that having a file system that is LDAP aware could be extremely useful. Imagine if your LDAP tree were reflected as a tree in your file system. You wouldn't need to embed LDAP calls in your application, it would just be data in your file system. So looking up an attribute for the current user, or a user, would be as simple as reading a file that holds the value of the attribute.

What I REALLY want by afidel · 2003-07-15 00:16 · Score: 4, Interesting

Is for someone to come up with a real unlimited snapshotting filesystem for linux. I don't want to use user mode hacks (as nice as they are rsync style snapshotting isn't reliable enough), or snapshotting that only allows a shadow copy of the entire volume, I want to be able to tell the users that they can just go into ~/.snapshot/time (where time can be hours, days, or weeks in the past) and copy the file they messed up back into their home directory. Basically I want the most usefull feature of netapps without the HUGE markup =) The cost in admin time both in user interaction and reduced need to do tape retrieval and file restores is immense.

--
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.

Re:What I REALLY want by IamTheRealMike · 2003-07-15 00:54 · Score: 4, Interesting

That kind of thing would benefit from this, especially if integrated into the GUI in a nice way.
In effect, it'd be similar to what Linux does with memory - empty disk space is useless! What is the point of that? Might as well use it for something. Storing old versions of files is an ideal use for it, and combined with an LVM means that the more space you have, the further back in time you can go. That's far more flexible and useful than storing stuff in a recycle bin.
Of course, you run into problems straight away if you aren't using lots of small files. Are you really going to store a new copy of that 5meg presentation every time you hit save? Bad idea. You only want to store what has changed.
But - how? Doing it with text is easy enough, we have diff and patch for that. But what about more complex formats?
Ahhh, now we understand why minimizing primitives is useful in the real world. If the internal structure of a file is split into many small files, each representing a facet of the presentation (each slide, each bullet point on the slide, font weight of that bullet point etc) then suddenly it becomes much simpler to write a generic differential analyzer, we can just snapshot the mini files that changed.
As a result, we increase efficiency to a level at which it becomes feasable to keep the undo transaction buffer on disk - just imagine how much hassle we'd save if all apps used this reliably?
There are still plenty of details missing of course - some things still wouldn't work well, in particular large BLOBs with no structure, like say images, or audio files. I guess if you were smart you could leverage the apps transaction buffer, but still..... that would lead to interop problems.
Still, there are many unexplored possibilities that appear when you increase the abilities and even efficiency of low level parts of the OS like this.

Seems good for some applications,... by ddubois · 2003-07-15 00:23 · Score: 3, Interesting

I feel this very interessting.
For example I like the currents devfs and procfs (although not perfect). Those help me a lot to "debug" new hardware connection. SImple and coherent.

I also imagine some RDB support. Imagine a "select * from account" where account would be asimple directory. Would be nice to use "find" for the where clause :-)

Imagine also implementting OO classes wiht inheritenace using symlinks... And more...

Yes would be very nice!
D.

Re:Good article by Anonymous Coward · 2003-07-15 00:23 · Score: 3, Interesting

Actually, ZIP is extensible. You can add new data blocks to the ZIP file and if you try to unzip it with a decompressor that doesn't understand your extensions, they'll just be ignored. The Be ZIP implementation extends ZIP with support for its native file system attributes meta data, for example.

TAR isn't so much showing its age as dribling into a bib in the old folks home.

The logical conclusion of the Tao of Unix by The+Monster · 2003-07-15 00:30 · Score: 4, Interesting

Reiser's vision of files that are also directories, and putting metadata into files, is just the logical perfection of the fundamental *nix notion that 'everything is a file'. It is this which has given the OS such strength for three decades - before anyone knew about 'object-oriented programming', Unix was behaving in an o-o manner at the macro level, with individual programs doing encapsulation, etc. Having plugins that expose the details as ordinary files, allowing you to access information with your editor of choice, as well as scripting tools such as sed, grep, awk, perl, or Python is what Unix is all about.

The only thing I'm concerned about is backward compatibility - if someone accidentally tries to open a file with a trailing slash, and gets an error because now it's a directory, then it's a Bad Thing.

--

[100% ISO 646 Compliant]
SVM, ERGO MONSTRO.

Metadata in files? by steveha · 2003-07-15 04:25 · Score: 3, Interesting

Just on general principle, I prefer my data inside the file and not left with the filesystem. The MP3 metadata example, to me, is like Windows file extensions on HGH.

I'm with you -- I like self-contained file formats.

But I don't think he was proposing that you not use Ogg tags or MP3 tags; he was talking about the filesystem abstracting the tags. If you changed "Stagnation.ogg/album" to the string "Trespass", then the filesystem abstraction layer should update the Ogg "album" tag inside the file to be "Trespass".

The key benefit here is that you would not need some wacky command-line utility program to let you view and change tags on Ogg files. You could just use the shell. In bash:

for ii in *.ogg; do echo "Trespass" > $ii/album; done

Note that this same one-liner would work if you were in a directory with MP3 files, and you changed "*.ogg" to "*.mp3". Currently, you need to run vorbiscomment for your Ogg files, and mp3info for MP3 files. (I just checked, and sure enough, they take different arguments to do the same operations.)

Personally, I'd like to see a standard metadata portable XML format for legacy systems. People talk about copying a file from a rich metadata filesystem and having new files like .attributes created on the target legacy system; I'd be happier if just one big XML file could be created with the same name as the original file.

Suppose you backup server //rich onto server //legacy, and then you want to restore some files from //legacy to //rich. If all the metadata was stored in a big XML file, then when you copy the file from //legacy to //rich you restore all the metadata; you wouldn't accidentally slice off attributes by forgetting to copy one or more rich attributes files.

You could do most of the fancy tricks of the rich metadata filesystem on a legacy filesystem that used the big XML file to store the rich metadata. And as long as the legacy system is just smart enough to look at the main data part of the XML and leave the metadata tags alone, you could still modify the file with sed, awk, perl or whatever, and then copy the big XML file onto your rich metadata filesystem and still not lose any rich metadata.

Note also that the big XML file could be used to deal with existing rich metadata systems, like resource forks from Macintosh filesystems, or multiple data streams from NTFS files.

steveha

--
lf(1): it's like ls(1) but sorts filenames by extension, tersely

18 of 424 comments (clear)