Interview With Reiser4 Author Hans Reiser
An anonymous reader writes "KernelTrap has an interesting interview with Hans Reiser, the author of two revolutionary Linux filesystems, Reiser3 and Reiser4. Reiser3 was the first journaling Linux filesystem. Reiser4 is a complete rewrite that is claimed to offer amazing performance and a new plugin architecture offering semantic enhancements to rival Microsoft's WinFS and Apple's Spotlight. Comparing Reiser4 to WinFS, Reiser says in the interview, "Reiser4 is a much more mature design, representing a 10 year effort"."
I was wondering over the weekend, on a whim, whether it would make sense to create a cross-platform library that abstracts metadata/search functionality. It would provide one uniform set of utility functions, which would turn into calls to WinFS on Windows, calls to Spotlight on OS X, and calls to ReiserFS on Linux.
;)
But I don't know enough about WinFS or Spotlight or ReiserFS to know if this would be even remotely useful or is just nonsense.
Irritable, left-wing and possibly humorous bumper stickers and t-shirts
I've been waiting until it's deemed "safe" to use, but it seems it's going on 2 years now of "not ready yet". I know it's ready when it's ready, but is there a timetable for it? I don't have a fast enough spare box to test it out, and I want to dig the faster FS perf on a SATA hard drive. Keep going Hans!
bad_outlook
--
Is this vague enough for you?
I recently switched a laptop from Linux with ReiserFS3 filesystems to FreeBSD 5.4 using UFS2 filesystems. The sizes of the filesystems were the same, and the usage pattern (program development, web browsing, etc.) was the same.
The UFS2 filesystems had the feel of being quicker than the ReiserFS3 filesystems. That said, I do not have any numerical data to back this up. However, untarring a large tarball consisting of many smallish files under FreeBSD felt quicker than doing the same under Linux.
Would this difference be caused by the filesystems themselves, or would it most likely be a difference between the Linux and FreeBSD IO subsystems? Would ReiserFS4 be comparable to, if not better than, FreeBSD's UFS2 for workstation-style workloads?
Cyric Zndovzny at your service.
Comparing ReiserFS and WinFS is a bit like comparing Qt and Explorer - nonsensical. They're different things, operating at different levels, to serve different purposes.
Come on, how are the parties involved supposed to carry any credibility when making such a *basic* and *fundamental* misunderstanding - /WinFS is not a filesystem/. They also seem to misunderstand what Spotlight is - again comparing it as a filesystem, when it isn't.
Last time I used Reiser I had to reformat back to ext. The starvation problem basically made the kernel freeze when flushing buffers during large streaming writes. Is the "large writes starve reads" issue gone yet? When I say large, I am referring to streaming 12 gigs (an hour of DV) in a continuous write.
James
The thread you link to nicely illustrates the political maneuvering necessary to get a filesystem accepted into the kernel. This is one good reason why filesystems should be implementable in userspace.
There are so many wonderful things that can be done with filesystems once they can be added from userspace. How about transparently accessing files through SSH or FTP, from any application?
There are various tricks that allow filesystems to be implemented in userspace, such as LUFS and FUSE. Other filesystems (especially the ones that are portable to other systems) pretend to be NFS.
All of these suffer a performance penalty, but I wonder how much that really matters when you're interacting with disks or networks, which are very slow compared to the CPU and RAM anyway.
Many things besides filesystems would benefit from being implementable in userspace, but filesystems are what I personally have thought about most.
Please correct me if I got my facts wrong.
I really would like a metadata-driven system. Instead of the traditional file dialog for saving or opening files, it would be cool to just specify some metadata and have the file thrown on a heap of files. I think this is kind of what WinFS is trying to accomplish, but above the filesystem level. Hopefully that is in the future of every OS. And if not, or if some better idea comes along, then I guess some time in the future I will pick up a database implementation book and a file systems book, study up and work on it myself.
Normally you have to release something before it can mature. Otherwise it's called development...
Still waiting on that plugin system, thanks. Should be good though. No hurry, but if you could even begin to release some info on /how/ it's structured, how devels will be able to use it, how we'll be architecting solutions with Reiser4 plugins, it'd be much appreciated.
-Lord "I hope I haven't missed anything in all these years waiting" Myren
One of the driving concepts behind ReiserFS is that metadata is nothing special, and it should be presented in the same namespace as the files themselves. If you read the article, it talks about using 'cat' and other simple tools to manipulate the metadata. Think something like 'cat /home/foo/music/some.mp3/artist' to display the person who performed a song.
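To make the "metadata in the same namespace" idea concrete, here is a minimal Python sketch. It does not talk to Reiser4 at all: an ordinary directory named some.mp3 stands in for the file, and the attribute names (artist, title) and the helpers read_attr and list_attrs are made up for illustration. The point is only that plain open/read calls would be the entire API.

```python
import os
import tempfile

def read_attr(file_path, attr):
    """Read one metadata attribute exposed as a pseudo-file beneath file_path."""
    with open(os.path.join(file_path, attr), encoding="utf-8") as f:
        return f.read().strip()

def list_attrs(file_path):
    """List the attribute names available beneath file_path."""
    return sorted(os.listdir(file_path))

# Stand-in only: a normal directory plays the role of the file-as-directory.
with tempfile.TemporaryDirectory() as root:
    song = os.path.join(root, "some.mp3")
    os.mkdir(song)
    for name, value in {"artist": "Example Artist", "title": "Example Title"}.items():
        with open(os.path.join(song, name), "w", encoding="utf-8") as f:
            f.write(value + "\n")

    print(list_attrs(song))          # attribute names, like 'ls some.mp3/'
    print(read_attr(song, "artist")) # like 'cat some.mp3/artist'
```

Nothing here needs a metadata library; any tool that can read a file could read an attribute.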
It depends on the metadata. Think about file permissions. That's metadata. All the files you create are given defaults based on your umask, and you can go in and change them at any time.
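The permissions example can be tried directly. The sketch below assumes a POSIX system (umask behaves differently on Windows); the helper name create_with_umask is made up for illustration. It creates a file under a given umask, shows the default permissions that result, then changes them afterwards.

```python
import os
import stat
import tempfile

def create_with_umask(directory, name, mask):
    """Create a file under the given umask; return (path, permission bits)."""
    old_mask = os.umask(mask)      # install the new mask, remember the old one
    try:
        path = os.path.join(directory, name)
        # Request rw-rw-rw- (0o666); the bits set in the mask are cleared.
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
        os.close(fd)
    finally:
        os.umask(old_mask)         # always restore the caller's mask
    return path, stat.S_IMODE(os.stat(path).st_mode)

with tempfile.TemporaryDirectory() as tmp:
    path, mode = create_with_umask(tmp, "example.txt", 0o022)
    print(oct(mode))               # default derived from the umask
    os.chmod(path, 0o600)          # metadata can be changed at any time
    print(oct(stat.S_IMODE(os.stat(path).st_mode)))
```

With a umask of 022, a request for 0666 normally yields 0644, which is the "default you never typed in" that the post is talking about.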
In order to expose some of this metadata to the end-user in a GUI, yes, there will probably need to be some new UI work done. It doesn't all just magically work, it has to be presented to the end-user in some way that will make sense to them. So what I would expect is that the filesystem and plugins will be finished and done, and able to be manipulated by programs and shell scripts, and then it will take further work to integrate this metadata support into GUIs and file managers in a way that's useful to non-power-users.
Bogtha Bogtha Bogtha
but it is horrible for a large networked filestore. The hierarchies that the secretaries at work come up with are convoluted at best, and it takes a long weekend to even attempt to comprehend the logic of their naming convention. When they lose a file, or forget what it was named or when they last worked on it, but can tell you that it was an ISO file (which in itself is ironic), coupled with the fact that they often change the file extensions on their files to random numbers, or try to change it to .pdf to save it as a pdf file, metadata makes a lot of sense.
``Well, what I want to know is: How do I get to this metadata? Some extra tool? Some right-click option that I have to select every time I create a file? Will all File dialog boxes have to be rewritten, and will I have to manually input all this info?''
;-) Maybe we should revive LISP and use reiser4 for efficient storage of CLOS objects? It's an idea I've been toying with for a while...it's crazy enough that it might just work. In the meantime, I already have a simple filesystem for Scheme objects on paper...I just have to get enough time one day to implement it and see if it's usable in practice.
How does metadata get into the ID3 tags of MP3s and the comments in Ogg Vorbis files? Wouldn't it be nice if that info were available through a standard interface? Wouldn't it be nice if the same interface provided access to metadata about movies? Webpages? Images? Search for all movies longer than 2 hours, or search for images of 1024x768 resolution? BeOS has a pretty nice interface to metadata. These sorts of searches are now starting to crop up in Windows, too.
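As a rough illustration of what such a "standard interface" could look like from the application side, here is a small Python sketch. Everything in it is hypothetical: the FileRecord type, the search function, the attribute names (minutes, width, height) and the sample paths are invented for the example and are not taken from BeOS, Reiser4, or any real indexer.

```python
from dataclasses import dataclass, field

@dataclass
class FileRecord:
    path: str
    kind: str                      # e.g. "movie", "image"
    attrs: dict = field(default_factory=dict)

def search(records, kind, **conditions):
    """Return paths of records of `kind` whose attributes pass every condition.

    Each condition is a callable that receives the attribute's value.
    """
    hits = []
    for rec in records:
        if rec.kind != kind:
            continue
        if all(name in rec.attrs and test(rec.attrs[name])
               for name, test in conditions.items()):
            hits.append(rec.path)
    return hits

# Made-up sample index; a real one would be filled in by the filesystem or a crawler.
index = [
    FileRecord("movies/long.avi", "movie", {"minutes": 150}),
    FileRecord("movies/short.avi", "movie", {"minutes": 95}),
    FileRecord("pics/wall.png", "image", {"width": 1024, "height": 768}),
    FileRecord("pics/thumb.png", "image", {"width": 128, "height": 96}),
]

long_movies = search(index, "movie", minutes=lambda m: m > 120)
wallpapers = search(index, "image",
                    width=lambda w: w == 1024, height=lambda h: h == 768)
print(long_movies)   # ['movies/long.avi']
print(wallpapers)    # ['pics/wall.png']
```

The same call shape works for movies, images or songs, which is the whole appeal: one query interface regardless of where the metadata physically lives.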
Reiser4 is about the only file system that implements metadata properly and efficiently. This could be a killer feature for Linux, if only Reiser4 were accepted in the kernel and some software written to take advantage of the features. It shouldn't be too hard to put some functionality in GIMP, Nautilus, some command-line tools, etc.
``Once there's an application which can find all pictures of my dog, or songs with piano in them, and store THAT in the metadata, which I can search somehow, call me.''
AI is soooo 1960s.
Please correct me if I got my facts wrong.
Over the past several years, we had pretty good luck with using Reiser on root and data filesystems. Good luck in the sense that we never encountered something we couldn't recover from. However, we did have more than one instance of filesystem corruption that would crash the kernel (we used it on several of our development servers). The warnings on the 'rebuild tree' utility weren't very reassuring, but it always seemed to work. We also had instances of files being corrupted by random bits of data from other files getting stuck on the end.
I'm migrating our servers slowly over to ext3 as we upgrade them, mainly because it is more mainstream and I prefer my source code sunny side up as opposed to scrambled. I noticed that the same number of files seems to take up less room (10% or so?) on the disk overall with Reiser than with Ext3 (as reported by df).
ayershome.org/users/eric
I've been thinking about starting up a file system project (as you do), and was wondering if anyone has thought of using something like the FUSE kernel module with a database (say MySQL or Berkeley DB) to create an easily indexable file system. The idea is to create a basic proof of concept using FUSE and, if it gets any interest, turn it into a proper (kernelspace) FS.
What sort of problems can I expect to face?
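For anyone wanting to poke at the storage half of that idea before touching FUSE, here is a minimal Python sketch. It swaps in SQLite (from the standard library) for the MySQL or Berkeley DB mentioned above, and it leaves FUSE out completely; the DbFileStore class, its method names and the table layout are all assumptions made for illustration. A FUSE layer would end up mapping read/write/readdir calls onto methods like these.

```python
import sqlite3

class DbFileStore:
    """Toy file store: contents in one table, searchable attributes in another."""

    def __init__(self, db_path=":memory:"):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS files ("
            "path TEXT PRIMARY KEY, data BLOB NOT NULL)")
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS attrs ("
            "path TEXT NOT NULL, name TEXT NOT NULL, value TEXT NOT NULL, "
            "PRIMARY KEY (path, name))")
        # The index is what makes attribute lookups cheap.
        self.db.execute(
            "CREATE INDEX IF NOT EXISTS attrs_by_value ON attrs (name, value)")

    def write(self, path, data, **attrs):
        """Store (or overwrite) a file and its attributes in one transaction."""
        with self.db:
            self.db.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?)", (path, data))
            self.db.execute("DELETE FROM attrs WHERE path = ?", (path,))
            self.db.executemany(
                "INSERT INTO attrs VALUES (?, ?, ?)",
                [(path, k, str(v)) for k, v in attrs.items()])

    def read(self, path):
        row = self.db.execute(
            "SELECT data FROM files WHERE path = ?", (path,)).fetchone()
        if row is None:
            raise FileNotFoundError(path)
        return row[0]

    def find(self, name, value):
        """Return paths whose attribute `name` equals `value`, sorted."""
        rows = self.db.execute(
            "SELECT path FROM attrs WHERE name = ? AND value = ? ORDER BY path",
            (name, str(value)))
        return [r[0] for r in rows]

store = DbFileStore()
store.write("/music/a.mp3", b"fake audio bytes", artist="Foo", year=2004)
store.write("/music/b.mp3", b"more fake bytes", artist="Bar", year=2004)
print(store.find("year", 2004))
print(store.read("/music/a.mp3"))
```

Even this toy hints at the problems to expect: every write becomes a database transaction, large files do not sit comfortably in a single BLOB, and partial reads and writes (seek, append, truncate) need a chunking scheme that is not shown here.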
The topic in channel #gentoo-amd64 on irc.freenode.net has said "Reiser4 is evil" for more than a year. Does anyone know if Reiser4 actually works in an x86_64 environment?
Not really looked at GPFS, but if IBM's history is anything to go by (JFS, M:N threading, the DAISY code translator, etc) it'll be revolutionary, be an inspiration to a thousand projects, and get forgotten as it is overtaken.
Sad, but true - IBM has done masses for Linux, in terms of proof-of-concept, forcing the pace and introducing new ideas. Unfortunately, they then drop the ball. It hasn't mattered much, as others have gone running with the ideas, but it would be nice if IBM saw a real return on their investment by keeping on until their technology is adopted.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Another thing I thought sounded cool was the ability to cat /home/foo/music/some.mp3/raw > /dev/dsp and the mp3 would just play by using a plugin that ran it through an mp3 library. This would allow application developers to just access file/raw rather than worrying about file types and conversions.
If I'm writing an image viewing program I no longer have to worry about hooks to libjpg, libungif, libpng, libevery image file type available. Let the OS care about file types and let applications deal with raw data and focus on interface rather than file types.
Paying taxes to buy civilization is like paying a hooker to buy love.
Well, that makes a lot more sense now, and surely the poster was just confused rather than malicious.
Linus and I disagree on this point, a pity that.
Well, no one stops you from opening and reading 'file/raw/bps', 'file/raw/endianness' etc. As long as we can agree on the common namespace for all audio files, I don't see why it won't work.
OK, but that won't work for the "cat file.mp3/raw > /dev/dsp" case.
The point of my criticism is that raw audio data isn't self-describing, so unlike text, you can't pipe it around without supplying some metadata. IMO a better solution than what you propose is to support an interface like file.mp3/wav, which is raw data, but has a WAV header to tell you how to interpret it.
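To show what the WAV header buys you, here is a short Python sketch using the standard wave module. The function name wrap_pcm_as_wav and the default format (stereo, 16-bit, 44100 Hz) are assumptions picked for the example, not anything a Reiser4 plugin is known to do. It takes bare PCM bytes, prepends a header, and then reads the format back from the result alone.

```python
import io
import wave

def wrap_pcm_as_wav(pcm_bytes, channels=2, sample_width=2, frame_rate=44100):
    """Return pcm_bytes preceded by a WAV header describing how to play it."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)       # 2 = stereo
        w.setsampwidth(sample_width)   # bytes per sample, 2 = 16-bit
        w.setframerate(frame_rate)     # frames per second
        w.writeframes(pcm_bytes)       # header sizes are filled in on close
    return buf.getvalue()

# 100 frames of silence: 2 channels * 2 bytes per sample * 100 frames.
pcm = bytes(2 * 2 * 100)
wav = wrap_pcm_as_wav(pcm)

# A consumer can now recover the format without any side channel.
with wave.open(io.BytesIO(wav), "rb") as r:
    print(r.getnchannels(), r.getsampwidth(), r.getframerate(), r.getnframes())
```

The raw bytes alone could be mono or stereo, 8-bit or 16-bit, at any sample rate; with the header in front, the stream describes itself and can be piped around the way text can.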