Interview With Reiser4 Author Hans Reiser
An anonymous reader writes "KernelTrap has an interesting interview with Hans Reiser, the author of two revolutionary Linux filesystems, Reiser3 and Reiser4. Reiser3 was the first journaling Linux filesystem. Reiser4 is a complete rewrite that is claimed to offer amazing performance and a new plugin architecture offering semantic enhancements to rival Microsoft's WinFS and Apple's Spotlight. Comparing Reiser4 to WinFS, Reiser says in the interview, "Reiser4 is a much more mature design, representing a 10 year effort"."
Berkeley was a lot better than junior high school, but it still involved homework, which deep down in my heart I could never believe in.
I hear you. I always avoided homework as much as possible too.
Bradley Holt
Having other people agree with him?
I am Hans, and this is Franz, and we want to [clap] journal your filesystem.
Ya. Ya. All you little girly men with your FAT and NTFS!
Really, Ya. Makes me sad to see such pathetic file systems!
WinFS exists and is still in development. It's just not set to debut as part of Vista. Whether that means it will debut significantly after that I don't know, though. I think there's an alpha or beta version of WinFS available to developers now.
Honor Among Slackers. A veri
Here's a link to the page that hides the asshats making the pages super-wide with lame comments.
This one is a pretty old review from back when Reiser4 was still experimental. A more recent one would be here, but it too is over a year old.
-----BEGIN PGP SIGNATURE-----
12345
-----END PGP SIGNATURE-----
Anything more specific? I'm the first to admit that he can be rather immature, spoiled and inflammatory but a quick look at the link you offered showed none of these attitudes. Actually the discussion sounded quite civilized, so what's the problem?
Don't think of it as a flame---it's more like an argument that does 3d6 fire damage
If benchmarks are even halfway legit, then this is indeed something amazing.
Never attribute to malice that which can be explained by mere idiocy.
There's a beta version available to developers now. There was a lot of criticism of it when it was first unveiled so they went back to the drawing board and released a new version that claims to address those concerns.
This is my sig.
XFS beta came out 9/22/2000 (its source code was first publicly available on 3/30/2000). The first journaling version of reiserFS was released in November of 1999.
JFS came later than XFS though I can't find the date.
Of course officially ext3 came out before reiserFS, in September of 1999, though ext3 is the real winner. Which produced workable code first I have no idea.
The German magazine iX ran benchmarks on Reiser4 a while ago, and the results were indeed impressive, with two huge downsides however. One is already mentioned in the interview: a reallocator is needed because Reiser4 has a tendency to fragment. The other is much higher CPU usage than every other filesystem.
Comparing ReiserFS and WinFS is a bit like comparing Qt and Explorer - nonsensical. They're different things, operating at different levels, to serve different purposes.
Come on, how are the parties involved supposed to carry any credibility when making such a *basic* and *fundamental* misunderstanding - /WinFS is not a filesystem/. They also seem to misunderstand what Spotlight is - again comparing it as a filesystem, when it isn't.
Last time I used Reiser I had to reformat back to ext. The starving problem basically made the kernel freeze when flushing buffers during large streaming writes. Is the "large writes starve reads" issue gone yet? When I say large I am referring to streaming 12 gigs (an hour of DV) in a continuous write.
James
It's an INTERVIEW with the system's author and he's giving his opinion. Which, come to think of it, is what one DOES in an interview, you know, ask someone what they think? Sheesh.
Any sect, cult, or religion will legislate its creed into law if it acquires the political power to do so.
My understanding is that the kernel developers have pointed out flaws in the benchmarks and he has accepted the criticisms but points out that they are just benchmarks and all benchmarks have flaws. This would not be a problem if he didn't keep referring to the benchmarks when trying to ram a change into the kernel. You can't have it both ways.
It's also my understanding that the key reason kernel developers don't want to accept his patches is that they don't like big megapatches that affect many systems or replicate functionality that is already in the kernel -- it's bad for maintenance. It's also my understanding that he doesn't want to break up the patches himself and he has refused help from others who are eager to do it for him. For him, it's an all or nothing deal -- take it or leave it. The kernel developers say "fine, we'll leave it", but he doesn't accept their decision and continues to complain. Again, you can't have it both ways.
Reiser may be a genius, but even geniuses have to (*gasp*) live in the real world and negotiate with real people. Even if Reiser is smarter than all the kernel developers (doubtful), it pays to treat your so-called "inferiors" with respect. Even janitors and garbage collectors can have wisdom that we don't have and things they can teach us.
The thread you link to nicely illustrates the political maneuvering necessary to get a filesystem accepted into the kernel. This is one good reason why filesystems should be implementable in userspace.
There are so many wonderful things that can be done with filesystems once they can be added from userspace. How about transparently accessing files through SSH or FTP, from any application?
There are various tricks that allow filesystems to be implemented in userspace, such as LUFS and FUSE. Other filesystems (especially the ones that are portable to other systems) pretend to be NFS.
All of these suffer a performance penalty, but I wonder how much that really matters when you're interacting with disks or networks, which are very slow compared to the CPU and RAM anyway.
Many things besides filesystems would benefit from being implementable in userspace, but filesystems are what I personally have thought about most.
Please correct me if I got my facts wrong.
Normally you have to release something before it can mature. Otherwise it's called development...
Still waiting on that plugin system, thanks. Should be good though. No hurry, but if you could even begin to release some info on /how/ it's structured, how devels will be able to use it, how we'll be architecting solutions with Reiser4 plugins, it'd be much appreciated.
-Lord "I hope I haven't missed anything in all these years waiting" Myren
Actually I didn't see much immaturity. Okay, Linus saying Microsoft will get a file system right when hell freezes over seemed a little immature.
Frankly Reiser4 looks like a good project.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
It is very stable already - I use it as the root filesystem on my laptops (mostly because laptop disks are sooo slow and reiser4 mitigates this considerably).
I have not suffered any problems whatsoever in more than a year. I have had power-cuts, battery problems, and even a few kernel panics and so forth due to ACPI bugs, and reiser4 hasn't lost a single file or even needed a fsck.
Not to mention that it's fast as hell.
I still do make weekly backups though, since I don't trust the disk to survive very long - but I trust reiser4 enough to use it as my root fs (only other fs is ext2 for /boot) on my main machine.
One of the driving concepts behind ReiserFS is that metadata is nothing special, and it should be presented in the same namespace as the files themselves. If you read the article, it talks about using 'cat' and other simple tools to manipulate the metadata. Think something like 'cat /home/foo/music/some.mp3/artist' to display the person who performed a song.
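To make the 'some.mp3/artist' idea concrete, here is a minimal sketch in Python. It is not Reiser4's plugin API: `FileNode` and `read_path` are hypothetical names made up for illustration, and the namespace is just an in-memory dict standing in for a real filesystem.

```python
from dataclasses import dataclass, field


@dataclass
class FileNode:
    """Hypothetical file object: raw bytes plus named metadata attributes."""
    data: bytes = b""
    meta: dict = field(default_factory=dict)


def read_path(namespace: dict, path: str) -> bytes:
    """Resolve `path`; one extra component after a file name is looked up
    in that file's metadata, so data and metadata share a single namespace."""
    parts = path.strip("/").split("/")
    # Try the longest prefix first so an ordinary file path still works.
    for cut in range(len(parts), 0, -1):
        key = "/" + "/".join(parts[:cut])
        if key in namespace:
            node = namespace[key]
            rest = parts[cut:]
            if not rest:
                return node.data
            if len(rest) == 1 and rest[0] in node.meta:
                return node.meta[rest[0]].encode()
            break
    raise FileNotFoundError(path)


# Illustrative contents only.
namespace = {
    "/home/foo/music/some.mp3": FileNode(b"ID3...", {"artist": "Some Artist"}),
}
print(read_path(namespace, "/home/foo/music/some.mp3/artist").decode())
```

The point of the sketch is only that the same lookup call serves both the file body and its attributes, which is what lets plain tools like 'cat' work on metadata.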
It depends on the metadata. Think about file permissions. That's metadata. All the files you create are given defaults based on your umask, and you can go in and change them at any time.
In order to expose some of this metadata to the end-user in a GUI, yes, there will probably need to be some new UI work done. It doesn't all just magically work, it has to be presented to the end-user in some way that will make sense to them. So what I would expect is that the filesystem and plugins will be finished and done, and able to be manipulated by programs and shell scripts, and then it will take further work to integrate this metadata support into GUIs and file managers in a way that's useful to non-power-users.
Bogtha Bogtha Bogtha
Yeah, that's not a good link. Try this kerneltrap one. Things have been brewing. I haven't kept up with the most recent stuff, though.
;) If I butchered anybody's perspective, please correct me. I don't do kernel dev or psychology.
It's really a design/people issue. There are the lingering issues of stability and similar, but these are not (as I understand) the original problem.
Reiser4 incorporates some sophisticated metadata concepts ("semantics") that are in effect a software layer over the fs - which is why Hans can compare it to WinFS. Some of these features step into the functionality domain of the VFS and the kernel. Not a bad thing, per se.
Now, we all know the stereotypical kernel dev - technically conservative, concerned about maintenance, not really keen on making big compromises, and annoyed by ego (again, a stereotype). Keep that in mind.
Hans of course wants Reiser4 in the kernel. What's the holdup (from a technical design standpoint)? Well, individuals like Andrew Morton want functionality in the kernel that can be reused in a filesystem-neutral fashion. Reiser4 has a plugin system, but it's a Reiser4 plugin system. Reiser4 and Hans want to extend Linux as an API, which right now will just be for Reiser4.
There are also some lingering details of how this will change the course of filesystem integration in the kernel, in regards to traditional POSIX and Unix-like behavior. I don't recall any enduser problems, but there are few complaints.
Why might this be annoying? Well, Hans wants his fs in the kernel now and he makes the case of its superiority, the market's demand, and the need to compete with companies like Microsoft. I wouldn't be the one to tell kernel devs that they need to compete with MS, but Hans is - to say the least - confident. And he did name the filesystem after himself, so I'm not sure how this couldn't be personal on some level.
The middle ground is to say to Hans: we'll take Reiser4, but we want these Reiser-only features to be ultimately modified for all capable filesystems. Hans insists - and I'm sort of generalizing here - that the details can be sorted out, but right now we should go with Reiser4 and not worry about making it anything but a great fs.
So, Hans took an "assertive" position on why Reiser4 should not only be included in the Linux kernel but also change the kernel. Linus, Morton, and a few others took a stand and said - in so many words - "Hans, we aren't putting your ego into our kernel. Not even experimental."
It would be interesting to see if end users put enough momentum behind Reiser4 to put it into mainline or start it in 2.7.
Is that worth a few flames?
It would be difficult to design. If you look at what APIs exist for this sort of functionality, pretty much the only one that has a significant amount of traction is SQL. And SQL isn't exactly the nicest language to work with. It's implemented with various degrees of compliance and non-standard extensions by various databases. Outside of SQL, the landscape is even more scattered.
The SQL language itself is somewhere in between being a very restrictive domain specific language and a full programming language. The way it is used in practice is by calling it from a real programming language, usually through an interface that leaves the door wide open for injection vulnerabilities.
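For anyone who hasn't run into the injection problem, here is a small sketch using Python's standard sqlite3 module. The `files` table and the `find_unsafe` / `find_safe` helpers are made up for illustration; the contrast between building the query by string concatenation and passing the value as a parameter is the only point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT, owner TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?)",
    [("notes.txt", "alice"), ("secret.txt", "bob")],
)


def find_unsafe(owner: str) -> list:
    # Vulnerable: the caller's string becomes part of the SQL text itself.
    query = "SELECT name FROM files WHERE owner = '" + owner + "'"
    return [row[0] for row in conn.execute(query)]


def find_safe(owner: str) -> list:
    # Parameterized: the value is bound separately and never parsed as SQL.
    return [row[0] for row in conn.execute(
        "SELECT name FROM files WHERE owner = ?", (owner,))]


malicious = "alice' OR '1'='1"
print(find_unsafe(malicious))  # the injected OR clause matches every row
print(find_safe(malicious))    # treated as a literal owner name, matches nothing
```

The safe version is no harder to write, but nothing in the interface forces you to use it, which is the complaint above.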
I believe the problem is that it's difficult to figure out what functionality goes where.
If you want to get a list of all files that have been modified since Monday and whose names do not start with a period, how do you proceed? Do you get a list of all files, then throw away all but the ones modified since Monday, then discard all the ones whose names start with periods?
Do you get a list of all files whose names do not start with periods, then discard all files that have not been modified since Monday? That requires your search interface and implementation to somehow support intelligent matching of the names (more difficult than getting all files whose names start with periods).
Or do you directly query the system for what you want? In that scenario, your interface and implementation have to support complex queries, with subqueries, inequality operators, etc. Are you going to implement all this functionality, just because someone might need it? Is anyone going to be able to understand or implement your interface?
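As a rough illustration of the first two options, here is what the "walk everything and filter on the client" approach looks like in Python with only the standard library. The `modified_since` helper is a made-up name, and `cutoff` stands in for "Monday"; a real query interface would have to push these two tests down into the filesystem instead.

```python
import os
import tempfile
import time
from pathlib import Path


def modified_since(root: str, cutoff: float) -> list:
    """Return names under `root` modified at or after `cutoff` (epoch seconds)
    whose names do not start with a period. Both tests run per entry in a
    single pass, so no intermediate list is built and then thrown away."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith("."):
                continue  # cheap name test first, before calling stat()
            full = os.path.join(dirpath, name)
            if os.stat(full).st_mtime >= cutoff:
                hits.append(name)
    return sorted(hits)


# Small self-contained demo on a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    now = time.time()
    week_ago = now - 7 * 24 * 3600
    for name, mtime in [("new.txt", now), (".hidden", now), ("old.txt", week_ago)]:
        path = Path(root, name)
        path.write_text("x")
        os.utime(path, (mtime, mtime))  # force the modification time
    result = modified_since(root, now - 24 * 3600)
    print(result)
```

Note that the client still has to visit every file. Avoiding that is exactly what requires the complex query interface described in the third option.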
I would love it if a good and cross-platform interface were available, but I don't think it's ever going to happen. If not for the technical difficulties, then because Microsoft won't want to adhere to the standard.
Please correct me if I got my facts wrong.
This is a type of question that, unfortunately, cannot be answered correctly. Well, it can: it depends. But that's not what you are after. As Hans himself pointed out, there are some fsync performance problems with ReiserFS. If you look at PostgreSQL config files, you'll notice a "fsync" setting, and if you look at pgsql-performance mailing list, you'll see frequent mentions of fsync. Obviously, fsync affects DBs (not just PostgreSQL), and ReiserFS may not currently be so great for DBs. However, it is apparently good for large directories (1 directory, lots of files in it). So, it depends how you use your FS.
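For context on why fsync matters so much to databases, here is a minimal sketch of a durable write in Python. It is not PostgreSQL's code; `durable_write` and the file name are invented for the example. A database does something along these lines at every commit, so the time the filesystem takes to complete fsync is paid once per transaction.

```python
import os
import tempfile


def durable_write(path: str, payload: bytes) -> None:
    """Write `payload` and ask the kernel to push it to stable storage."""
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()             # move data from Python's buffer to the kernel
        os.fsync(f.fileno())  # block until the kernel reports the data written out


# Demo on a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "commit.log")
    durable_write(target, b"BEGIN;COMMIT;\n")
    with open(target, "rb") as f:
        written = f.read()
    print(written)
```

Turning the PostgreSQL "fsync" setting off skips the equivalent of the `os.fsync` call, which is faster but risks losing committed data on a crash.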
Describe how you use your FS, and maybe somebody can provide good feedback.
Simpy
I never said the above words attributed to "Rieser".
I am sure of it because I would absolutely never say that "Linux kernel developers do what's right because it is _right_, not because somebody else does it.".
I am just not that nice a guy that I would say such a thing. :-/ I am guilty of saying the opposite at various times. I am known for this, and not particularly liked for it. :-)
This is a forged quote. Note the false defensiveness put into it in the sentence "So there's really no point in trying to push your agenda by trying to scare people with MS activities." That really sounds like someone at MS posted this.
It does not matter so deeply that MS put it into or out of the kernel, what matters is how they layered the code relative to itself --- that is, do they use the FS API, which lacks an insert or excise operation, to repack small objects that they squished together within a file, and does that layering make things slow. I think it almost certainly does make it slow, and it definitely is inelegant.
Hans Reiser
Reiser4 Architect
Namesys
If you're not smart enough to remember how you file things, how are you going to be smart enough to remember the metadata needed to extract the files out of a database?
Remember what?
When I query something, I query what I _want_. The filesystem should provide me my files - there's nothing to remember. I'm already querying the amarok interface with song names and it doesn't hurt. Same for spotlight - people like it.
Second people complain of Reiser4's system overhead
I don't understand those complaints. I've seen benchmarks where reiser4 eats double the CPU time of other filesystems. But then, it finishes the task in half the time.
Which is the whole point of a filesystem, mind you. If your filesystem is eating few CPU cycles, it means it's wasting time waiting for the disk. In a "perfect world", any filesystem would eat 100% - it'd avoid all the I/O. The complaints about Reiser4 eating too much CPU can be partly because it is fast at I/O. I guess their algorithms are also very complex and burn lots of CPU cycles too - if you want to avoid I/O you need complex algorithms after all, right?
CPU cycles are cheap. What do you prefer, a filesystem which doesn't eat CPU cycles (because it sucks and spends all the time waiting for the disk) or a filesystem which eats CPU power because it is fast?
Well, what I want to know is: How do I get to this metadata? Some extra tool? Some right-click option that I have to select every time I create a file?
Anytime you save a file today, you're already manually specifying several pieces of metadata: the filename and the location.
Anytime you access a file today, you're already manually specifying that metadata also.
Consider how many clicks it takes to (graphically) navigate to a file from the root directory. That is exactly the number of metadata labels that you yourself supplied for that unique file's creation.
So, the obvious generalization of this is to get rid of the hierarchy concept entirely. Then, as an earlier poster described, I can naturally tag my music by artist and by genre, instead of using symlinks to cut across trees.
More practically, it would allow applications to install themselves using a unique tag, so that uninstalling (or moving, or archiving) the application requires just one query on just one tag, and is guaranteed to turn up any associated file regardless of its "location."
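A toy version of that idea, sketched in Python: files carry tags instead of living at a location, a query is a set intersection, and uninstalling is one lookup on one tag. `TagStore` and all the file and tag names are hypothetical; no existing filesystem API is implied.

```python
from collections import defaultdict


class TagStore:
    """Hypothetical flat store: files have tags, not locations."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def add(self, name: str, *tags: str) -> None:
        for tag in tags:
            self._by_tag[tag].add(name)

    def query(self, *tags: str) -> set:
        """Return the files carrying every one of `tags`."""
        sets = [self._by_tag.get(tag, set()) for tag in tags]
        return set.intersection(*sets) if sets else set()

    def remove_tagged(self, tag: str) -> set:
        """Delete every file carrying `tag`, e.g. to uninstall an application."""
        doomed = set(self._by_tag.get(tag, set()))
        for names in self._by_tag.values():
            names -= doomed  # drop the files from every tag they carried
        return doomed


store = TagStore()
store.add("song_a.mp3", "music", "jazz", "artist:foo")
store.add("song_b.mp3", "music", "jazz", "artist:bar")
store.add("editor.bin", "app:editor")
store.add("editor.conf", "app:editor", "config")

print(store.query("jazz", "artist:bar"))          # by genre and artist, no symlinks
print(sorted(store.remove_tagged("app:editor")))  # one query finds every app file
```

The config file is found by the uninstall even though it would normally live in a different directory tree from the binary.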
--
Dum de dum.
Freedom is not the license to do what we like, it is the power to do what we ought.
Those are the words of Linus Torvalds in response to someone suggesting that Reiser4 should be merged, in order to stay competitive with WinFS and Spotlight. To counter the reasoning, Linus Torvalds stated the following separate points:
1. WinFS is not the real filesystem, the real filesystem (NTFS) still runs in kernel mode. WinFS is "merely" a set of libraries in user-space, like gnome-storage. So, you can't derive a need to push such functionality into the Linux kernel.
2. Trying to push some functionality into the kernel with the reason to compete with Microsoft's development won't work, because they do what they think is right, not in order to compete with someone else.
"Between strong and weak, between rich and poor [...], it is freedom which oppresses and the law which sets free"
Well, what I want to know is: How do I get to this metadata? ...Once there's an application which can find all pictures of my dog, or songs with piano in them, and store THAT in the metadata, which I can search somehow, call me.
I take it you have not actually tried to use any of these new filesystems and their metadata. Metadata comes from lots of places. It comes from an internet database of music CDs and movie DVDs. It comes from the OS intelligently reading the text contained within various file types (like text, rtf, .doc, PDF, PS, etc.) and extendable by a plug-in type architecture. It comes from applications that assign it based upon given criteria, or from applications that create files which are now starting to assign more and more metadata to those files. It comes from hardware, like when your digital camera or PVR assigns dates to files it creates. It comes from users inputting it by hand, like when they go through their vacation photos and add a description for each picture.
I use this metadata and perform searches on it every day. Why shouldn't I be able to do an easy search on my computer for every document, application, library, etc. that has the string "vpn" in it? Shouldn't I be able to find all references to MPLS in my files, whether or not they are in text, .doc, .pdf, or some other file format? Shouldn't applications on my system be able to find and edit this data as well? Well, now I can (and they can) and I really, really like it.
For some reason you are looking at the current limitations of metadata, i.e. optical recognition can't reliably identify my dog, instead of the advantages, which is all the information that can be reliably searched. Maybe right now I can't search for all my mp3's with a piano in them, but I can automatically tag all the audio coming in over the mic I have attached to the piano with metadata that says it is piano. Now fast forward 10 years and suddenly all of your files have a wealth of automatically generated data associated with them. In 10 years I will be able to search for all the mp3's that have piano in them, because my audio mixing program labels all the files with input from the piano mic with the proper metadata and why not. For a few seconds work up front I, and everyone using my files, gets additional functionality. Now apply that to all files from all sources and suddenly metadata has greatly improved the computing experience.
Get with the times, metadata in the filesystem is here and it is very useful and it is becoming more and more useful every day.
Another thing I thought sounded cool was the ability to cat /home/foo/music/some.mp3/raw > /dev/dsp and the mp3 would just play by using a plugin that ran it through an mp3 library. This would allow application developers to just access file/raw rather than worrying about file types and conversions.
If I'm writing an image viewing program I no longer have to worry about hooks to libjpg, libungif, libpng, libevery image file type available. Let the OS care about file types and let applications deal with raw data and focus on interface rather than file types.
Paying taxes to buy civilization is like paying a hooker to buy love.
Well, that makes a lot more sense now, and surely the poster was just confused rather than malicious.
Linus and I disagree on this point, a pity that.
I just come from a time and place where being objective and modest about your own trade or art speaks far stronger than unmodest self-PR work.
Having read the entire interview, I found nothing in it that made me think of Hans Reiser as engaged in unmodest self-PR work. Contrary to the tiny snippet you quoted, he doesn't slate WinFS. He says that it is doing interesting work. Nor is it particularly immodest to say that his file system is considerably more mature when he's spent almost 10 years more on it than the other.
Reading the article, the parts that you consider immodest seem to me, to be just sincere enthusiasm for his work. And there's nothing wrong with that.
Contrary to the title being "no one cares", I think the replies so far show that people do.
Aide-toi, le Ciel t'aidera - Jeanne D'Arc.
More CPU % time is used because Reiser4 is faster. What should be compared is the overall CPU power needed for a filesystem operation. And even if reiser4 is really using more CPU, remember that the CPU power is growing much faster than hd speed.
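A quick worked example of the difference between CPU percentage and total CPU work, in Python. The figures are made up purely for illustration and are not benchmark results for any filesystem; `cpu_seconds` is a hypothetical helper.

```python
def cpu_seconds(utilization: float, wall_seconds: float) -> float:
    """Total CPU time consumed = average utilization x elapsed time."""
    return utilization * wall_seconds


# Made-up figures for illustration only, not measurements.
slow_fs = cpu_seconds(utilization=0.20, wall_seconds=100.0)  # low %, long run
fast_fs = cpu_seconds(utilization=0.40, wall_seconds=50.0)   # double %, half the time

print(slow_fs, fast_fs)
```

With these assumed numbers both runs cost about the same total CPU time even though one shows twice the CPU percentage, which is why the percentage alone says little about the cost per operation.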
Wondering why i am doing so strange posts? I am trying to get a "+5,Flamebait" or "-1,Insightful" rating.
The quote is genuine, but it is from Linus Torvalds. Gee, he really believes that Linux kernel developers do things because those things are right? Who'd have thought that?
There's nothing a janitor or garbage collector can teach me or you, or most others on Slashdot.
How about humility?
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.