Conquest FS: "The Disk Is Dead"
andfarm writes "A few days ago, I sat in at a presentation of a what seems to be a new file system concept: Conquest. Apparently they've developed a FS that stores all the metadata and a lot of the small files in battery-backed RAM. (No, not flash-RAM. That'd be stupid.) According to benchmarks, it's almost as fast as ramfs. Impressive." The page linked above is actually more of a summary page - there's some good .ps research reports in there.
this is great. We all have seen this coming, but how is the industry going to take this and implement it. My bet is it won't. The only way that it will take hold is if you can find some small application that will take and apply it.
One quick draw back I see in this system is on a computer where you have more small files than available RAM space. How does the system decide which small files to keep on the regular disk and which ones to keep in RAM?
wow... thats all i have to say, something like this could make waiting over a min or two to boot totally obselete... sort of like a "turn on" welcome to your OS of choise type of thing... i also tons of other possibilities such as high end graphics work and maybe even phasing out the disk as we know it 100%.... all solid state.. the possibilities are endless
The idea of RAM as storage is great and all, but can we work towards the elimination of STORAGE as RAM before we get to RAM as storage?
I mean, why *DO* we still have pagefiles?
A MS Gripe: I seriously don't understand why I can't turn it off completely. With multiple GB of RAM dirt cheap, writing to a disk pagefile slows my system down-- It has to!
Execute in Place (EIP)- currently, your system will copy the program to RAM. Here, you'd copy everything from volatile ram to Non-volatile ram - a rather wasteful operation don't you think?
This is not just for exe's but for datafiles as well...
Slashdot's rate-of-post filter: Preventing you from posting too many great ideas at once.
This is 'stepping-stone' technology, along the same lines as hybrid gasoline/electric vehicles. They're still depending on hard drives for mass data storage. It's just the executables, libraries, and other application-type goodies that they're sticking into RAM.
You can do exactly the same thing by sticking an operating program into any sort of non-volatile storage (EPROM, EEPROM, memory card, whatever), and including a hard drive in the same device if need be. The new filesystem they're describing simply shifts more of the load to the silicon side instead of the electromechanical realm.
In short; The Disk is far from dead. This is just a first step in that direction.
Bruce Lane, KC7GR,
Blue Feather Technologies
http://www.superssd.com/products/tera-ramsan/
Up to a terabyte even.
-n
http://www.remix.net/
God is dead. (Neitzche) Tape is dead. (Innumerable pundits) Disk is dead. (Conquest FS) Yet somehow, they all seem to be alive and kicking as of right now. I wouldnt be throwing a wake just yet for any of 'em.
I wonder if a kernel could realize many of the same performance benefits with current filesystems by identifying directory inodes and small file inodes and lowering the probability of those falling out when it's time to free pages.
What do you mean you can't turn it off? I haven't had a pagefile since I hit 1GB of RAM in my desktop. The Windows XP option (System Properties, Advanced, Performance Options, Advanced, Change) is even called "No paging file".
They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety.
a very aggressive caching algo? I mean other than the battery backed part. You should be able to attain similar performance benefits using a purely software solution assuming your app doesn't do a lot of "important" writing to "small" files (where Conquest would do it all in RAM and still be able to persist it). But things like dll's,exes and whatnot don't change.
.ps files to check).
I guess I can understand the benefits (as minor as they may be relative to price), but the thing that bothers me the most is why does it take 4 years and NSF funds to come up with something that seems so obvious?
And one major problem would be getting over the fact that if the machine craters, you can't just yank the drive and have everything there, though I assume they have some way to "flush" the ram (can't read the
Pardon my ignorance, but what happens if the battery fails? Of course, this is highly unlikely, but just a scenario.
In a conventional disk the data would remain even if power is switched off, but a RAM would lose the data (or get corrupted or cannot be sure if the data is exactly the same).
Thank you.
GrimReality
2003-04-21 15:51:18 UTC (2003-04-21 11:51:18 EDT)
Even with tons of RAM pagefiles are a GOOD thing and if used properly (Even MS uses them properly these days) they speed up the system on the whole.I have done a *little* OS development so I may not be an expert but I do have an idea what I 'm talking about.
.... wait this sounds familiar ;)
... but then again I could just be not well enough informed (a little knowledge is dangerous ;) and rambling like an idiot.
The swapfile is where the OS puts things it hasn't used in a while. On windows this would probably include things such as the portions of IE that are now part of the OS and you are forced to have loaded even if you are not using the box for web browsing. Having placed these items in the page file frees up room for things that are currently usefull such as IO buffers/cache (disk and/or net) that can dramatically increase speed by storing things such as recently used executables, meta-information
That being said I think the technology discussed in this article is a bit too single minded. I think adding an extra level in the storage heirarchy between main ram and non-volitile HD is probably a good thing. My idea is to add a HUGE pile of PC100 or similar ram into a system and have this RAM accessed in a NUMA style which is becoming very popular. The nintendo GameCube uses a form of this aproach, there are two types of RAM with a smaller-faster section and a larger-slower section.
The problem with my idea is that the price difference b/w cheap-slow RAM and fast-expensive RAM is not enough to make it worth the extra complexity currently. But, I would guess that if someone took the effort to design/build cheap slow RAM they could find a niche market for a system accelerator device
Thoughts on tech, Software Engineering, and stuff
My raid controller basically does this already..
It's an old IBM 3H 64 bit PCI model with 32MB of ram and battery backup.. newer 4H models support more ram.. but how is this any different?
The most used and smallest files stay in the cache.. the rest are called when needed.. and if god forbid the power fails, and the ups fails.. the card has a battery backup to write out the final changes once the drives come back online.
For those who are not PS worthy.
The paper
Looks like a great server side file system. This is finally a step away from this whole "file" madness. All storage and IO should be memory mapped, and all execution should be in place. Anything else is just silly.
[-- Trust the Monkey --]
Dead? I don't think so. Get back to me when they start using Non-volatile RAM and the price per byte is equal or less than the price for harddrives. Until then, the HD is going to be alive and kicking.
One thing I've always wondered though. Why not release an OS on an EPROM? It would make boot time and OS operations extremely fast. I'm still surprised to this day that this isn't mainstream. Ahhh, the good ol' days of Commodore when you OS was instantly on when you turned on the PC.....
It's better to burn out than to fade away
Though I don't think it's a useful general-purpose concept to have a RAM-only FS, I'm hoping that fast RAM will catch up to magnetic disks in size. A standard FS/VM will end up caching everything if the RAM is available. I seem to recall that ext3 on Linux, if given the RAM for cache, is faster than many ramfs/tmpfs implementations. Plan9 completely removes the concept of a permanent filesystem versus temporary memory. Everything is mapped in memory, and everything is saved to disk (eventually). It's a neat concept, and it happens to go very well with 64-bit pointers and cheap RAM.
I'm hoping that hardware people will realize that we need huge amounts of fast memory...whether or not we think we need it. We're stuck in a "why would I need more RAM than the applications I run need?" kind of mindset. I think that the sudden freedom 64-bit pointers will provide to software developers will result in a paradigm shift in how memory (both permanent and temporary) is used. Though like all paradigm shifts, it's difficult to predict ahead of time exactly what the change will be like...
Note that hard drive failures are still common and likely to be much more common than a battery failure, as it would be trivial to implement a scheme through which batter recharding would be automatic while the computer was plugged in. The battery would only be directly employed when the system was unplugged or the power was out. Even in that case it would be also trivial to implement a continuous/live backup system to a nonvolatile media like a hard disk, which by that point would be ridiculously cheap.
First, the full title of the talk was "The Disk is Dead! Long Live the Disk!" We make no claim that disk manufacturers are going to go out of business tomorrow; history suggests that the technology will survive for at least a decade, and probably more than two. Talk titles are intended to generate attendance, not to summarize important research results in 8 words.
Second, the most common objection to the work boils down to "just use the cache". This point has been raised repeatedly on Slashdot over the past few years. However, if you read our papers or attend one of my colloquium talks (UCSC, May 22nd -- plug), you'll learn that LRU caching is inferior for a number of reasons. We were surprised by that result, but it's true. Putting a fake disk behind an IDE or SCSI interface is even worse, since that cripples bandwidth and flexibility.
Third, for people worried about battery failures, the only question of interest is the MTBF of the system as a whole. All systems fail, which is why we keep backups and double-check them. If your disk failed every 3 days, you couldn't get work done, but there was a time when we dealt with a failure every few months. Conquest's MTBF hasn't yet been analyzed rigorously, but I believe it to be more than 10,000 hours, which is good enough to make it usable.
Finally, I have chosen not to put my talk slides on the Web, at least not for the moment. But you're welcome to mail me with questions: geoff@cs.hmc.edu. It might take me a few days to answer, so be patient.