A Semi-Radical Approach To Avoiding fsck
Dru writes: "This is an article about a hardware technology that is largely unknown
in the new Unix community. In theory, with this inexpensive hardware, your BSD or Linux box could start doing
guranteed
reboots in under 2 minutes (no fsck required) and super fast database writes. It could leapfrog all of the journaling filesystem projects as well.
Yes, I wrote the article. The article is long, detailed, and mentions FreeBSD often. However, I do believe it
is relevant to any other PC Unix. If enough people learn about it, maybe they will start demanding
it from their favorite hardware vendor." With RAM and hard drive space both continuing to decline, I wonder how the speed / use curve for individual PCs' storage (from L1 cache to backups) will evolve. With a similar bent, Arek urges you to "take a look at our company's
Solid State Disk Drives." How'dja like 8 or so gigs of DRAM next time you edit a video or burn a CD?
Sure, a battery backup sounds like it solves this, but consider that DRAM stores its charge on tiny capacitors, and requires a controller to be performing "refresh" access cycles regularily (usually every 15.26 s). This means that not only must the battery be good, but the controller accessing the DRAM must continue providing the refresh cycles without interruption. That may sound simple, but not all DRAMs are created alike.... SDRAM DIMMs have a feature called Serial Presence Detect (SPD) that is a small non-volatile 2-wire serial EEPROM memory that hold identifying data about the size and timing parameters for the memory. A typical DRAM controller would be initialized at boot time... a card like this would require a special DRAM controller that only initialized its timings when the DRAM/battery is first installed. Perhaps the controller would be designed to use relatively slow and conservative timings, always, so it'd never be able to reinitialize to other settings (that could be wrong) and/or stop providing the critical refresh at any point.
The point is that to retain memory, DRAM requires not only power but a properly operating controller to supply the refresh cycles. Magnetic media maintains its memory without either of these conditions. Compared to magnetic media, DRAM is very volatile. "Mission Critical" data, whatever that may be, would be existing at tiny charges on the very tiny capacitors, which could dissipate in only about 4-8 ms, if the DRAM controller doesn't perform perfectly.... inside a computer (designed as a reliable server) which has just crashed for some unknown reason!
PJRC: Electronic Projects, 8051 Microcontroller Tools
This has been talked of for quite a time, and is hardly radical. Whats more it is not an alternative to journal based filesystems, but logically its an adjunct to them.
:-) ]
First you have your filesystem that buffers transactions in a journal that is streamed to disk. Then, for performance, by avoiding all those extra seeks, you put the filesystem journal on another device - say a small fast dedicated disk. Then you make that device a NVRAM device rather than something based on spinning rust.
Whats more, if you are interested in something like mail systems, where you get a lot of transactions that *must* committed to stable storage (although a lot of MTAs don't do that in spite of the wording of RFC821), and you use a fileystem like ext3 with a data journalling mode, then putting the journal onto NVRAM makes a huge difference - by the time it comes to the point where data would be committed to the disk from the journal, most of the data (ie e-mail messages) is now unwanted (since the messages have been delivered to final local destination or for onward transmission) and so you don't even need to do the disk ops...
All of this is pretty much available now in ext3 other than the tools to get the journal onto a NVRAM disk - and thats just detail.
So, nice idea, needs more flesh, a little more infrastructure needed round it.
[Those who came to the London UKUUG Linux Conference might well have heard these discussions before going on in various corridors