Using Relational Databases as Virtual Filesystems?
"To conquer our fears we're trying to get a handle on exactly what is where, with the goal of reorganizing the true physical locations of data to minimize the business impact if any single NFS server goes down. At the moment, the plan of attack is to construct a relational Oracle 8.1.6 database on Linux which will basically mirror the filesystem in a DB. To accomplish this, I'm writing a horde of scripts using the Perl DBI which will poll the entirety of the NFS filesystems on our network and create what basically amounts to a virtual filesystem in the DB, which we can then drill into for specific information in much less time than it would take us to search through the actual filesystems in question. In addition, we gain the ability to maintain historical data, which allows us, among other things, to know exactly what went wrong if a luser rm's, mv's, or cp's the wrong thing to the wrong place.
Has anyone tried this before? And is this even a good idea? Does anyone know of existing packages that will do this? I'm really curious what the Slashdot community thinks of the idea. I was several hours into this before someone said to me, 'Do you realize you're writing a filesystem in SQL?'"
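For anyone trying to picture the scheme described in the question, here is a rough sketch of the "poll the tree, store metadata, keep history" part. It is not the submitter's code: it uses Python and an in-process SQLite database as stand-ins for Perl DBI and Oracle, and the table name `file_snapshot`, its columns, and the functions `scan_tree` and `missing_since` are all made up for illustration. Each scan is tagged with a timestamp, so two scans can be compared to see what disappeared in between.

```python
import os
import sqlite3
import time


def scan_tree(conn, root, scan_time=None):
    """Record one metadata row per file under root, tagged with scan_time."""
    scan_time = int(time.time()) if scan_time is None else scan_time
    # Hypothetical schema: one row per (scan, path), metadata only, no contents.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS file_snapshot (
               scan_time INTEGER NOT NULL,
               path      TEXT    NOT NULL,
               size      INTEGER NOT NULL,
               mtime     INTEGER NOT NULL,
               uid       INTEGER NOT NULL,
               PRIMARY KEY (scan_time, path)
           )"""
    )
    rows = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            rows.append((scan_time, full, st.st_size, int(st.st_mtime), st.st_uid))
    conn.executemany("INSERT INTO file_snapshot VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def missing_since(conn, old_scan, new_scan):
    """Paths present in old_scan but absent from new_scan (deleted or moved)."""
    cur = conn.execute(
        """SELECT path FROM file_snapshot WHERE scan_time = ?
           EXCEPT
           SELECT path FROM file_snapshot WHERE scan_time = ?
           ORDER BY path""",
        (old_scan, new_scan),
    )
    return [row[0] for row in cur]
```

Note that this only indexes metadata. It can tell you that a path vanished between two scans, but not who removed it or where it went, and it is only as fresh as the last scan.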
It was an interesting idea. I think that the problem they had in MTS will be the same with your idea: not everything fits neatly into the DB model. In fact, some things really have to be shoehorned in.
The insightful reader will be saying, "But wait! You also have to shoehorn stuff into the conventional FS model." True enough. The question is how much fits naturally and how much has to be shoehorned.
My contention is that the conventional model is a better fit for most stuff. That's especially (perhaps sadly) true because of legacy software that expects the conventional model. Perhaps a ground-up OS and application implementation would be able to rethink some of those issues and find new insights. But I'm naturally skeptical.
There is also the issue of performance. I know little about DBs (my loss), but it seems to me that if the FS is stored in an existing relational system, you're going to have to warp some stuff to make it fit. I'd suspect that either you're going to have to make every file be a different table, or you're going to have to store the contents of every file as a variable-length text field. Either option is going to have really nasty effects on the efficiency of the DB, which has been highly optimized under the assumption that each table contains tons of highly homogeneous records.
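To make the second option concrete, here is a naive sketch of "every file is a row with a variable-length field". It is only an illustration under my own assumptions: SQLite via Python's `sqlite3` module stands in for whatever relational system you'd actually use, and the `file_content` table and the helper names are hypothetical.

```python
import sqlite3


def open_store(db_path=":memory:"):
    """Open a database holding one row per file: path plus raw contents."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_content ("
        " path TEXT PRIMARY KEY,"
        " data BLOB NOT NULL)"
    )
    return conn


def write_file(conn, path, data):
    """Create or overwrite a file; the whole value is stored in one field."""
    conn.execute(
        "INSERT OR REPLACE INTO file_content (path, data) VALUES (?, ?)",
        (path, data),
    )
    conn.commit()


def read_file(conn, path):
    """Return the full contents of a file as bytes."""
    row = conn.execute(
        "SELECT data FROM file_content WHERE path = ?", (path,)
    ).fetchone()
    if row is None:
        raise FileNotFoundError(path)
    return row[0]


def append_file(conn, path, extra):
    """Append by reading the whole value and writing it all back."""
    try:
        old = read_file(conn, path)
    except FileNotFoundError:
        old = b""
    write_file(conn, path, old + extra)
```

In this sketch even a small append rereads and rewrites the entire value, and the rows range from a few bytes to whatever the biggest file is, which is about as far from "tons of highly homogeneous records" as you can get. A real database may offer better ways to handle large values; this is just the shoehorned version.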
I wouldn't want to dive into that kind of can of worms as an "I want to use it in production" project. It might make interesting research on a 5-year horizon, though.