How To Implement A Database Oriented File System
ALundi writes "A really great read from Andrew Orlowski over at The Register on how Benoit Schillings and Dominic Giampaolo created the 64-bit journaled and attribute-based Be File System. Schillings and Giampaolo discuss a variety of design and implementation issues, including data integrity and file system performance." Interesting in the context of MSFT's plans to implement a DB filesystem in future versions of MS Windows.
XFS is also "database-like". But BFS seems to be rather more ambitious an effort -- and very intriguing.
This is one of several BeOS features that the Open Source community should really consider stealing. But let's consider these features individually, with one eye on whether they're likely to achieve acceptance outside the ranks of BeOS enthusiasts. Let's not waste time on wholesale BeOS clones and compatibility layers. Those are exercises in denial. BeOS was a nice piece of work, but it's as dead as CP/M. Deal with it.
The possibilities with the Be file system were pretty much infinite. There were little things in the OS that would show you this, the address book and email "client" being some great examples. It was incredibly powerful. The only problem was when transporting files from one operating system to the other. And, even in this, Be did an admirable job.
My fear, and I think the reality, is that Microsoft will not be so kind.
jrbd
Damn right they are! For good reason too: it's cranky and fussy and likes to corrupt itself. When the school I went to threw the Microsoft Official Courseware labs out the window because they were impossible to implement, I got my first taste of why AD as it stands is pretty much useless.
If Microsoft had stayed compliant with open standards like LDAP and Kerberos 5, AD would be much less of a nightmare than it is now. But no, typical MS, they had to "embrace and extend" it. As a consequence, they have shot themselves in the foot.
This is the reason why most MS shops hold desperately on to their NT4 PDCs even though 2K has NT4 beat nine ways to Sunday. 2K cannot do the old-fashioned SAM-based domain even if you cajole it, beat it about the head and shoulders, or ask it nicely. And for most shops, that kind of domain is all they need.
Of course if they went with Samba they could decommission their old fugly NT4 PDC, heh heh...
Knowledge is power. Knowledge shared is power multiplied.
Even more than normal Microsoft bashing, this sounds like a huge challenge for MS to get right. I can't imagine that they'll manage to retrofit a DB filesystem and make it perform adequately on the first try. It would be tough enough even if they started with a clean slate and a small, independent team of top talent. Unfortunately, although they do have some first-class developers, they also have tremendous legacy baggage and a group-think culture.
I'll wait for at least Service Pack 2 before I put any real data on an MS DB filesystem.
Unix filesystems lag terribly. They don't store the MIME type. They don't store the preferred app to open a file in. They don't store metadata like the artist and song name if the song happens to be an mp3. They don't have the ability to add GPS position metadata to my digital camera photos. Searches are horribly slow. All unices use different directory conventions. You can't uninstall an app by just moving its icon to the trash (except on OS X). App preferences are stored in all sorts of different ways (except on OS X).
Linux is a nice remake of a legacy os, but is hardly the future.
The open source community needs a good object storage to base a more futureproof os on. Badly. (And a way better UI than XWindows can give us.)
Not much user, lots of system and iowait, that's what. We run into a whole new realm of needing accounting for these kinds of things.
"An object declared as type _Bool is large enough to store the values 0 and 1." -- 6.1.2.5, C99 standard.
Mod this up!
This guy has it right. I have been working on AS/400s for YEARS! The OS is built around the DB2 database. And IBM beat MS to the idea a long time ago. But MS likes to make everyone think they actually "Innovate" things, when in fact they don't! They buy up companies, steal technologies, etc... The only thing MS has perfected is the marketing engine. And yes, it is true MS uses AS/400s. In fact, MS wanted me to come interview to work on their team that did the MS --> AS/400 integration work. (I told them to blow me, I wouldn't work for a company I despised!)
The Truth is a Virus!!!
Tedious hacks? Tell me, how do you view meta-data in an xterm at the moment? Have you ever done ls -l in a shell? See that extra information? That's meta-data. Why would it be so hard to add a flag to ls so you can view the "content-type" meta-data?
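To make the point concrete, here is a rough sketch of what such a listing could look like. It is only an illustration: the attribute name `user.mime_type` is a made-up convention, `os.getxattr` exists only on Linux, and the fallback to guessing from the file name is there just so the sketch runs everywhere.

```python
import mimetypes
import os
import stat

# Hypothetical attribute name; real systems choose their own conventions.
ATTR_NAME = "user.mime_type"


def content_type(path):
    """Return the stored content type if one is available, else guess from the name."""
    getxattr = getattr(os, "getxattr", None)  # only defined on Linux
    if getxattr is not None:
        try:
            return getxattr(path, ATTR_NAME).decode("utf-8", "replace")
        except OSError:
            pass  # attribute not set, or the filesystem has no xattr support
    guessed, _ = mimetypes.guess_type(os.path.basename(path))
    return guessed or "unknown"


def long_listing(directory):
    """Build 'ls -l'-style lines with one extra column for the content type."""
    lines = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        info = os.lstat(path)
        lines.append("{} {:>8} {:<24} {}".format(
            stat.filemode(info.st_mode), info.st_size, content_type(path), name))
    return lines


if __name__ == "__main__":
    print("\n".join(long_listing(".")))
```

The extra column is just one more field pulled from per-file metadata, which is all ls -l does with size and permissions already.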
Syllable : It's an Operating System
Ugh, don't get me started... Here's a short list:
- Only a single piece of metadata can be stored (filetype)
- Combining data (filename) and metadata (filetype) kind of belies the definition of metadata in the first place
- It can get very confusing to see things like file.txt.rtf
- The tendency of systems to want to hide the extension, and the resulting confusion when your FTP client says the file is named stuff.zip but your interface just says "stuff"
"Isn't it easier and faster to look at the filename than metadata and file contents?"
Only if the metadata is stored in some other inode.
In the case of BFS, the inodes are always 1024 bytes, but the inode information is only about 300 bytes; that means BFS has about 700 free bytes in the inode--free in the sense that they don't take up any more space, but also free in the sense that you don't need to do an extra seek or an extra read to get them. Now that's free! You can store about 700 bytes of attributes that way, but beyond that it needs to allocate more blocks (I don't know if that requires another inode, though).
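The arithmetic above can be sketched as a toy model. This is not the real BFS on-disk layout: the class name, the cost formula (name bytes plus value bytes, with no per-attribute header), and the "overflow" bucket are all assumptions made for illustration. Only the 1024 and roughly-300 figures come from the post.

```python
INODE_SIZE = 1024        # bytes per inode, as stated above
CORE_INODE_BYTES = 300   # approximate size of the regular inode fields


class ToyInode:
    """Toy model: attributes live inside the inode until the spare space runs out."""

    def __init__(self):
        self.inline = {}    # attributes that fit in the inode's spare bytes
        self.overflow = {}  # attributes that would need extra blocks (extra I/O)

    def inline_free(self):
        used = sum(len(name.encode()) + len(value)
                   for name, value in self.inline.items())
        return INODE_SIZE - CORE_INODE_BYTES - used

    def set_attr(self, name, value):
        """Store a bytes value; return 'inline' or 'overflow' to show where it went."""
        self.inline.pop(name, None)
        self.overflow.pop(name, None)
        cost = len(name.encode()) + len(value)
        if cost <= self.inline_free():
            self.inline[name] = value
            return "inline"
        self.overflow[name] = value
        return "overflow"


if __name__ == "__main__":
    inode = ToyInode()
    print(inode.inline_free())                        # spare bytes before any attributes
    print(inode.set_attr("mime", b"audio/mpeg"))      # small attribute
    print(inode.set_attr("thumbnail", b"\x00" * 800)) # too big for the spare space
```

Small attributes such as a MIME type or an artist name ride along with the inode read, which is why looking them up costs no extra seek in this model.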
Also, in this case, even if it is just slightly faster to look at the filename, you lose every other possible metadata feature, just to get an essentially unmeasurable increase in performance. I'd take BFS and all of its features at half the speed if I could use it on any OS I wanted.
All of that means that all corporations will eventually be forced to migrate to AD whether they like it or not. How corporations pay to get their options taken away from them and make themselves bitches for MS never ceases to amaze me. The CIOs of America are awfully fond of saying "thank you sir may I have another!".
War is necrophilia.
You gotta love how every Unix/Linux/Windows user now talks about how file extensions are so bad. A few years ago these same people used to say how crap Apple's idea of meta data was, and that file extensions were better (cross platform, etc, etc). The arguments that I've had with Unix people at different places I've worked. It seems to take the rest of the world at least 10 years to catch on to ideas...
Close, but replace "filename" with "attributes". On most filesystems, an inode is used to access the file. The filesystem also stores attributes like date created and write permissions. To transfer a file from a Mac to a FAT-based MS OS, you need to package the file to retain the metadata, as the Mac metadata (attributes) is more expressive than the FAT filesystem allows. This is no shortcoming of the Macintosh, just an unfortunate result of MS-DOS FAT being considered the standard lowest-common-denominator filesystem. Transferring files from Unix to a FAT filesystem inflicts similar metadata loss, including multiuser data. This does not mean that FAT is superior; rather, the contrary. FAT is not the most restrictive filesystem either, as at least it has file hierarchy data (directories or folders).
Note that MS has remedied this shortcoming with NTFS, which is more expressive than many Unix filesystems and is fully capable of maintaining full HFS metadata. This is why Services for Macintosh (or whatever MS calls it) requires an NTFS FS to run. Metadata is much more elegant than "structured files", which seem to be what you might prefer. A big downside with structured files (like the ID3 tags in MP3 files) is that if you do not know the predefined format for the file structure, then you cannot access the metadata. This prevents the usage of a standard systemwide metadata store, which can be very useful in GUIs and multiuser systems, to say the least.
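As an illustration of the "you must know the format" problem, here is a minimal sketch that reads an ID3v1 tag, the simple 128-byte trailer some MP3 files carry. The function name `read_id3v1` is made up for this example, and newer ID3v2 tags (stored at the start of the file) are ignored entirely.

```python
import struct

# ID3v1 trailer: "TAG", title, artist, album, year, comment, genre = 128 bytes.
ID3V1 = struct.Struct("<3s30s30s30s4s30sB")


def read_id3v1(data):
    """Return the ID3v1 fields found at the end of the file bytes, or None."""
    if len(data) < ID3V1.size:
        return None
    tag, title, artist, album, year, comment, genre = ID3V1.unpack(data[-ID3V1.size:])
    if tag != b"TAG":
        return None  # no ID3v1 trailer; the metadata, if any, is in another format

    def text(raw):
        # Fields are fixed width, padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre": genre,
    }


if __name__ == "__main__":
    fake_audio = b"\xff" * 16  # stand-in for audio frames
    trailer = ID3V1.pack(b"TAG", b"Song", b"Artist", b"Album", b"1999", b"", 17)
    print(read_id3v1(fake_audio + trailer))
```

Every tool that wants the artist name has to carry this kind of format-specific parser, whereas a filesystem attribute would be readable through one generic call.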
-castlan
gots to preserve my mods
And hierarchical databases were replaced in the 70s by the relational database model because it was impossible to effectively deal with data if you are restricted to a simple hierarchical view.
Here's an analogy: Hierarchical databases are to relational databases as the GOTO statement is to object encapsulation.
This might explain why some of us get so frustrated at being forced to continually "navigate" up and down our "folder" hierarchies with the "tree widget". That's just a graphical metaphor for 60s computer technology.
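For contrast with tree navigation, here is a small sketch of finding files by attribute rather than by location. It uses an in-memory SQLite table as a stand-in for a filesystem attribute index; the table layout, the function names, and the sample paths are all invented for the example and say nothing about how BFS or any Microsoft product indexes attributes.

```python
import sqlite3


def build_index(files):
    """Load {path: {attribute: value}} into an indexed attribute table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE attrs (path TEXT, name TEXT, value TEXT)")
    db.execute("CREATE INDEX attrs_by_name_value ON attrs (name, value)")
    for path, attrs in files.items():
        db.executemany(
            "INSERT INTO attrs VALUES (?, ?, ?)",
            [(path, name, value) for name, value in attrs.items()],
        )
    return db


def find(db, name, value):
    """Return every path whose attribute matches, wherever it sits in the tree."""
    rows = db.execute(
        "SELECT path FROM attrs WHERE name = ? AND value = ? ORDER BY path",
        (name, value),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = build_index({
        "/home/a/music/track1.mp3": {"artist": "Example Band", "type": "audio/mpeg"},
        "/tmp/downloads/track2.mp3": {"artist": "Example Band", "type": "audio/mpeg"},
        "/home/a/notes.txt": {"type": "text/plain"},
    })
    print(find(db, "artist", "Example Band"))
```

The query never mentions a folder: the two matching files live in unrelated directories and are found by what they are, not where they are.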