Slashdot Mirror


How To Implement A Database Oriented File System

ALundi writes "A really great read from Andrew Orlowski over at The Register on how Benoit Schillings and Dominic Giampaolo created the 64-bit journaled and attribute based Be File System. Schillings and Giampaolo discuss a variety of design and implementation issues, including data integrity and file system performance." Interesting in the context of MSFT's plans to implement a DB filesystem in future versions of MS Windows.

14 of 232 comments

  1. Database-like by fm6 · · Score: 5, Insightful
    It's important to note that they ended up with something "database-like" rather than a true relational DBMS. That distinction is often overlooked (not least by MySQL enthusiasts!) and is pretty important. The thought of a workstation file system that has all the performance and maintenance issues of a "real" DBMS strikes me as pretty scary.

    XFS is also "database-like". But BFS seems to be a rather more ambitious effort -- and very intriguing.

    This is one of several BeOS features that the Open Source community should really consider stealing. But let's consider these features individually, with one eye on whether they're likely to achieve acceptance outside the ranks of BeOS enthusiasts. Let's not waste time on wholesale BeOS clones and compatibility layers. Those are exercises in denial. BeOS was a nice piece of work, but it's as dead as CP/M. Deal with it.

  2. Possibilities by Justen · · Score: 3, Insightful

    The possibilities with the Be file system were pretty much infinite. There were little things in the OS that would show you this, the address book and email "client" being some great examples. It was incredibly powerful. The only problem was when transporting files from one operating system to the other. And, even in this, Be did an admirable job.

    My fear, and I think the reality, is that Microsoft will not be so kind.

    jrbd

  3. Re:AD doesn't yet have wide acceptance, DBFS doome by MsGeek · · Score: 2, Insightful
    MS is have (sic) trouble getting major clients to switch to Active Directory.

    Damn right they are! For good reason too: it's cranky and fussy and likes to corrupt itself. When the school I went to threw the Microsoft Official Courseware labs out the window because they were impossible to implement, I got my first taste of why AD as it stands is pretty much useless.

    If Microsoft had stayed standards-compliant with open standards like LDAP and Kerberos 5 and so forth AD would be much less of a nightmare than it is now. But no, typical MS, they had to "embrace and extend" it. As a consequence, they have shot themselves in the foot.

    This is the reason why most MS shops hold desperately on to their NT4 PDCs even though 2K has NT4 beat nine ways to Sunday. 2K cannot do the old-fashioned SAM-based domain even if you cajole it, beat it about the head and shoulders, or ask it nicely. And for most shops, that kind of domain is all they need.

    Of course if they went with Samba they could decommission their old fugly NT4 PDC, heh heh...

    --
    Knowledge is power. Knowledge shared is power multiplied.
  4. I'll wait for SP2 by Bowfinger · · Score: 2, Insightful
    Interesting in the context of MSFT's plans to implement a DB filesystem in future versions of MS Windows.

    Even more than normal Microsoft bashing, this sounds like a huge challenge for MS to get right. I can't imagine that they'll manage to retrofit a DB filesystem and make it perform adequately on the first try. It would be tough enough even if they started with a clean slate and a small, independent team of top talent. Unfortunately, although they do have some first-class developers, they also have tremendous legacy baggage and a group-think culture.

    I'll wait for at least Service Pack 2 before I put any real data on an MS DB filesystem.

  5. Unix lags by yggdrazil · · Score: 2, Insightful

    Unix filesystems lag terribly. They don't store the MIME type. They don't store the preferred app to open a file in. They don't store metadata like the artist and song name if the song happens to be an mp3. They don't have the ability to add GPS position metadata to my digital camera photos. Searches are horribly slow. All unices use different directory conventions. You can't uninstall an app by just moving its icon to the trash (except on OS X). App preferences are stored in all sorts of different ways (except on OS X).

    Linux is a nice remake of a legacy OS, but is hardly the future.

    The open source community needs a good object store to base a more futureproof OS on. Badly. (And a way better UI than XWindows can give us.)

    1. Re:Unix lags by BeBoxer · · Score: 3, Insightful

      Well put. The BeOS filetyping is far and away the most robust of any OS I've ever worked on. I've got an audio application I've written which leverages attributes as much as I can. One of the neat tricks I can do is that I save playlists as folders. A "playlist" folder contains links to either individual songs or entire folders. It can also contain "query" files which perform searches on the file attributes. I actually create normal folders in the Tracker, and then assign them to open up with my application instead of Tracker. So if you double-click the folder, it runs my app and loads the playlist. If you want to muck with it manually, right-click and open it with Tracker. I don't know of any other OS that gives me that much flexibility.
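
A rough sketch of the playlist-as-folder idea described above, in Python rather than against the real BeOS API. Everything here is an assumption made for illustration: the `Song` and `Query` types, the `expand_playlist` helper, the attribute name `Audio:Artist`, and the sample paths are hypothetical, and a plain list stands in for the filesystem's attribute index.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Song:
    """A file plus its attributes (stand-in for a file with filesystem attributes)."""
    path: str
    attrs: Dict[str, str] = field(default_factory=dict)


@dataclass
class Query:
    """A stored query: match files whose attribute equals a given value."""
    attr: str
    value: str

    def run(self, library: List[Song]) -> List[Song]:
        return [s for s in library if s.attrs.get(self.attr) == self.value]


# A playlist entry is a link to a song, a folder path prefix, or a stored query.
Entry = Union[Song, str, Query]


def expand_playlist(entries: List[Entry], library: List[Song]) -> List[str]:
    """Resolve a playlist folder into an ordered list of paths without duplicates."""
    seen, result = set(), []
    for entry in entries:
        if isinstance(entry, Song):
            matches = [entry]
        elif isinstance(entry, Query):
            matches = entry.run(library)
        else:  # a folder, given here as a path prefix such as "/music/jazz/"
            matches = [s for s in library if s.path.startswith(entry)]
        for song in matches:
            if song.path not in seen:
                seen.add(song.path)
                result.append(song.path)
    return result


# Hypothetical sample data.
library = [
    Song("/music/jazz/a.mp3", {"Audio:Artist": "Artist A"}),
    Song("/music/rock/b.mp3", {"Audio:Artist": "Artist B"}),
    Song("/music/rock/c.mp3", {"Audio:Artist": "Artist A"}),
]
playlist = [library[1], "/music/jazz/", Query("Audio:Artist", "Artist A")]
tracks = expand_playlist(playlist, library)
```

The point of the design is that a query entry is re-evaluated every time the playlist is opened, so newly tagged files show up without anyone editing the playlist.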

  6. I used to like this idea. by Oggust · · Score: 3, Insightful
    But there are some very real problems:

    • Portability: Have you ever tried to move data from one DB system to another? Not fun! There need to be some standards! There is such a thing as SQL-92, but nobody uses it yet. Yet? Right...
    • Portability again: Isn't it nice today how you can get a tar file from just anybody, and it will just untar on your system, even if you haven't got the same OS and filesystem as the guy you got it from? Well, see the previous point.
    • Performance, but that's possibly not that big a deal: A database can do a lot of work "server side". What do you think will happen to the system load on a big multi-user system when some moron submits a huge SQL query with all kinds of weird joins and stuff?

      Not much user, lots of system and iowait, that's what. We run into a whole new realm of needing accounting for these kinds of things.

    /August.

    --
    "An object declared as type _Bool is large enough to store the values 0 and 1." -- 6.1.2.5, C99 standard.
  7. Re:Integrated database computers: IBM AS/400 by gabrieltss · · Score: 3, Insightful

    Mod this up!

    This guy has it right. I have been working with AS/400s for YEARS! The OS is built around the DB2 database. And IBM beat MS to the idea a long time ago. But MS likes to make everyone think they actually "Innovate" things. When in fact they don't! They buy up companies, steal technologies etc... The only thing MS has perfected is the marketing engine. And yes, it is true MS uses AS/400s. In fact MS wanted me to come interview to work on their team that did the MS --> AS/400 integration work. (I told them to blow me, I wouldn't work for a company I despised!).

    --
    The Truth is a Virus!!!
  8. Re:New FS Engineer at Apple! by Vanders · · Score: 2, Insightful

    Tedious hacks? Tell me, how do you view meta-data in an xterm at the moment? Have you ever done ls -l in a shell? See that extra information? That's meta-data. Why would it be so hard to add a flag to ls so you can view the "content-type" meta-data?

  9. Re:New FS Engineer at Apple! by loosifer · · Score: 3, Insightful
    what's so bad about file extensions??

    Ugh, don't get me started... Here's a short list:

    • Only a single piece of metadata can be stored (filetype)
    • Combining data (filename) and metadata (filetype) kind of belies the definition of metadata in the first place
    • It can get very confusing to see things like file.txt.rtf
    • The tendency of systems to want to hide the extension, and the resulting confusion when your FTP client says the file is named stuff.zip but your interface just says "stuff"
    isn't it easier and faster to look at the filename than metadata and file contents??

    Only if the metadata is stored in some other inode.

    In the case of BFS, the inodes are always 1024 bytes, but the inode information is only about 300 bytes; that means BFS has roughly 700 free bytes in the inode--free in the sense that they don't take up any more space, but also free in the sense that you don't need to do an extra seek or an extra read to get them. Now that's free! You can go above 700 bytes, but then it needs to allocate more blocks (I don't know if that requires another inode, though).

    Also, in this case, even if it is just slightly faster to look at the filename, you lose every other possible metadata feature, just to get an essentially unmeasurable increase in performance. I'd take BFS and all of its features at half the speed if I could use it on any OS I wanted.
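
The inode arithmetic in this comment can be modelled with a toy sketch. This is not the BFS on-disk format: the `Inode` class, its methods, the byte-cost accounting, and the attribute names are assumptions made for illustration, and the 1024 and 300 figures are simply the ones quoted in the comment.

```python
INODE_SIZE = 1024     # bytes per inode, as stated in the comment
INODE_HEADER = 300    # approximate size of the core inode information
INLINE_CAPACITY = INODE_SIZE - INODE_HEADER  # 724 bytes, "about 700" in the comment


class Inode:
    """Toy inode: small attributes live inline, larger ones spill to extra blocks."""

    def __init__(self):
        self.inline = {}     # attributes stored in the inode's spare bytes
        self.overflow = {}   # attributes that needed separately allocated blocks
        self.inline_used = 0

    @staticmethod
    def _cost(name, value):
        # Simplified accounting: bytes for the name plus bytes for the value.
        return len(name.encode("utf-8")) + len(value)

    def remove_attr(self, name):
        if name in self.inline:
            self.inline_used -= self._cost(name, self.inline.pop(name))
        self.overflow.pop(name, None)

    def set_attr(self, name, value):
        """Store an attribute and report where it ended up."""
        self.remove_attr(name)  # replacing an attribute frees its old space first
        cost = self._cost(name, value)
        if self.inline_used + cost <= INLINE_CAPACITY:
            self.inline[name] = value
            self.inline_used += cost
            return "inline"
        self.overflow[name] = value
        return "overflow"

    def get_attr(self, name):
        """Return (value, extra_reads); inline attributes cost no extra read."""
        if name in self.inline:
            return self.inline[name], 0
        return self.overflow[name], 1
```

In this model a short MIME type string fits inline and costs nothing beyond reading the inode itself, while an 800-byte value cannot fit in the 724 spare bytes and is charged one extra read.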

  10. Re:AD doesn't yet have wide acceptance, DBFS doome by Malcontent · · Score: 3, Insightful

    All of that means that all corporations will eventually be forced to migrate to AD whether they like it or not. How corporations pay to get their options taken away from them and make themselves bitches for MS never ceases to amaze me. The CIOs of America are awfully fond of saying "thank you sir may I have another!".

    --

    War is necrophilia.

  11. Re:New FS Engineer at Apple! by curmi · · Score: 3, Insightful
    Have they finally realized that they will shortly be THE ONLY operating system that still relies on file extensions as the primary way of identifying files?

    You gotta love how every Unix/Linux/Windows user now talks about how file extensions are so bad. A few years ago these same people used to say how crap Apple's idea of meta data was, and that file extensions were better (cross platform, etc, etc). The arguments that I've had with Unix people at different places I've worked. It seems to take the rest of the world at least 10 years to catch on to ideas...
  12. Re:Fsck meta-data by castlan · · Score: 2, Insightful

    Close, but replace "filename" with "attributes". On most filesystems, an inode is used to access the file. The filesystem also stores attributes like date created and write permissions. To transfer a file from a Mac to a FAT-based MS OS, you need to package the file to retain the metadata, as the Mac metadata (attributes) are more expressive than the FAT filesystem allows. This is no shortcoming of the Macintosh, just an unfortunate result of MS-DOS FAT being considered the standard lowest-common-denominator filesystem. Transferring files from Unix to a FAT filesystem inflicts similar metadata loss, including multiuser data. This does not mean that FAT is superior; rather, the contrary. FAT is not the most restrictive filesystem either, as at least it has file hierarchy data (directories or folders).

    Note that MS has remedied this shortcoming with NTFS, which is more expressive than many Unix filesystems, and is fully capable of maintaining full HFS metadata. This is why Services for Macintosh (or whatever MS calls it) requires an NTFS FS to run. Metadata is much more elegant than "structured files", which seem to be what you might prefer. A big downside with structured files (like the ID3 tags in MP3 files) is that if you do not know the predefined format for the file structure, then you cannot access the metadata. This prevents the usage of a standard systemwide metadata store, which can be very useful in GUIs and multiuser systems, to say the least.
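
To make the "structured file" downside concrete, here is a minimal sketch that reads an ID3v1 tag, the fixed 128-byte trailer at the end of many MP3 files. The field layout is the published ID3v1 one; the function name `read_id3v1` and the dictionary keys are choices made for this example. Without knowing this exact layout, the same bytes are just opaque file content.

```python
import struct

# ID3v1 layout: "TAG", title, artist, album, year, comment, genre byte.
# 3 + 30 + 30 + 30 + 4 + 30 + 1 = 128 bytes at the very end of the file.
ID3V1 = struct.Struct("<3s30s30s30s4s30sB")


def _text(raw):
    """Decode a fixed-width field, dropping NUL and space padding."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").strip()


def read_id3v1(data):
    """Return the ID3v1 fields from a file's bytes, or None if there is no tag."""
    if len(data) < ID3V1.size:
        return None
    tag, title, artist, album, year, comment, genre = ID3V1.unpack(data[-ID3V1.size:])
    if tag != b"TAG":
        return None
    return {
        "title": _text(title),
        "artist": _text(artist),
        "album": _text(album),
        "year": _text(year),
        "comment": _text(comment),
        "genre": genre,  # numeric index into the ID3v1 genre list
    }
```

Every tool that wants the artist name has to carry a parser like this for every file format, whereas a filesystem attribute would be readable through one system-wide call.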

    -castlan
    gots to preserve my mods

  13. Re:Excuse me but... by rycamor · · Score: 2, Insightful

    And hierarchical databases were replaced in the 70s by the relational database model because it was impossible to effectively deal with data if you are restricted to a simple hierarchical view.

    Here's an analogy: Hierarchical databases are to relational databases as the GOTO statement is to object encapsulation.

    This might explain why some of us get so frustrated at being forced to continually "navigate" up and down our "folder" hierarchies with the "tree widget". That's just a graphical metaphor for 60s computer technology.
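
As a small illustration of the contrast drawn in this comment, the sketch below uses Python's built-in `sqlite3` module to find files by attribute instead of by walking a folder tree. The `files` table, its columns, the `by_artist` helper, and the sample rows are all assumptions invented for the example; no real filesystem works exactly this way.

```python
import sqlite3

# An in-memory table standing in for a filesystem-wide attribute store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, mime TEXT, artist TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?)",
    [
        ("/home/a/music/x.mp3", "audio/mpeg", "Artist A"),
        ("/home/b/downloads/y.mp3", "audio/mpeg", "Artist A"),
        ("/home/a/docs/notes.txt", "text/plain", None),
    ],
)
# Indexing the attribute is what keeps the lookup from scanning every file.
conn.execute("CREATE INDEX files_by_artist ON files (artist)")


def by_artist(db, artist):
    """Return paths of all files by one artist, wherever they sit in the tree."""
    rows = db.execute(
        "SELECT path FROM files WHERE artist = ? ORDER BY path", (artist,)
    )
    return [row[0] for row in rows]
```

The two matching files live under unrelated folders, so a purely hierarchical view would need a full tree walk to find them, while the relational view answers the question directly.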