How To Implement A Database Oriented File System
ALundi writes "A really great read from Andrew Orlowski over at The Register on how Benoit Schillings and Dominic Giampaolo created the 64-bit journaled and attribute based Be File System. Schillings and Giampaolo discuss a variety of design and implementation issues, including data integrity and file system performance." Interesting in the context of MSFT's plans to implement a DB filesystem in future versions of MS Windows.
It's called mysql.
I used Be on and off for about 6 months. Once you get the hang of it (the filesystem), you see the true power, especially with the address book.
If you want some real insights into the OS as a whole, check out the BeOS Bible. Not so good if you want an in-depth discussion, but for non-kernel hackers it's a fun read and very informative.
Read the section explaining how their address book works. It's really cool.
If I remember correctly, Be originally used a "true" database backend as the filesystem, but ran into performance issues compared to the R5 fs implementation. I can't help but wonder how many of these issues were largely due to the speed of the technology used at the time.
I suppose it depends on your application, but I know a lot of web-based platforms already use true db backends (Oracle, PostgreSQL) to handle all data storage, representing a filesystem as a hierarchical set of rows in numerous tables. I've written several applications this way, and am currently working on a content management platform which also uses this model. Need to make a change to the filesystem structure (adding attributes, changing the security model)? Just modify the DB structure and you're done, especially with databases like PostgreSQL where you can use the database engine itself for a *lot* of functions (via triggers, stored procedures, security settings, etc).
As more and more functionality is brought into web-based application environments, I can see the importance of "old style" filesystems starting to fade somewhat for a lot of apps. Yes, they'll still be necessary (the database itself has to reside somewhere, obviously), but not in the same way they used to be. Just a few thoughts.
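For anyone who hasn't seen the "filesystem as rows" model, here is a minimal sketch of it. It uses SQLite from the Python standard library purely as a stand-in for Oracle or PostgreSQL, and the table and column names (`node`, `attribute`, `parent_id`) are made up for illustration, not taken from any real product.

```python
import sqlite3

# Hypothetical schema: one row per file or directory, plus free-form attributes.
SCHEMA = """
CREATE TABLE node (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES node(id),  -- NULL marks the root
    name      TEXT NOT NULL,
    content   BLOB,                          -- NULL for directories
    UNIQUE (parent_id, name)
);
CREATE TABLE attribute (
    node_id INTEGER NOT NULL REFERENCES node(id),
    key     TEXT NOT NULL,
    value   TEXT,
    PRIMARY KEY (node_id, key)
);
"""

def add_node(db, parent_id, name, content=None):
    """Insert a file or directory row and return its id."""
    cur = db.execute(
        "INSERT INTO node (parent_id, name, content) VALUES (?, ?, ?)",
        (parent_id, name, content),
    )
    return cur.lastrowid

def path_of(db, node_id):
    """Walk the parent links up to the root with a recursive query."""
    rows = db.execute(
        """
        WITH RECURSIVE chain(id, parent_id, name, depth) AS (
            SELECT id, parent_id, name, 0 FROM node WHERE id = ?
            UNION ALL
            SELECT n.id, n.parent_id, n.name, c.depth + 1
            FROM node n JOIN chain c ON n.id = c.parent_id
        )
        SELECT name FROM chain ORDER BY depth DESC
        """,
        (node_id,),
    ).fetchall()
    return "/" + "/".join(name for (name,) in rows if name)

db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
root = add_node(db, None, "")          # the root directory has an empty name
docs = add_node(db, root, "docs")
memo = add_node(db, docs, "memo.txt", b"hello")
db.execute("INSERT INTO attribute VALUES (?, 'owner', 'alice')", (memo,))
print(path_of(db, memo))               # prints /docs/memo.txt
```

Adding a new kind of attribute here is just another row in `attribute`, which is the "modify the DB and you're done" point made above.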
The implications for quantum computing are staggering!
A lot of people think journaling is a really difficult, complicated thing. But that was actually the easiest part by far. BFS journaling is maybe a thousand, maybe twelve hundred lines of code; it was really not difficult. And people make it into this monstrous complicated thing. But again, we do things like change the disk buffer cache so the 64-bit features that were needed to do journaling were supported.
In SGI's XFS, their journaling is bigger than all of BFS!
If it's so easy, why don't all file systems implement such goodness? Personally, I'd love to see this everywhere.
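To show why the core idea can be small, here is a toy write-ahead journal. This is only a sketch of the general technique (log the change, write a commit record, then update in place, and replay committed records after a crash). It is not how BFS or XFS actually do it, and the class and method names are invented for the example.

```python
class JournaledStore:
    """Toy block store: updates are logged before they touch the 'disk'."""

    def __init__(self):
        self.disk = {}      # block number -> bytes; stands in for the real disk
        self.journal = []   # append-only list of log records

    def transaction(self, writes, crash_before_apply=False):
        txid = len(self.journal)
        # 1. Log every intended write.
        for block, data in writes.items():
            self.journal.append(("write", txid, block, data))
        # 2. A commit record marks the transaction as complete.
        self.journal.append(("commit", txid))
        if crash_before_apply:
            return  # simulate power loss after commit, before the in-place update
        # 3. Only now update the blocks in place.
        self.disk.update(writes)

    def recover(self):
        """Replay committed transactions; skip writes with no commit record."""
        committed = {rec[1] for rec in self.journal if rec[0] == "commit"}
        for rec in self.journal:
            if rec[0] == "write" and rec[1] in committed:
                _, _, block, data = rec
                self.disk[block] = data


store = JournaledStore()
store.transaction({1: b"inode", 2: b"bitmap"}, crash_before_apply=True)
store.journal.append(("write", 99, 3, b"half-written"))  # never committed
store.recover()
print(sorted(store.disk))  # prints [1, 2]
```

The hard parts in a real file system are everything around this: the buffer cache, ordering of disk writes, and reclaiming journal space, none of which the sketch attempts.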
The Gardener
--
Does this mean that Apple is finally going to put some kind of reasonably modern filesystem under OS X?
Have they finally seen the true genius behind their own iTunes interface?
Have they finally realized that they will shortly be THE ONLY operating system that still relies on file extensions as the primary way of identifying files?
I truly hope that this snippet is as wonderful as it sounds, as it may finally restore my faith in Apple, as well as cure me of my unhealthy Debian and XFS addiction.
Karma: Incomprehensible (Mostly affected by posting at +5, reading at -1, and metamoderating everything unfair.)
Dominic's book, Practical File System Design with the Be File System, is wonderful. I'd never delved into the innards of a file system before. Reading his book was enjoyable and interesting. I learned quite a lot.
Transcript show: self sigs atRandom.
MS is having trouble getting major clients to switch to Active Directory. I think there is little chance a DB-based FS will be accepted in any reasonable time frame.
XFS is also "database-like". But BFS seems to be rather more ambitious an effort -- and very intriguing.
This is one of several BeOS features that the Open Source community should really consider stealing. But let's consider these features individually, with one eye on whether they're likely to achieve acceptance outside the ranks of BeOS enthusiasts. Let's not waste time on wholesale BeOS clones and compatibility layers. Those are exercises in denial. BeOS was a nice piece of work, but it's as dead as CP/M. Deal with it.
Here is a mirror.
Alan Thicke's Journal
My Slashdot ads say "
and the weird thing is that he was not drinking Mountain Dew at the time
~jeff
Recently, to take attention away from Be, Microsoft has been bragging about their database'd operating system. According to a Microsoft rep, "our operating systems have always been database'd, everything in Windows is based on data".
Hacker Media
Aren't filesystems themselves hierarchical databases? What's all the fuss about?
For the majority of databases, the data should be moved to the filesystem, not the database.
Simple joins, and most of them are, can be replicated with links if necessary. Almost all the databases I've seen would lose little from moving out of the DB and into the filesystem.
It doesn't scale to complicated joins and huge datastores with complex triggering, but for most stuff that simply isn't used.
Too many developers have the mindset of placing tree-based data into an RDBMS, which adds complexity.
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
Before I begin: click here if you're a "-1: off-topic" wielding stifler of discussions.
:(
[Note that this is read-only so far. Now say oooooh and back away from the moderate button.]
Now then, on to my story:
So the other day this microsurf's telling me about how NTFS has journaling, started taunting me about my eight-minute fscking those sixty gig puppies, and I admit it kind of had me kind of almost maybe a bit uncomfortable, not envious mind you, just a bit, mostly it was just a little hot in the room is all. I excused myself, saying I had to google, and that I would be right back. Five minutes later, I had my response. (And implemented, too! Download today.)
Moral of this story? Turn off those [domain] tags in your preferences if you haven't yet! It totally ruins my train of thought
--
m iso socially aware artistic geek pen-pal, m or f, in '1337 edu. jazz, poetry a must.
email me (click my user info for addy) if you're interested.
Microsoft has never claimed any of those products were their ideas. Perhaps you're too young to know better rather than simply stupid, so I'll go easy on you. History records quite clearly where each of these ideas came from. The fact is that the original seeds came from different sources but it took Microsoft to create truly mature products from those ideas.
I realise your post was a typical /. biased post and that was likely due to the fact that you're a penguin lackey but at least make an effort to get your facts straight.
As far as a DB filesystem - Microsoft announced that as the eventual path for Windows LOOOOOOOOOOOOOOOONNNNNNNNNNG before BeOS came on the scene.
The possibilities with the Be file system were pretty much infinite. There were little things in the OS that would show you this, the address book and email "client" being some great examples. It was incredibly powerful. The only problem was when transporting files from one operating system to the other. And, even in this, Be did an admirable job.
My fear, and I think the reality, is that Microsoft will not be so kind.
jrbd
And Al Gore invented the internet
Damn right they are! For good reason too: it's cranky and fussy and likes to corrupt itself. When the school I went to threw the Microsoft Official Courseware labs out the window because they were impossible to implement, I got my first taste of why AD as it stands is pretty much useless.
If Microsoft had stayed standards-compliant with open standards like LDAP and Kerberos 5 and so forth AD would be much less of a nightmare than it is now. But no, typical MS, they had to "embrace and extend" it. As a consequence, they have shot themselves in the foot.
This is the reason why most MS shops hold desperately on to their NT4 PDCs even though 2K has NT4 beat nine ways to Sunday. 2K cannot do the old-fashioned SAM-based domain even if you cajole it, beat it about the head and shoulders, or ask it nicely. And for most shops, that kind of domain is all they need.
Of course if they went with Samba they could decommission their old fugly NT4 PDC, heh heh...
Knowledge is power. Knowledge shared is power multiplied.
About time eh? Perhaps next we'll have languages that just deal with persistent data directly, without any 'database interface' stuff to code.
A DB as a file system could still be a mixed blessing though - anyone tried to store code in a database, or other files with lots of different versions that may have different structures? Generally, DBs are weak when it comes to namespaces (like directories), versions (except in some special cases features for time series) and 'schema evolution' (changing the data structure).
Nothing impossible though - I think Oracle were getting there with the Internet File System, so would be nice to have a PostgreSQL FS in Linux to start playing with!
Problem is that file systems don't care about data structure and so don't care about changes to that structure, so you can have half a dozen different versions of your address book floating around
Object-oriented graph system
- needs much less overhead than a database system
- can hold multiple ways to access one object, easing semantic link representation
- can be secure by allowing only the workflow links.
It should be organised as a set of objects that hold data within them, with links between them.
One of the areas that I am interested in is using a DB FS with various "helper" apps. These helper apps would provide a way of managing your data. They could be integrated into another app directly or as a plugin or they could be standalone apps by themselves.
One could provide a helper app that allows you to look at the stored files in ways other than your typical file listing. To do this would require various metadata attributes to be associated with the data. Helper apps would be provided for all of the standard applications that read and write data to files.
For example, one app could set attributes to categorize data. One could then search for data on your system, and potentially others, much in the same way as you would search for stuff on Yahoo.
Say you are in a rush to finish your taxes but you need to put together an itemized list of business expenses. You have information on them stored in various places including text files, e-mail, spreadsheets, etc. You could use find and grep to go poking around looking for them, or you could use a helper app that does a quick search of attributes, presents you with a list of candidates and can even call up other apps or services to look at the data. Once you have identified the data, another attribute can be set that your tax software uses to record it and pull it into your tax forms.
Here's a past discussion and here's how it's done.
Has anyone actually tried it?
"A mind is a terrible thing to taste."
Databases in file systems... wonder why we never needed a File System Administrator? Or is this just a new way to create jobs??? Give me a decent file system which can survive crashes...
btw, won't databases make the filesystem slow?
A few weeks ago, a DB filesystem was looked upon by slashdotters as the dumbest idea ever because MS wanted to implement it.
All of a sudden it's interesting, perhaps all OS's should have a DB file system installed by default.
. "Users will be able to start dealing with data as they would with any type of SQL query, so it will no longer matter where the files are located. Organizing data becomes far simpler." ..
Anyway, locate and glimpse always work for me... That makes more sense than putting a DB in the FS.
How does it make it any simpler ? May be faster
It doesn't matter if you're the first ... just as long as you're the last.
Even more than normal Microsoft bashing, this sounds like a huge challenge for MS to get right. I can't imagine that they'll manage to retrofit a DB filesystem and make it perform adequately on the first try. It would be tough enough even if they started with a clean slate and a small, independent team of top talent. Unfortunately, although they do have some first-class developers, they also have tremendous legacy baggage and a group-think culture.
I'll wait for at least Service Pack 2 before I put any real data on an MS DB filesystem.
Unix filesystems lag terribly. They don't store the MIME type. They don't store the preferred app to open a file in. They don't store metadata like the artist and song name if the song happens to be an mp3. They don't have the ability to add GPS position metadata to my digital camera photos. Searches are horribly slow. All unices use different directory conventions. You can't uninstall apps by just moving their icon to the trash (except on OS X). App preferences are stored in all sorts of different ways (except on OS X).
Linux is a nice remake of a legacy os, but is hardly the future.
The open source community needs a good object storage to base a more futureproof os on. Badly. (And a way better UI than XWindows can give us.)
Bruno G. Albuquerque and Axel Dorfler are working to make a 100% compatible clone of the file system that shipped with the later versions of BeOS as part of the OpenBeOS project.
They have a fantastic amount of the work done already such as
o Read-only BFS
o Kernel Interface
o Full Attribute Support
o Indexing
o Symlink Traversal
o Queries, full UTF-8 Support, and support for non-indexed attributes.
I believe they are also fixing some problems that were in the original FS.
I am sure they would be glad for some more file system engineers. Come to think of it, the rest of this open source project is going really well too, but as always it needs more programmers....
A great book on the BeOS file system, its design and the issues involved, is Practical File System Design with the Be File System by Dominic Giampaolo. This was the first book I read about file systems and is a great discussion on the types of decisions the BeOS folks made to address the specific users and OS functionality they were looking for. It's not too heavy and is really a casual read in comparison to most other Computer Science books.
Is there any work being done to build a database file system for Linux and other OSes? I know I have a hell of a time navigating through my mp3s and it would sure be nice to be able to assign attributes to them (rock+favorite+2002) so you could just search on 'favorite and rock', for instance, and have a playlist. A database filesystem should also simplify the task of managing files since you could sort them by adding attributes.
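The attribute search described here boils down to an inverted index plus set intersection. A minimal sketch, assuming attributes are plain string tags; the `AttributeIndex` class and the file names are hypothetical, and a real file system would keep this index on disk rather than in a dict.

```python
from collections import defaultdict

class AttributeIndex:
    """Maps each tag to the set of files that carry it."""

    def __init__(self):
        self._files_by_tag = defaultdict(set)

    def tag(self, path, *tags):
        for t in tags:
            self._files_by_tag[t].add(path)

    def query(self, *tags):
        """Return the paths carrying every requested tag, sorted."""
        if not tags:
            return []
        sets = [self._files_by_tag.get(t, set()) for t in tags]
        return sorted(set.intersection(*sets))


index = AttributeIndex()
index.tag("a.mp3", "rock", "favorite", "2002")
index.tag("b.mp3", "rock", "2001")
index.tag("c.mp3", "jazz", "favorite")
playlist = index.query("favorite", "rock")
print(playlist)  # prints ['a.mp3']
```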
Well, let's see... Cairo (code name for Win 2k) was begun in 1995; they wanted to have a DB file system in that. That was their plan and that is when it began.
BeOS began development in 1991. The first iteration of a DB-backed file system came out with BeOS 1.0, and BFS replaced it in around 1994ish.
Soooooooo... MS again is a day late and a dollar short.
Same with MATURE office productivity products, since WordPerfect was mature long before Word 1.0 was released, as was the Lotus spreadsheet.
Perhaps they did good work on the OLE crap, but that stuff sucks (imho) and Adobe has much better products to create the desktop publishing work that OLE was intended to help with... not to mention that Word still cannot do what TeX can do.
I am the Alpha and the Omega-3
"Microsoft announced that as the eventual path for Windows LOOOOOOOOOOOOOOOONNNNNNNNNNG before BeOS came on the scene."
As a longtime student of MS PR, I don't think so. They did mumble something about an 'object' filesystem back in the Cairo days, but they later declared that DFS was what they were talking about.
Note that MS runs (or ran) their core systems on AS/400 and VMS -- after figuring out the profit margin on those boxes, that's likely where they got the idea.
Wasn't it Leo Guibas?
Is not the new ReiserFS 4 (due for release late summer) going to be a DB FS? Check out http://www.reiserfs.org and read for yourselves.
The US has a backdoor visa program for those who are qualified enough (or have the proper political connections, or just pay something like $1M) -- people like brain surgeons and guys who wrote crappy Unix clones have no problem getting in over here. Also, if you are Irish or Mexican, you can live here visa-free for no problem even if you are a wanted terrorist/axe-murderer back at home.
In short -- move to the US. Nobody can beat our visa policy!
MS was so proud of their registry that they thought that they could make the whole file system a part of the registry. It broke under stress of actual use, and now the registry is on its way out, never having been much of a success. I expect that the database file system will do the same for a few generations as well.
nt
D3 and other Pick-like systems have been using filesystems based around a database for years.
Check out this link for more information on the D3 database filesystem. ;)
Not much user, lots of system and iowait, that's what. We run into a whole new realm of needing accounting for these kinds of things.
"An object declared as type _Bool is large enough to store the values 0 and 1." -- 6.1.2.5, C99 standard.
If the data is shared, and you have libraries that are shared, then why not ask the data to display itself (object.display(x);) and have it call to a standard library (system library?) which queries a system properties database object as to what application to display it? Don't actually store the display code in the objects, but have the objects query the system as to what the user has specified to display that type of data with.
Dominic: That's what I mean. Some people are very anal about organizing things in rigid hierarchies and others are 'I know what I want to find'.
I think there is a place for hierarchies, but not as the base organizational method of the filesystem. I would like to see a hierarchy of attributes, or keys, or whatever you want to call them. When you save an object (off the internet, or out of your head), a title is only one possible attribute you want to give it. When I save a pr0n jpg, it doesn't need a damn title, I need to mark what it's a damn picture of (amateur AND cumshots AND redhead)! Perhaps start with people, places, things. Or later in the hierarchy, sound -> music -> various bands as well as various artists as well as various sound effects as well as dates and live or studio, all keyed (so to speak) and queryable. But the hierarchy is for browsing. Just for browsing, because browsing is important (when you want to look at cumshots, you want to look at cumshots, but when you query for cumshots, watersports and lesbians, well that's bloody well what you should get), and micro$oft's nice little explorer looks about right. Although instead of a stupid directory tree, we have a tree of object properties and types, and any object can be in any number of places in that tree, depending on it's attributes (categories?).
I know of course that I haven't really said anything new in this post, and I know that performance needs to be taken into account. This is, however, the way things are moving, and all we really need is a really good, really fast, solid state storage medium. When permanent storage is as fast as or faster than RAM is today, the database filesystem will finally become a reality; until then, we'll sure be gearing up.
Cheers, Joshua
When in danger or in doubt, run in circles, scream and shout!
M$ sort of got there before Be. NTFS uses a btree database, but that doesn't really add much more functionality than FAT really. Back in the 'old days' I remember reading that if you had more than 200megs of disk space you needed 5megs of ram just for the filing system, seemed a bit much at the time.
There are brute force ways of doing it: you build some kind of ad hoc database system and dump it into kernel space. You may be able to engineer such a system reasonably well, but to me, it is in bad taste: indexing is such a complex and application dependent area that nobody can guess ahead of time what kind of indexing people will want a few years from now. The Be file system looks like it's too complicated to interoperate well, and too simplistic to be of much use for anything rather than fairly primitive indexing operations.
A better way of doing this is to figure out a protocol for notification and updates between a traditional file system and user-space database indexing services. Yes, that's harder, but that's what software engineers get paid to figure out. And, as far as I'm concerned, if you can't figure out how to do it right, it's better not to do it at all rather than doing something half-baked.
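A rough sketch of what that split could look like: the file system stays dumb and only publishes change events, and a separate indexing service builds whatever index it likes from them. Everything here is an assumption for illustration (the event names, the callback signature, the `NotifyingFS` and `WordIndexer` classes); a real design would also have to deal with crashes, missed events and ordering.

```python
from collections import defaultdict

class NotifyingFS:
    """Plain path -> bytes store that reports changes to subscribers."""

    def __init__(self):
        self._files = {}
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def write(self, path, data):
        self._files[path] = data
        self._emit("updated", path, data)

    def remove(self, path):
        del self._files[path]
        self._emit("removed", path, None)

    def _emit(self, event, path, data):
        for callback in self._listeners:
            callback(event, path, data)


class WordIndexer:
    """User-space service: keeps a word -> paths index from the event stream."""

    def __init__(self):
        self._words_by_path = {}
        self._paths_by_word = defaultdict(set)

    def handle(self, event, path, data):
        # Drop whatever was indexed for this path before.
        for word in self._words_by_path.pop(path, set()):
            self._paths_by_word[word].discard(path)
        if event == "updated":
            words = set(data.decode("utf-8", "replace").lower().split())
            self._words_by_path[path] = words
            for word in words:
                self._paths_by_word[word].add(path)

    def search(self, word):
        return sorted(self._paths_by_word.get(word.lower(), set()))


fs = NotifyingFS()
indexer = WordIndexer()
fs.subscribe(indexer.handle)
fs.write("/notes/todo.txt", b"Buy milk")
fs.write("/notes/tax.txt", b"milk receipts")
fs.remove("/notes/todo.txt")
print(indexer.search("milk"))  # prints ['/notes/tax.txt']
```

The point of the design is that swapping `WordIndexer` for a different kind of index needs no change to the file system side.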
NewsForge has this commentary by James Treleaven about the possible implications for Open Source if Microsoft implements a database-driven filesystem.
--It's Pimptastic!--
As the ExtremeTech article pointed out, they are not even considering putting the full-blown SQL Server into Windows. SQL Server is too resource-intensive (it really wants to use all of the available CPU, memory and disk space), too much overhead, and most importantly to MS, too profitable (sales of SQL Server / BackOffice make up about 10-15% of MS's revenue.) There's no reason to bundle it if people are willing to pay a ton for it separately!
As the article says, they're thinking (nothing decided yet) about including MSDE, which is exactly the same as SQL Server 2000, except it is tuned for 5 concurrent users (and hard-limited to 10), the database size is limited to 2GB per database (the same as the Jet DB, aka Access), and it doesn't have the nice GUI admin tools bundled.
Also, the OFS (Object File System) discussed previously probably won't get added either. There's a good reason why it was talked about way back in the Cairo (pre-Win95) days but never implemented - it's really, really hard to do, and it's hard to even convince anybody of its value. (Just look at Be.) Active Directory was originally supposed to be an object store, but I don't think anybody uses it for that (if anybody even uses it at all.)
What probably will be included is an improved version of Indexing Service, which is currently included in Windows 2000 and XP. For those of you who are fortunate enough to be unfamiliar with Indexing Service (formerly Index Server), it's an NT service (think "daemon") that periodically scans the file system for new / updated files, and then adds whatever metadata it can extract into a database of sorts, which is then used to speed up searches in the built-in Search dialog on the Start menu.
There are a couple of problems with the current implementation:
So, in summary, MS's plans for the DB-in-the-filesystem look a lot more like Reiser4 than like BeFS or SQL Server.
So that's how early NT4 could manage to trash a file that was read-only.
The PICK OS was an even earlier example of a "database filesystem". I worked with it back in 1981, but it dates back earlier than that.
PICK has an SQL-like language to create reports and search and select databases. The data structures are kept in a "dictionary" file for each file. Records are variable length and separated by upper ASCII characters. (254 for fields, 253 for values (sub-fields), and 252 for sub-values.)
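The record layout is easy to show in a few lines. This is a sketch using only the three delimiter codes given above; the function names and the sample record are made up, and it works on Python strings where a real PICK system would be dealing with raw bytes.

```python
FIELD_MARK = chr(254)     # separates fields
VALUE_MARK = chr(253)     # separates values (sub-fields) within a field
SUBVALUE_MARK = chr(252)  # separates sub-values within a value

def parse_record(raw):
    """Split a raw record into fields -> values -> sub-values."""
    return [
        [value.split(SUBVALUE_MARK) for value in field.split(VALUE_MARK)]
        for field in raw.split(FIELD_MARK)
    ]

def build_record(fields):
    """Inverse of parse_record: join nested lists back into one string."""
    return FIELD_MARK.join(
        VALUE_MARK.join(SUBVALUE_MARK.join(subs) for subs in values)
        for values in fields
    )


# Hypothetical record: a name field, then a phone field holding two values,
# the second of which has two sub-values.
customer = [[["Smith"]], [["555-1234"], ["555-9876", "ext 12"]]]
raw = build_record(customer)
print(parse_record(raw) == customer)  # prints True
```

Because every level is variable length, a field can grow a second value later without any change to the file's layout, which is where the "dictionary" file comes in to say what each position means.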
PICK is pretty much considered a "legacy system" by most people any more. (If they have heard of it at all.) It had some features that were far ahead of its time. Unfortunately, Dick Pick, the creator of the OS, was unwilling to improve on it. So PICK remained in the dumb terminal world, while everyone else moved on. The vendors of PICK made improvements over the years. It was the only thing that kept it even vaguely current.
Microsoft is claiming that you will not need data structures for their new system. DB filesystems are very dependent on embedded file structures. It has to be there. I have no idea how they will be able to take a structure as complex as Word or MPEG-4 and make it "transparent" and portable. More likely you will be held hostage to their OS. It will be portable to "upgrades" of the OS and no more. (At least until they make that data format no longer supported...)
"Trademarks are the heraldry of the new feudalism."
It's really not Microsoft's innovation.
IBM's AS/400 (a midrange computer system targeted for commercial use/accounting/warehouse/etc...) is based on an object-oriented database filesystem which is implemented at the firmware level (SLIC) rather than at the OS-level - and this system has been around for about 20 years and IIRC it always had quite good performance.
-arch----
A few words about its architecture, if you're interested...
The operating system (OS/400) itself runs on top of this object-oriented low-level "OS" by calling its APIs - as a result, most parts of OS/400 are platform-independent. If you'd manage to get the SLIC running on another hardware platform, you could probably install a nearly unmodified version of OS/400, and it would do its work.
Actually, I'd call the SLIC code the 'real' operating system kernel rather than OS/400, because OS/400 itself would not work without an appropriate SLIC layer.
Everything on the system is an object, so you'll always have to use the object's methods to perform some operation.
For some applications that may be an advantage, because security is enforced on each object at the firmware level. For other applications it might also be a disadvantage, because you'll always have to use a limited set of APIs for modifying data. That blocks many methods commonly used for writing highly optimized code.
-end arch----
One of the benefits of having a database-filesystem is probably the fact that you do not need to run a database product on top of the OS.
Every object on the system can be backed up and restored in a very simple way. Logical files (multiple logical views of one physical file) can help to keep data management simple and consistent.
On the other hand, you will have to update the entire OS (including the kernel) when you need to install a new release of the database - which means, that you'll have to reboot the machine.
And - last but not least - the more code you have in the OS kernel, the higher is the probability of having dangerous bugs somewhere in the kernel.
It should not be necessary to mention, that bugs in the OS kernel may compromise all system security.
There are certainly many advantages and disadvantages regarding the database-filesystem issue, so I think it all depends on what you want to do with your computer.
-----
kind regards from Austria,
octogen
PS: i hope my english isn't too poor..
And - by the way - even Microsoft uses AS/400 boxes for running its business, so what do you think, where did they get their inspiration from...?
This whole discussion is entirely wrong in its direction. While the rest of the world is moving towards managing data in a user space, world readable, flexible format that is xml, microsoft is yet again going backwards into proprietary extensions and api's that aren't transferable.
The progress of XPath, XQuery, Schemas and xml libraries in general makes me confident that in two years using xml as a primary data store as well as a programming interface will be a breeze. Think about it, what is really missing from xml that a relational database has right now? Basically some indexing scheme and a good api to handle locking and concurrency, other than that not really a whole lot. Throw in a little client server and you're done. Now once you've gone that far, what does an object database have that an xml database doesn't? Not a whole lot, throw in some XPointer stuff, and you've got references nailed.
Sure there might be some speed advantages in certain places, but that will in no way make up for the fact that your data will be buried deep inside the os, as opposed to freely available as it is in xml.
Pretty much anything that can be locked away in an os can be done better, and more flexibly, in user space. That is why unix is better than vms, multics, or windows or whatever mainframe os, not because it has more features or higher speed, but because of its light and flexible api. Files are stripped bare of anything more than the bare minimum. That keeps things flexible and easy; everything else is moved into the library.
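For a feel of xml as a queryable store in user space, here is a small sketch with Python's standard `xml.etree.ElementTree`, which supports a limited subset of XPath. The address book document and its element names are invented, and the reference lookup is a plain attribute match standing in for XPointer, not XPointer itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical data store: an address book kept as one xml document.
STORE = """
<addressbook>
  <person id="p1"><name>Ada</name><city>London</city></person>
  <person id="p2"><name>Linus</name><city>Helsinki</city></person>
  <person id="p3"><name>Alan</name><city>London</city><friend ref="p1"/></person>
</addressbook>
"""

root = ET.fromstring(STORE)

# "Query": names of everyone whose <city> child has the text London.
londoners = [p.findtext("name") for p in root.findall("./person[city='London']")]

# "Reference": follow a ref attribute back to the element carrying that id.
alan = root.find("./person[@id='p3']")
friend_id = alan.find("friend").get("ref")
friend = root.find(f"./person[@id='{friend_id}']")

print(londoners)                 # prints ['Ada', 'Alan']
print(friend.findtext("name"))   # prints Ada
```

Note that every query here is a linear scan of the tree, which is exactly the missing indexing scheme mentioned above.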
Look, there's a lot of good stuff in BeOS, and a lot of us would like to see it preserved. But it's the technology that's worth preserving. The platform is just a way of delivering the technology. If you insist on totally re-inventing the platform, either as an OS or as a compatibility layer, you force all your potential users to start over completely. And they just won't do it. You want proof? OS/2. DR DOS. NextStep. AmigaDOS. All of these were impressive products. I personally would prefer any of them (or BeOS for that matter) to Windows. I might even prefer some of them to Linux. But like most users, I have to concentrate on platforms that have a real user base.
Why are you letting these clowns ruin our country?
[Damn, I hate IE's text edit bugs.]
Dominic, we are delighted to learn, has subsequently joined Apple as a file system engineer.
This is the best news that I've had in a while on the Mac OSX technical front.
All right, Dominic! Get in there and kick some ass!
Why are you letting these clowns ruin our country?
The indexing system in W2K is just about useless. I have it turned on and when I ask it to find a file given its name it still takes over a minute to return answers. On Linux or FreeBSD, type in "locate filename" and you get your answer instantly. I don't know what it does but it sure as hell can't find files fast. Like almost everything MS makes, "it makes a lot of promises but delivers almost nothing".
War is necrophilia.
All of that means that all corporations will eventually be forced to migrate to AD whether they like it or not. How corporations pay to get their options taken away from them and make themselves bitches for MS never ceases to amaze me. The CIOs of America are awfully fond of saying "thank you sir may I have another!".
War is necrophilia.
(* I prefer object-oriented graph filesystem *)
I agree that graphs are superior to trees in most respects, but why OO?
What specifically will that get you over a relational approach? OO approaches tend to tie stuff to a particular language because they over-integrate behavior with attributes, and behavior tends to lock you into a language unless you settle for a lowest common denominator, which does not give you much besides a typical, bland API.
http://geocities.com/tablizer/sets1.htm
Table-ized A.I.
Active Directory *is* LDAP, and they use standard Kerberos for authentication. You can use an OpenLDAP client to query every bit of information a Windows 2000 domain controller holds, and use MIT Kerberos to authenticate against it.
Where is that standard-incompliant?
This sounds a lot like something I've wanted to do for years. Do you have any links to projects like this?
I'm helping with something which could be used to implement what you describe: Coldstore.
I'll go a step further and say that I think there's something fundamentally wrong with Microsoft's free text search products. Index Server sucks, and even their public product support and MSDN search engines can't return consistent results.
My guess is that the field is patented to all hell, and MS rolled their own rather than buying the tech needed.
Close, but replace "filename" with "attributes". On most filesystems, an inode is used to access the file. The filesystem also stores attributes like date created and write permissions. To transfer a file from a Mac to a FAT-based MS OS, you need to package the file to retain the metadata, as the Mac metadata (attributes) are more expressive than the FAT filesystem allows. This is no shortcoming of the Macintosh, just an unfortunate result of MS-DOS FAT being considered the standard lowest common denominator filesystem. Transferring files from Unix to a FAT filesystem inflicts similar metadata loss, including multiuser data. This does not mean that FAT is superior; rather, the contrary. FAT is not the most restrictive filesystem either, as at least it has file hierarchy data (directories or folders).
Note that MS has remedied their shortcoming with NTFS, which is more expressive than many Unix filesystems, and is fully capable of maintaining full HFS metadata. This is why Services for Macintosh (or whatever MS calls it) requires an NTFS FS to run. Metadata is much more elegant than "structured files", which seem to be what you might prefer. A big downside with structured files (like the ID3 tags in MP3 files) is that if you do not know the predefined format for the file structure, then you cannot access the metadata. This prevents the usage of a standard systemwide metadata store, which can be very useful in GUIs and multiuser systems, to say the least.
-castlan
gots to preserve my mods
Go cry me a fucking river, loser.
(Note: I spelled 'loser' correctly)
Why do all the Indians always order Mountain Jew?
http://www.namesys.com/whitepaper.html
"The Naming System Venture" by Hans Reiser (of ReiserFS fame)
Oracle has a database-based filesystem, Internet File System (IFS), that is pretty sweet.
It supports the HTTP, WebDAV, SMB, FTP, IMAP, and SMTP protocols and gives you the ability to store, manage, and search documents, presentations, multimedia, Web pages, and XML files.
It also does check-in/check-out, version control, and event notification. Unlike most of the examples I have seen, developers can extend these base features to build custom content management applications.
members are seeing something, you're seeing an ad
The comment by Dominic below: ...you do a case insensitive substring match and a B-Tree is useless for that type of query.
is wrong. That's a compare function not a b-tree function. A b-tree has no notion of greater or lesser or equal, it only knows what the compare function returns. The compare function can return case-agnostic results so that "Firetruck" matches "firetruck" and the b-tree algorithm takes the comparer's word on that and says they are the same too.
(Dumass!)
Dominic: Can Microsoft make a database file system fast? It depends on what they're trying to do, I think, is going to be the answer. It's very tricky.
It's got to be appropriate for the things that you do. Benoit is right about B-Trees, really, in that they're a stupid data structure for most cases, because 99 per cent of the time when you do a Find, you do a case insensitive substring match and a B-Tree is useless for that type of query.
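To make the two positions in this exchange concrete, here is a minimal Python sketch. It is not BFS code: a sorted list searched with `bisect` stands in for a B-tree's ordered keys, and the class and method names are made up for illustration. The case-folding key plays the role of the compare function, so an exact lookup ignores case and still uses the ordering. A substring lookup, on the other hand, cannot use the ordering at all and falls back to visiting every key.

```python
import bisect


class OrderedIndex:
    """Sorted key list standing in for a B-tree's ordered keys (hypothetical)."""

    def __init__(self, names, key=str.casefold):
        # The key function plays the role of the compare function:
        # the index only ever sees normalized keys.
        self._key = key
        self._entries = sorted((key(n), n) for n in names)
        self._keys = [k for k, _ in self._entries]

    def find_exact(self, name):
        """Case-insensitive equality: a binary search, O(log n)."""
        k = self._key(name)
        lo = bisect.bisect_left(self._keys, k)
        hi = bisect.bisect_right(self._keys, k)
        return [n for _, n in self._entries[lo:hi]]

    def find_substring(self, fragment):
        """Substring match: the ordering does not help, so scan every key."""
        f = self._key(fragment)
        return [n for k, n in self._entries if f in k]
```

With an index over "Firetruck", "firetruck" and "Truckstop", `find_exact("FIRETRUCK")` finds both spellings through the ordering, while `find_substring("truck")` has to look at every entry to find all three.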
Why would patents stop MS? They have stolen technology before. They simply drag the case out till the other company is almost bankrupt and then offer a settlement for pennies on the dollar. Remember this is a company which bitchslaps the dept of justice around like a two dollar whore. You think they are afraid of the patent office?
War is necrophilia.
they are not even considering putting the full-blown SQL Server into Windows. SQL Server is too resource-intensive.
I find this very hard to believe. True, SQL Server eats up everything you feed it, but it stays away from the stuff you don't give it: if you forbid it to use a certain CPU, it won't use it; if you tune it to only use XX% of the memory, it will only use that amount of memory (and databases use memory mainly for cache).
they're thinking (nothing decided yet) about including MSDE, which is exactly the same as SQL Server 2000, except it is tuned for 5 concurrent users (and hard-limited to 10)
MSDE is limited to 5 concurrent transactions, not users. So every 6th transaction (i.e. a T-SQL command) will get queued and has to wait for an empty slot. This will degrade performance but you can hook up 30 or more users on a single MSDE powered database without noticing it.
Your point about 'including' MSDE and not SQL Server is weird: both share the same codebase and both eat up everything you give them. MSDE too eats up all the memory it can get, even for 5 transactions. Also: a DB filesystem eats resources, plain and simple, because you have more data to store, plus you're not just serving files but VIEWS on bits of data. So caching has to be intensive. Yes, this costs memory and perhaps a lot of disk space.
As for Index Server: Index Server is meant to index text documents and documents of certain formats (Office docs etc.). It's a hog, indeed, but it also doesn't need much maintenance. It does its own housekeeping and indexes files by itself. Running a query through the Windows search is a bit slow perhaps, but I've never had any performance problem when using Index Server as the search engine on static websites.
Never underestimate the relief of true separation of Religion and State.
As I see it, working with OS/400 is as if you're inside Oracle or SQL Server: a table in these DBMSes is a file in OS/400. Nothing special about that.
The point of MS's DB filesystem is that you're not INSIDE a database, but ON TOP of the database. So you handle files and bits of data as you do today, but they're not physical files but views (or query results, if you wish) on the data in the database. That's something different from a filesystem with metadata connected to the 'files' like in the article, or an OS that is just a DBMS like OS/400.
You could simulate the behaviour today if you store different pieces of a file in a database (SQL Server, Oracle) as blobs with metadata stored in related fields/tables. Query the database, e.g. by selecting blobs based on a certain WHERE clause on the metadata, and build some component that will reconstruct a file from the blobs. _THAT_'s what it's all about.
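A rough sketch of that simulation, using Python's built-in sqlite3 in place of SQL Server or Oracle so it runs anywhere. The table layout, the column names, and the `store`/`find`/`reconstruct` helpers are all assumptions made up for the example, not anything Microsoft has described.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chunks (
    file_id INTEGER, seq INTEGER, data BLOB,
    PRIMARY KEY (file_id, seq));
CREATE TABLE metadata (
    file_id INTEGER, name TEXT, value TEXT);
""")


def store(file_id, content, attrs, chunk_size=4):
    """Split the content into blob rows and record its attributes."""
    for seq, start in enumerate(range(0, len(content), chunk_size)):
        conn.execute("INSERT INTO chunks VALUES (?, ?, ?)",
                     (file_id, seq, content[start:start + chunk_size]))
    conn.executemany("INSERT INTO metadata VALUES (?, ?, ?)",
                     [(file_id, k, v) for k, v in attrs.items()])


def find(name, value):
    """Select 'files' with a WHERE clause on the metadata."""
    rows = conn.execute(
        "SELECT DISTINCT file_id FROM metadata "
        "WHERE name = ? AND value = ? ORDER BY file_id", (name, value))
    return [r[0] for r in rows]


def reconstruct(file_id):
    """The 'component' that rebuilds a file from its blobs, in order."""
    rows = conn.execute(
        "SELECT data FROM chunks WHERE file_id = ? ORDER BY seq", (file_id,))
    return b"".join(r[0] for r in rows)
```

After `store(1, b"hello world", {"author": "alice"})`, calling `find("author", "alice")` gives back the id, and `reconstruct(1)` returns the original bytes. The "file" only exists as a query result.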
Never underestimate the relief of true separation of Religion and State.
it's spelled "looser", loser.
Have you actually done this? I'll admit I haven't, and have been wary to do so since I keep hearing the same story: AD is not 100% LDAP, and the authentication scheme is not 100% Kerberos. If you actually have tried this, I'd like to know how you managed to pull it off properly where dozens of others failed.
Learn from the mistakes of others. There isn't enough time to make them all yourself.
s/terrorist/freedom fighter/g
It's a matter of perspective...
To clarify: they're thinking of "bundling" MSDE, not "integrating" it. So it wouldn't be installed by default, any more than IIS. As for the resource issue, of course you can tune SQL Server the same as MSDE, but how many users (esp. Windows users) are willing to "tune" their filesystems?? As you pointed out, Index Server's one saving grace is that it's very easy to maintain. Now if they could only get rid of all those memory leaks, improve performance, and perhaps index a bit more metadata (cough cough exemelle cough), then it might actually be worth re-enabling on my machine...
This is due to a technical problem with the current implementation of the Windows shell. When you ask the shell to find a file, it will only use the index as long as it is not out of date. If the index is out of date, the shell will do a regular search of the filesystem. Unfortunately, as soon as a write is performed to an indexed file, the index is considered out of date until the next indexing operation.
Before you start griping about "locate" under Linux, remember that locate will not necessarily find a file that has been created fairly recently. For whatever reason, Microsoft didn't think that this was reasonable behavior in the Windows shell, and thus ran into this problem.
I agree that searching for files under Windows is so slow that I find rapid clicking around in folders yields a better average success time. I have to admit that I cannot countenance using a *nix box without locate. Especially when piped through to grep. Want to find the executable for application foobar? locate foobar | grep bin. Result comes back after a few milliseconds. Locate actually builds a database, usually rebuilt 4am each morning by cron, which is the reason it is so fast. If you've just installed a package and need to find where it has just put something then type "locate -u" to update db (though my sys admin friend tells me off for doing this as there is apparently a better way, my way works fine though) and then use locate as usual.
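For anyone who hasn't looked at how locate gets its speed, the idea fits in a few lines of Python. This is only a sketch of the principle, not how the real locate stores its database: `build_db` and `locate` are made-up names, and the "database" here is just a sorted list of paths kept in memory.

```python
import os


def build_db(root):
    """Walk the tree once and record every path.

    This is the slow step, the part cron would run at 4am.
    """
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            paths.append(os.path.join(dirpath, name))
    paths.sort()
    return paths


def locate(db, pattern):
    """Answer a query from the saved list without touching the disk."""
    return [p for p in db if pattern in p]
```

Something like `[p for p in locate(db, "foobar") if "bin" in p]` is then the equivalent of `locate foobar | grep bin`. It also shows why a file installed a minute ago is not found: it is not in the list until `build_db` runs again.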
Phillip.
Property for sale in Nice, France
AD is a veneer of LDAP compliance. It needs the ADSI libraries to 'use LDAP'. If you connect from Linux (or another OS), you have to jump through numerous hoops to do it. It implements LDAPv3 like it does all of its standards-based protocols. It's always sort of implemented. It does the bare minimum to say "Hey, we support LDAP v3". All I can say is look at Microsoft's POSIX compliance.
Time passes. Sun drops the ball on Solaris x86, and consulting companies like Red Hat and SuSE start attacking Linux's shortcomings. Finally I get hired by a company that wants to extend its Windows development tools so they also target Linux. This forces me to play with Linux some more, and I get a more positive impression. I even think it might replace Windows for my day-to-day work, 'cause Windows has gotten too bloated, brittle, and inflexible, and Linux is getting more mature and easy to use. Best of all, Linux is, by its very nature, unmonolithic and open to experiment.
Alas, desktop Linux hasn't lived up to its early promise, though I think that might still happen.
More time passes. I still work with Linux, but for various reasons I still spend most of my time on Windows. Our Linux tools aren't as successful as we'd hoped, but they're still pretty successful. And now we're also gonna start targeting .NET and Symbian. I'd like to avoid getting into these platforms, lest my brain explode. But I'm responsible for documenting core features of our cross-platform API, so I can't avoid it.
My history with Java is similar. I actively avoided it for a long time, then got a job where I had to learn it. Currently I write the odd web applet for friends.
Bottom line: I can't jump into every technology that looks interesting. Too many constraints. Finite brain capacity. The need to earn a living. The desire to work on things that people will actually use. The business decisions of the people who sign my paycheck. If you want me to use your technology, find a way to integrate it with platforms I already use, or be ignored. Not fair? Perhaps. Life is often unfair.
And as I said before, I'm actually more flexible than most computer users. It may be frustrating and boring to cater to their limitations. But if you don't, all the work you're putting into Be technology will only be seen by other Be enthusiasts. Not that you have any obligation to the rest of us -- but it strikes me as a nasty waste of human potential.
The Apple Newton is another machine with a database "filesystem". Instead of a traditional filesystem, it uses freeform object databases. The slickest things about them are the way they integrate with NewtonScript (the high level Newton programming language), and the way they handle integration between multiple stores.
To appreciate how well the databases (called "soups") integrate with NewtonScript, you'd need to understand a little about it. NewtonScript is an object-oriented language based on parent- and proto- inheritance, rather than class-based inheritance (there are subtle differences, but the basic point is that it is object oriented and supports inheritance). The NewtonScript objects (called "frames") can be written directly to, and read from, the soups. Instead of going through a serialization layer, where the object is "flattened" for writing to the database, you can just feed the frame directly to the soup. You don't need to define the format of the frame ahead of time, either, except for the slots (object members) that you will be indexing. This means you don't have to write identical cookie-cutter objects to the soup.
Newton soups also handle multiple storage devices in a slick way. Each storage device on the Newton is considered to be distinct (i.e.: you have the internal store, plus additional external stores for each memory card you stick into the Newton). On a low level, a soup exists on a single store. However, you can have identically named soups on several stores, and they will be automagically combined at a high level into a single "union soup". This happens transparently, even at the query level. If, for instance, you've queried a soup and gotten back a cursor (a regular database cursor, i.e.: effectively an array of frames from the soup), this cursor will always reflect the contents of the union soup you're querying in real time. If you eject a card that contains some of the entries in your cursor, they will disappear from your cursor. If you stick in a new card that has entries that would match the query that returned your cursor, they will appear in your cursor.
This, combined with a rather free form linking mechanism between frames, made the Newton "filesystem" a pretty interesting and powerful database.
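For those who never used a Newton, the union soup behaviour can be imitated in a few lines of Python. To be clear, this is not the NewtonScript API: `Store`, `UnionSoup`, `Cursor` and their methods are names invented for this sketch, and frames are plain dicts. The point is only that the cursor holds the query, not a copy of the results, so it follows cards being inserted and ejected.

```python
class Store:
    """One storage device: the internal store or a memory card."""

    def __init__(self, name):
        self.name = name
        self.soups = {}  # soup name -> list of frames (plain dicts here)


class UnionSoup:
    """Identically named soups on every mounted store, seen as one."""

    def __init__(self, soup_name):
        self.soup_name = soup_name
        self.stores = []

    def insert_store(self, store):
        self.stores.append(store)

    def eject_store(self, store):
        self.stores.remove(store)

    def query(self, predicate):
        return Cursor(self, predicate)


class Cursor:
    """Keeps the query, not a snapshot, so results track mounted stores."""

    def __init__(self, union, predicate):
        self.union = union
        self.predicate = predicate

    def entries(self):
        return [frame
                for store in self.union.stores
                for frame in store.soups.get(self.union.soup_name, [])
                if self.predicate(frame)]
```

Because frames are free-form dicts, two entries in the same soup do not need the same slots, which mirrors the "no cookie-cutter objects" point above. A real implementation would use indexes rather than re-scanning every store on each read.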
... and not just dev nodes in the filesystem.
EROS claims to checkpoint the memory of the system - its running state - rather than have programs explicitly serialize state to save and so on. By doing this they can make a machine shut down and reboot in a matter of seconds into a consistent state with your programs apparently still running. This is different from hibernation - you can take the machine down at any time by pulling the plug and this will work.
It's a different axis of filesystem from the 'store more metadata' stuff in this thread, but IMHO it's a more interesting way to go. EROS's storage is transactional - to the extent that what is on disk always presents a consistent picture of what was in memory; user-level transactions are dealt with in programs. It's unclear from the Register article whether the Be filesystem had this level of integrity.
-Baz
It's worth noting that not all of the OpenBeOS team think in terms of "Give me Be or give me an abacus!" In particular, the file system team makes a point of describing their work as "a kernel add-on". Well, I guess that's architecture they inherited from BeOS. But one member of the team, Will Dyson, is also working on a Linux driver. Which I intend to try out at the first opportunity. The possibilities of integration with KDE are very intriguing.
If you're a fan of the Be file system (and I may be turning into one myself), note that you don't have to use BeOS to use the Be file system.
In any case, IFS is not a workstation filesystem, like BFS. Strictly for servers.