WinFS' Spot on Back Burner Nothing New
osViews.com writes "Charles Arthur of Independent.co.uk has an interesting editorial which analyzes Microsoft's recently postponed 'WinFS,' the file system that Microsoft had been planning to implement in Longhorn. His editorial reminds us that this technology, previously referred to as the 'NT Object Filing System,' was intended for a previous version of one of Microsoft's operating systems, code-named 'Cairo.' Microsoft first spoke of the 'NT Object Filing System' in 1992 and scheduled a beta release in 1996 and then a full release in 1997. But limitations have caused it to be delayed repeatedly."
...is a solution in search of a problem?
It reminds me of the old saying
Or for that matter the ORIGINAL goal of the Gnu project?
What's your point here? Why are you trying to bash Microsoft just because they decided to delay or abandon something?
Best Buy can have you arrested
duke nukem forever and winfs are fighting for the throne... of... stupid delays
http://ipod.fresh27.net/
To how many of Microsoft's MILLIONS of consumers is a filesystem like 'WinFS' (theoretically) a feature to be desired?
Most people I know want eye candy, and things to work as they're used to.
Microsoft doesn't _need_ WinFS, therefore it's not a prime concern
Error 407 - No creative sig found
People COULD just use naming conventions and name their files according to the content. But I guess that's just too hard.
Vaporware. Microsoft is so famous for it, they are referenced in the definition.
Is there any project for a similar file system in Linux?
The idea itself is a good one.
Microsoft, I must applaud you. By delaying the best features of your operating system, and assuming you continue to do so in future versions of Windows, you will, one day, have the best OS to have never been developed.
From what I know of WinFS, it really won't be all that important anyway. It is supposed to provide a way for all files to be treated the same by the OS (roughly) right? Thus making it easier for users to search, browse, or otherwise find these files?
Well, I don't know all of the juicy details of WinFS but I have played with the new Longhorn build. The search tool that is in the Alpha release (MSDN) is much improved over the current WinXP search. It was pretty cool, although some of it can be chalked up to eye candy. It still had a certain ease of use to it.
I doubt WinFS will ever be complete, personally. But I am sure some of the innovation and development benefits will still reach us as consumers. I know where I work, we spend time doing things the customers will never see. But they will still reap many of the benefits.
I'm working on an object file system right now, and it's really not easy.
It's a simple concept:
Store on a standard journaled b-tree (or similar) filesystem the binary data, and store in a database all sorts of meta-information about the data. Also if you want, store a reverse index of the textual info and maybe another 'index' of image features if it's an image.
Then if you want to get anything, no need to go through the filesystem's tree, you can hit the DB indexes and get info instantly.
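To make the "hit the DB indexes" idea concrete, here is a minimal sketch of a reverse (inverted) index over textual metadata. The `MetadataIndex` class and the sample paths are made up for illustration; a real system would keep this in a database rather than in memory.

```python
from collections import defaultdict


class MetadataIndex:
    """Toy reverse index: maps each lowercase term to the paths that mention it."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> set of file paths

    def add(self, path, text):
        # Index every whitespace-separated term of the file's textual info.
        for term in text.lower().split():
            self._postings[term].add(path)

    def search(self, query):
        # Return the paths that contain ALL terms of the query.
        terms = query.lower().split()
        if not terms:
            return set()
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result
```

A lookup such as `index.search("cibo pop")` never touches the directory tree; it only intersects the sets stored for each term.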
The real problem is keeping all of this in synch, with almost flawless atomic operations. (of course it's pretty much impossible to be flawlessly atomic, but one should come as close as the current journaled filesystems are).
So if you're using 2 components, let's say, a filesystem and a SQL database, then you need to open a SQL transaction, do your inserts/updates/deletes, then do the filesystem operation, then do the SQL transaction commit. If anything fails, you can revert the SQL modifications and everything goes back to normal. But if the filesystem has problems, then you can't keep the damn DB synchronized, and at some point you'll have to resynch both.
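A rough sketch of that sequence, assuming SQLite as the database and a hypothetical `files` table (neither is what WinFS or my own project uses; the names are only for illustration). The SQL transaction wraps the filesystem write, so a filesystem failure rolls the row back.

```python
import os
import sqlite3


def make_catalog():
    # Hypothetical schema: one row of metadata per stored file.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, description TEXT)")
    return conn


def store_file(conn, directory, name, data, description):
    final_path = os.path.join(directory, name)
    tmp_path = final_path + ".tmp"
    try:
        # "with conn" commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "INSERT INTO files (path, description) VALUES (?, ?)",
                (final_path, description),
            )
            # Filesystem operation: write to a temp file, then rename into place.
            with open(tmp_path, "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp_path, final_path)
            # A crash right here (file renamed, commit not yet done) still
            # leaves the two stores out of sync; that is the hard part.
    except Exception:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path
```

This only covers the easy direction. Nothing here can undo the rename if the commit itself fails, which is exactly why a resync step ends up being necessary.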
On 100k files, no problem. On 200MM files (what I'm aiming for), you're pretty much screwed. Then you have to start thinking of a self-healing system with a constantly-running checker that must ensure that it's very resource-efficient, etc...
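For scale, the naive version of such a checker is tiny. The sketch below uses a hypothetical helper, `find_drift`, that compares the catalog's paths against a full directory walk. It is fine at 100k files; at hundreds of millions the full walk is precisely the cost you have to engineer away.

```python
import os


def find_drift(catalog_paths, root):
    """Compare the paths the DB knows about with what is actually on disk."""
    on_disk = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            on_disk.add(os.path.join(dirpath, filename))
    catalog = set(catalog_paths)
    return {
        "missing_on_disk": catalog - on_disk,  # rows whose file is gone
        "not_in_catalog": on_disk - catalog,   # files the DB never heard of
    }
```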
It's just a huge problem. Supposedly Apple is solving this by Q1 2005, but I wouldn't be surprised if we see a massive increase in filesystem corruption bugs for a while on OS X (unless the DB indexing piece is just that, an indexer that runs x times a day and isn't atomically joined to the filesystem operations).
Storage would be one example. I bet there are others.
Trust the Computer. The Computer is your friend.
And that's why it's taking so long. Accessing filesystems as SQL data has always been a dream of anyone who has had many files. They just never knew about it.
WinFS is the 'real' solution IMO to all things like iTunes playlist managers, and expensive Content Management Systems yadi yada.
Sure, no consumer is expected to actually use SQL statements, but that doesn't mean that user mode programs should *implement* SQL features. User mode programs should only be the 'translation' layer between the user's point and click GUI, and the OS' internal implementation of the db. Surely, anyone can see that collecting meta data from the file system, and duplicating it in usermode so that you can have search capabilities on it is wasteful.
This article wasn't news to me, I've actually been waiting for this damn WinFS since just about 1996... And by god, is it ever turning into Duke Nukem Forever, but you know what, it's such a cool feature that I still can't wait for it to come out... (figuratively speaking)
I'll pull out the link again: Storage (a GNOME project) uses some nice algorithms to let you look up anything from '1960s music' or 'films directed by Francis Ford Coppola' to 'pdfs from joe'. All in natural language and over a wide range of formats, although evidently it's still a work in progress.
Let's put this in perspective. In '92 MS was looking at the Sybase source code and thinking about building a new filesystem around a database engine. Chicago AKA Win95 was almost out the door and it seemed reasonable to shoehorn this into Cairo (NT4). They were absolutely the dominant and fastest growing player.
I commented to a colleague in '93 (paraphrasing Robert Heinlein) that I did business with MS for the same reason I obeyed Newton's laws.
What happened around 1995? The internet became a commercial entity. Suddenly, MS needed to provide new applications (like IIS, IE, Outlook Express, an SMTP aware Exchange server, etc.) not just dork with cool OS technologies. A few years later, they are comfortable again after playing catch-up and start thinking about filesystems again, this time in "Longhorn". Again, they started talking about the capability two OS releases into the future.
However, this isn't a feature that is going to drive sales. MS needs to keep developers of home and office apps happy so they develop yet another new graphics system to replace DirectX. The perception of Windows security has never been lower and is starting to affect sales. IIS is losing ground again to Apache/Linux.
It's time to focus on revenue streams again and the revolutionary, expensive, difficult-to-build features get axed. It's probably not a bad idea. Think about the problems they've had with MS-SQL and ask yourself if you want a similar technology built into every teenager's game and grandmother's email box.
ReiserFS version 4 is a database at heart. Its basic structure is just a table of FileName | Binary but it also contains a modular system where it can be expanded for many uses. There is a lot of talk of including meta data in ReiserFS for such a system.
http://www.namesys.com/whitepaper.html
Microsoft "solved" this problem for all intents and purposes by having every program save its files in the "My Documents" folder or a subfolder therein, and allowing for filenames that can be long and have spaces.
Sometimes I feel like Microsoft is rearranging the deck chairs while the ship is sinking. Anyone remember that cool "Tripping the Rift" movie? The ship is falling to pieces and the onboard repair robot repaired the machine that makes ice cubes first. The outraged captain smacked it with a wrench and screamed "We're floating in space and you decide to fix the stupid ice machine? Get to work on the fucking hyperdrive!!!"
Microsoft needs a similar push.
The only safe speed to use an NT Flying Object System is the terminal velocity of an NT4 CD.
ROMANES EUNT DOMUS
IIRC "NT Object Filing System != WinFS"
WinFS is supposed to be based on SQL Server; when NTOFS was announced, Microsoft hadn't yet acquired SQL Server.
I thought NTOFS was what morphed into the fast-find thingie that shipped with Office.
I don't need no instructions to know how to rock!!!!
One correction - filesystems (at least most UNIX filesystems) are not constrained to tree structure; the leaf nodes may have any number of parents, i.e. a file may be in any number of directories simultaneously. (Use the "ln" command). And using ln -s you can practically place a directory in any number of parent directories.
I use this to organize my music collection alphabetically by artist, by genre, and by the date I got the music simultaneously. (I tend to be most interested in music I got recently, because I'm not tired of it yet).
I know people tend to organize files and directories in a tree structure anyways. If you ask me that's because people are happy to maintain the analogy of a physical item that can only be in one place at a time - so what does that mean for WinFS?
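For anyone who would rather script the "ln" trick above than type it, here is a small sketch using Python's standard `os.link` and `os.symlink`. The `add_view` helper and the directory names are invented for illustration.

```python
import os


def add_view(original, view_dir, symlink=False):
    """Make `original` also appear inside `view_dir` without copying its data."""
    os.makedirs(view_dir, exist_ok=True)
    target = os.path.join(view_dir, os.path.basename(original))
    if symlink:
        # Like "ln -s": a pointer to the path, which also works for directories.
        os.symlink(os.path.abspath(original), target)
    else:
        # Like "ln": a second directory entry for the same file on the same filesystem.
        os.link(original, target)
    return target
```

Calling `add_view(song, "by-artist/Cibo Matto")` and `add_view(song, "by-genre/pop")` gives the same file two more parents; the data is stored once.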
Glomming two related services into one blob of unmaintainable code is not necessarily a benefit. A database mapping has the advantage of being able to catalog distributed file systems, including those which don't have any object tag extensions.
The other problem is that it's not uncommon in the database world to spend far more disk space indexing complex data for access than it actually takes to store the raw information itself. Do you really want the possibility that your inseparable all-in-one file system is using more space for the equivalent of directory entries than for the data itself?
Remember this isn't about special cases like a user too lazy to sort their home directory or documents folder, but applying that overhead to the entire system. With all the tweaks people do to improve general FS performance and reliability, why would anyone think adding overhead is a good idea unless you need, and I mean need those features?
If you do indeed need those features so badly, why not just buy or use one of dozens of existing document storage and search facilities?
WinFS was just trying to find a way to make people think the two ideas were inextricably bound together and in some way unique to Windows. In truth that honour goes to hundreds of document database and repository products and the long-in-the-tooth AS/400 (or so my cohorts who work on the platform tell me).
I do not fail; I succeed at finding out what does not work.
Every filesystem is a database at heart. They already contain other attributes like permissions, create and modify dates, etc. The place to store this stuff is in the FS because the database is already there. All you need to do is add some more stuff like an extended description and a few topic reference fields, and slap a query engine on it. The query engine does not need to be really complex either. You can get away with little or no formatting/sorting/grouping support, as the user space app which performs the query should take care of that. All you need is basic boolean logic and string comparison. Most of this code already exists out there under a free license. I am not saying it would be a copy-paste job, but there are examples of the required algorithms which developers can look at safely, without running afoul of any IP.
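To show how little "basic boolean logic and string comparison" asks for, here is a toy evaluator over per-file attribute dictionaries. The tuple-based query form and the attribute names are assumptions made up for this sketch, not any real filesystem's interface.

```python
def matches(attrs, query):
    """Evaluate a nested tuple query such as ("and", ("eq", "type", "pdf"), ...)."""
    op = query[0]
    if op == "and":
        return all(matches(attrs, sub) for sub in query[1:])
    if op == "or":
        return any(matches(attrs, sub) for sub in query[1:])
    if op == "not":
        return not matches(attrs, query[1])
    if op == "eq":
        _, field, value = query
        return str(attrs.get(field, "")) == value
    if op == "contains":
        # Case-insensitive substring match on the attribute's text.
        _, field, value = query
        return value.lower() in str(attrs.get(field, "")).lower()
    raise ValueError("unknown operator: %r" % (op,))


def run_query(files, query):
    # No sorting or grouping here; that is left to the user space app.
    return [path for path, attrs in files.items() if matches(attrs, query)]
```

For example, `run_query(files, ("and", ("eq", "type", "pdf"), ("contains", "description", "joe")))` returns the paths whose attributes satisfy both conditions.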
The one tough thing WinFS aims to do that would be simple in user space is to look inside files and glean some attributes from them. This is great if you can hook into some of the libraries from Office or Adobe et al.; it saves you from having to implement parsing for all that stuff. I am not quite sure how you solve that one at the FS level. I just fear a user space system will get real crufty real fast and break when major changes occur to the files and their real attributes on disk that the DB can't know about. Like if a mount point gets moved, or everything is restored from a tarball and the dates or permissions change a little because someone was careless. I think overall, getting the necessary info from the user when new files are created would be a fair compromise; the only issue is rule one of data: "crap in, crap out".
Then there are all the problems that you mostly have to deal with whether you do it in the FS or as some user space hack/bloatware thing:
Note that file creation would constitute just that: for efficiency, you would want/need archives to contain all that info for the files in them, so the user does not have to enter it. Makefiles and the like would have to be updated to do magic and fill in that data for the output files. Then you naturally have to fix all the GUI toolkits so their file I/O dialogs support that info; any apps with custom dialogs will need to be patched, as will console apps. Some sort of default values would be needed for apps that just can't reasonably support collecting that info as well. I don't want to have to fill in values every time I "cat" something I mean to unlink moments later.
I think it's clear there are lots of difficult usability problems to solve. Someone could probably extend any of the major OSS filesystems to include some extra attributes and add a crude query system; it's all a question of what you really do with it once you have it. I am sure R&D at Microsoft is just as perplexed on that point as I am. I feel sorry for them, since the marketing dept has been pushing this as the next big thing for almost a decade now; the pressure must be intense.
Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
Except it is relevant because Reiser4 has metadata built-in. WinFS is supposed to be built on top of NTFS but its (NTFS+WinFS) purpose is similar to that of Reiser4.
Time makes more converts than reason
True but Reiser4 is available now. Someone just needs to build a front-end into Gnome/KDE.
Other examples of vaporware in Linux:
- integrated NVidia or ATI drivers
This doesn't fit the definition of Vaporware because no one ever claimed it was going to happen. Besides, you have to download the drivers for Windows too.
- working USB 2 or Firewire support
Works for me, I don't know what problem you are having.
- fast boot-up times
25 seconds including init on a 700MHz machine is fast enough for me.
What alternate reality are you living in?
Time makes more converts than reason
When I was at college one of the girls I went out with had a stepmother who had no ability to organise her own information.
In her Rolodex-type phone number finder she had several of her friends listed under "H" for "Home number" with a sublist of names and numbers. She had a similar setup for "W" for "work numbers" and "M" for "mobile numbers" with a list of people's numbers.
Obviously the cards for "H", "W", "M" were quite full as most people were listed there. Other cards were almost empty.
I asked her why she didn't organise people by first names or last names. She looked stunned at the suggestion.
I would hate to see how this lady organises her computer files, but a search facility, no matter how bad, would help her a lot.
Elivs
--
Sorry about any typoos in my post, Im having a busy day.
Except it is relevant because Reiser4 has metadata built-in. WinFS is supposed to be built on top of NTFS but its (NTFS+WinFS) purpose is similar to that of Reiser4.
NTFS has always had metadata built in. That's not what WinFS provides.
Coming soon - pyrogyra
Wasn't BeOS's BeFS something similar to this?
It was a next-generation file system that, afaik, is still superior to many modern filesystems. It even had methods for storing metadata from custom file types (e.g. mp3), so you could search for an "artist" field with "Cibo Matto" in it, or whatever.
Also, it used a set block size (1, 2, or 4K) rather than a set # of blocks.
i miss BeOS...... *sniff*
...spike
Ewwwwww, coconut...
Although it makes a nice tagline and dig in the ribs for Microsoft -- same delayed technology, different century, yuck, yuck -- the Cairo Object File System (OFS) and WinFS bear no resemblance to one another. Having worked in the Cairo/NT group at the tail end of the former and suffered through uncountable meetings about the goals/architecture/benefits of the latter prior to leaving MS, I can say this with some certainty. Saying they're the same internally or architecturally because both strive(d) to provide the ability to find any document by any properties or content (aka "information at your fingertips"... remember that?) is just vacuous -- you might as well talk about similarities between file-systems that support shell wildcard expansions * and ?.
OFS was about a lot of things, probably too many things. It was designed during the "object wars," when things like Copland and Pink and OpenDoc were in the headlines. Document-centered work was the proposed user paradigm, where structured documents contained nested opaque data from many different applications, and so applications wouldn't need or want to know the difference between a top-level document and a sub-part of a document. This user paradigm did not entirely come to pass, and so an entire file and object-system architecture and shell user-experience premised on it was canned.
That said, a few features from "OFS" did survive into NT/XP, including:
From what I saw to date, WinFS seemed to be about the data/XML paradigm of data format transparency, not about opaque nested/contained data like OFS. It seems to be pursuing a different usage paradigm. At least I think so.
It's a confusing thing, and it shouldn't be. The basic idea of fusing a DB and a FS is dead simple, and if every OS offered structured and unstructured data, a set of simple core schemas, federated query across the two forms of data, and transactional/ACID cross-references between them, you could build many applications more easily. Why WinFS keeps taking so many more bits to describe itself than this is beyond me.
n@
That's not Webster's Dictionary. That's just another cheapass website which tries to make money by taking Wikipedia's content and jamming some ads on it. And webster-dictionary has the added quality of trying to rip off the good name of the real Webster's dictionary
(I'm pretty sure Webster's Dictionary's trademark has long since passed into a more nebulous place.)
"Never attribute to malice that which can be adequately explained by stupidity." -- Hanlon's Razor
As far as i can see, there are two different concepts in that thing:
- The real FS part: ReiserFS-like storing of a file/dir architecture, which is nice, disk-space-savey and all, but has no consequences on the way people work. Furthermore it already exists: i'm using it right now.
- The self-organized document hierarchy and search capabilities, which might change the way people work for the better, as long as it's restricted to *very specific parts* of your data. Who would trade a well-crafted UNIX directory architecture for a key-indexed FS? What about directory-related documents, like a hierarchy of Java packages? What about URL-accessible documents? What about implicit (not already keyword-based) relations between documents? And so on... In most cases, this stuff would have to emulate a standard file hierarchy anyway, which would probably result in system resource overhead only, or would require that you specify explicit keywords (not really knowing how they would impact the search algorithm), which would result in user resource overhead only.
You get my point: this stuff must be an option, and it belongs in the user interface, as in DBFS or Google, with a standard lib/API for easy reuse by third-party software. It would be of no use with MOST of the files, in my system anyway.
WinFS is not even a solution looking for a problem, it's a problem seeking naive clients for its solution, IMHO.
"Take away our PlayStations
And we're a third-world nation"
A.D.
No, Windows Server 2003 presents the login screen faster, but the network services aren't all loaded yet, and if you're running a server, surely you want the services running and available?
Linux won't display the login box until after everything is initialized; Windows will half load itself, show the login, and continue loading in the background.
http://spamdecoy.net - free throwaway anonymous email - avoid spam!