Microsoft Uncertain About WinFS for XP
Ant writes "As a follow-up to the 'WinFS to be available in WinXP' story from a few days ago, BetaNews reports that Microsoft (MS) stopped short of confirming reports that it plans to back-port its next-generation WinFS file system architecture to Windows XP. MS tells BetaNews it is only evaluating the move, while also acknowledging that WinFS is still years off. 'We are currently evaluating making the WinFS storage subsystem available on this platform and will make the decision based on what is best for customers,' a Microsoft spokesperson told BetaNews."
WinFS runs on top of NTFS. Get your information straight.
You don't understand Spotlight.
Index Server did just what it says: It indexed file contents. Every operating system can do that. The Mac, the platform with which I'm most familiar, has been doing that for at least five years now, and probably longer; I can't remember exactly.
It's not useful, and here's why: The days when most files were plain text are long gone. There are still plain text files out there, sure, but they're a small minority. Most computer users probably don't create them at all, in fact.
Instead, people have e-mail messages (which are stored in plain-text files, but which are not plain text; they are in fact filled with what looks like gibberish to the casual reader), audio files, photographs, PDF documents, and application files. Most of your application files these days are being written in XML format which, like e-mail, is stored as plain text on the disk, but is filled with lots of stuff that's not related to the contents.
So merely indexing the contents of text files is not useful.
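To make that concrete, here's a rough Python sketch. The XML snippet is a made-up stand-in for a word-processor file, not any real application's format. It compares what a naive indexer sees in the raw bytes with what you get if you parse the format first.

```python
import re
from xml.etree import ElementTree

# A made-up word-processor file: one sentence of content wrapped in markup.
RAW = (
    '<doc><style font="Helvetica" size="12"/>'
    '<p align="left">Meeting moved to Friday</p></doc>'
)


def tokens(text):
    """Lowercased alphanumeric words, the way a naive indexer would see them."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


# Naive approach: index the characters on disk as if they were prose.
naive = tokens(RAW)

# Format-aware approach: parse the file and index only the text nodes.
root = ElementTree.fromstring(RAW)
content = tokens(" ".join(root.itertext()))

# Everything the naive index picked up that the author never wrote.
noise = naive - content
print(sorted(content))
print(sorted(noise))
```

In this toy file the markup contributes more index terms than the sentence itself, and a search for "left" or "style" would hit every document in that format.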
That's why Spotlight does things completely differently.
It's kind of hard to imagine that there's somebody out there who doesn't already know exactly how Spotlight works -- Apple's only been talking about it incessantly since last summer -- but I guess I have to concede the possibility. So let me explain it.
There's a program that runs in the background all the time. It's called "mds," for "metadata server." It's a system service; people don't interact with it directly. The purpose of mds is to store all the metadata on the computer and to respond to queries.
The mds program gets its metadata from another background task, mdimport, or "metadata import." The mdimport program reads files, extracts all the information from them it can, then passes that information off to the mds program.
The mdimport program is extensible through modules called metadata importers. Each metadata importer corresponds to a file type. When the mdimport program examines a file of a given type, it fires the relevant metadata importer module(s) to extract information from that file. Each metadata importer implements exactly one C function: GetMetadataForFile. This function receives a path to the file to be examined, a file type and a pointer to a key-value-pair data structure called a "dictionary."
GetMetadataForFile populates the dictionary with metadata stored as key-value pairs. When it returns, the mdimport program passes that information off to the mds program for storage.
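If it helps, the flow described above can be sketched in a few lines of Python. This is only an illustration of the shape of the thing: real importers are C plug-ins, and every name here (IMPORTERS, MetadataStore, import_file, the "text/plain" type string) is my own invention, not Apple's API.

```python
# Hypothetical stand-ins for the pieces described above; none of these names
# are Apple's API. Real importers are C plug-ins, not Python functions.

IMPORTERS = {}  # file type -> importer function


def importer(file_type):
    """Register a function as the importer for one file type."""
    def register(func):
        IMPORTERS[file_type] = func
        return func
    return register


@importer("text/plain")
def get_metadata_for_text(path, file_type, attributes):
    # Fill the caller-supplied dictionary with key-value pairs.
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    attributes["word_count"] = len(text.split())
    attributes["first_line"] = text.splitlines()[0] if text else ""
    return True


class MetadataStore:
    """Plays the role of mds: holds metadata and answers queries."""

    def __init__(self):
        self.records = {}  # path -> attribute dictionary

    def store(self, path, attributes):
        self.records[path] = dict(attributes)

    def query(self, key, value):
        return sorted(
            path for path, attrs in self.records.items()
            if attrs.get(key) == value
        )


def import_file(path, file_type, store):
    """Plays the role of mdimport: dispatch by type, hand results to the store."""
    func = IMPORTERS.get(file_type)
    if func is None:
        return False  # no importer registered for this type
    attributes = {}
    if func(path, file_type, attributes):
        store.store(path, attributes)
        return True
    return False
```

The point of the split is that the store never needs to understand any file format; it only ever sees dictionaries.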
The important idea here is that GetMetadataForFile can do anything to the file to extract metadata from it. A metadata importer might pull ID3 tags out of an MP3 music file. Another one might extract EXIF metadata from a digital photograph. Another might parse a word-processing file in XML format, discard everything irrelevant, and return just the names of the fonts used in that file. Another might pull the date stamps out of a chat transcript and store them as start-time, end-time and duration metadata. Another might pull key frames from a QuickTime movie and store them as thumbnail data. Another might find e-mail messages with attachments and store the type and size of the attachment as metadata. The sky's the limit.
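Take the chat transcript case. Here's roughly what such an importer's logic could look like, sketched in Python rather than C. The transcript line format, the function name chat_metadata and the attribute keys are all assumptions of mine; a real importer would use whatever format the chat program writes and whatever keys the system defines.

```python
import re
from datetime import datetime

# Assumed line format: "[2005-03-01 14:05:09] alice: hello"
STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]")


def chat_metadata(transcript, attributes):
    """Populate `attributes` with start time, end time and duration in seconds."""
    stamps = []
    for line in transcript.splitlines():
        match = STAMP.match(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    if not stamps:
        return False  # nothing recognisable; leave the dictionary untouched
    start, end = min(stamps), max(stamps)
    attributes["start_time"] = start.isoformat()
    attributes["end_time"] = end.isoformat()
    attributes["duration_seconds"] = int((end - start).total_seconds())
    return True
```

Nothing the participants actually typed ends up in the dictionary. The metadata is derived from the file, not copied out of it, which is exactly what plain content indexing can't give you.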
Spotlight is way more than just simple content indexing. It does content indexing, of course, using a new version of Search Kit, but that's just a part of it. (It's also not really that new. It's just a slightly optimized version of what's already in Mac OS X.)
As usual, the casual dismissal of something fairly revolutionary can be blamed on a high degree of ignorance on the part of the person doing the dismissing.
In that case, that's about half of what WinFS is supposed to be. It will make greater use of metadata, probably through the NTFS streams that already exist in, e.g., Windows 2000 and Windows XP. Yes, you can already store and search true file system-level metadata in those operating systems, a fact almost as little known as that you can mount devices in Windows XP to "folders", similar to how it works in Linux. For example, I can mount my DVD-ROM at E: to C:\Devices\DVD. Anyway, that, combined with the WinFS service running on top of NTFS and helping out with indexing to allow instant database-style searches, should offer something similar to Spotlight's functionality, if I understand Spotlight right.
However, there's more to WinFS than fast database searches. It also aims to change how we look at stored files altogether, taking away system-related concepts like "hard drives" and physical "folders" when navigating your stored data. Instead, your data will be organized into more abstract libraries of data. For example, you'd store your games in your Game library, whose contents wouldn't be tied to one folder on one hard drive. You'd go to your Game library and double-click on Doom III, instead of going to C:\Games\Doom III. Actually, C: wouldn't even be a concept seen by the user anymore.
It's even supposed to work seamlessly across network shares; however, the last thing I heard is that this won't be in the initial release of WinFS.
So it's a new data model, and a new way to look at how you store data altogether.
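As a toy illustration of the library idea (emphatically not how WinFS is implemented, which I don't know), here's a Python sketch. The Libraries class and the second path are made up; the point is only that the user-facing view is a list of titles, while the physical location is something the system resolves on its own.

```python
from collections import defaultdict


class Libraries:
    """Maps a library name to items, wherever the files happen to live."""

    def __init__(self):
        self._items = defaultdict(dict)  # library -> {title: physical path}

    def add(self, library, title, path):
        self._items[library][title] = path

    def titles(self, library):
        # What the user browses: titles only, no drives or folders.
        return sorted(self._items.get(library, {}))

    def locate(self, library, title):
        # What the system resolves behind the scenes when an item is opened.
        return self._items[library][title]


libs = Libraries()
libs.add("Games", "Doom III", r"C:\Games\Doom III")
libs.add("Games", "Solitaire", r"D:\Old Stuff\sol.exe")  # hypothetical path
print(libs.titles("Games"))
```

Two items on two different drives show up side by side in one library, and the drive letters never appear in what the user sees.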
All this is how it may look to the user. However, to Windows, it's a storage engine running as a service on top of NTFS.
Very early stages of WinFS could be found in the already released/leaked Longhorn alpha versions. Although you couldn't really say it was anything near functioning, you could see the concepts, and that was likely the intention at this early alpha stage.
Here are some quotes from Paul Thurrott's site:
------------
Beware: In C++, your friends can see your privates!