Next Windows to Have New Filesystem
ocipio writes: "Microsoft is currently planning a new filesystem. It's planned that the new filesystem will make searches easier, faster, and more reliable. Windows will also be less likely to break, and easier to fix when it does. The new technology will cause practically all Microsoft products to be rewritten to take advantage of it. Called Object File System, OFS will be found in the next major Windows release, codenamed Longhorn. More information can be found here at CNET."
Refreshing to see an MS news item that has no bashing in it whatsoever. How about we keep the discussion mature, also?
Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
I thought I'd read at some point that they were going to make their filesystem based on sql server to improve performance and searching.
This is probably in response to open source software people finally
figuring out most of (the undocumented) NTFS. They don't want Linux,
*BSD, etc. to be able to read and write their filesystem easily, as that
would make it easier for people to dual-boot and/or migrate away from
Microsoft operating systems.
Note the historical sidebar on the article. It traces the on-again, off-again history of OFS. MS has been playing with it for over half a decade (!), and doesn't yet have anything to show for it. They've backpedalled and caught up again so many times that I think this article can be safely labelled as speculation.
In other words, it sounds cool. I'll believe it when I see it. (and only at that point judge whether it really makes Windows less likely to break)
"People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
They want their Digital Rights Management software to infest as many aspects of their OS as possible.
Do you honestly believe that the benefit of a faster search is enough incentive to rewrite such a major part of the OS?
"The market alone cannot provide sufficient constraints on corporation's penchant to cause harm." -- Joel Bakan
It looks like BeFS with XML descriptions instead of MIME types. I think.
[o]_O
From the article:
Replacing its antiquated file system with modern database technology...
Now, if you were going to base a file system on a DB, what would you use? An Object-Oriented DB, where organization is key (which you want for a file system), or a Relational DB for speed (which is why they are claiming to switch)?
I'm sure they are going to make a custom system, yes, but wouldn't it have to be based on one of the two major DB designs?
For the record, I'm no DB Admin, but, as I understand it, relational is the choice of DB for almost all projects for its sheer speed, OO is only good for academic reasons to show off organization...
Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
Big brother is watching!
I hope any changes that happen to the file system also include the removal of the antiquated concept of file extensions for type association. Here is another thing that the Mac does very well: embed the type of a file IN the file. Why not give me a version number and some way to know what program created it?
Back to the original topic, I can't wait for an OFS. Just for my MP3s. Figuring out which folder hierarchy to use for genre/group/album/track is a pain. Let the file system group them for me.
A speech...
... right before the sledgehammer hits.
Early versions of BeOS had a full object orientated file system and found performance was abysmal. This was from a company with no backwards compatibility to worry about and a small OS designed for speed.
In the end Be developed BFS which is basically a standard file system with support for indexes and attributes, an overall much better performing system with most of the benefits of an object orientated file system.
[)amien
Sorry, I don't think Windows has suffered a major file system corruption bug in a public release...
Now Linux, yup, Linux has had plenty of filesystem corruption problems...
Sorry captain zealot but this is one area where you don't win.
A problem I'd really like to see solved is the way that file extensions are registered (and then fought over by programs). Granted, this is in some part the fault of software companies (cough, real, cough), but if a more elegant solution existed to that sort of mess, then maybe it wouldn't be so annoying. It's as if a program of mine that ran ".dum" files found and deleted shortcuts to other programs that ran ".dum" files -- and that's just unacceptable.
Down with MS? Nah, but the benefits listed here of a new FS don't seem to justify its cost (having to reprogram everything to take advantage of it... ouch!).
-Sou|cuttr
It really is an interesting problem. My Wife's iMac only has a 6GB drive. She's always saving info from the web into AppleWorks files. She has been generating a LOT of little itty bitty files. Now our PC file server stores our digital photos and MP3s, and both iTunes and iPhoto make managing that mess quite easy, and Sherlock is SUPPOSED to make finding things in files easier (it kinda does), but does a poor job at it.
Now, my point is, I've actually thought about setting up some form of database so that my Wife can find her info for years to come. But my biggest question is NOT would a database help, I'm sure it would. What I would like to know, is how would the interface for that database look?
Considering what I have seen of XP (I got a copy sitting on a 2GB drive that sits on my shelf), MS knows very little about information management in the UI, and I would expect this problem to not get any better for the majority of PC users, even if the entire file system was one big database.
Burn Hollywood Burn
The point of this new data store isn't necessarily faster searches, although that is one part of it. The idea is to have a common data storage mechanism, used by all programs.
The underlying technology is to replace the NTFS filesystem driver with SQL Server, with a few tweaks. SQL Server already supports using a RAW partition as a data store, so essentially you just have to move the transaction log and descriptive info for the databases into a specific area of the disk. Add a little bit of bootstrap code to ntldr, and slap the SQL Server stuff into the startup driver list, and it's a done deal.
The next step is creating an NTFS compatibility layer -- it would allow you to mount tables as drive letters or network shares. A lot of the information wouldn't be useful when viewed in that fashion, but it would give you a way to run older programs.
Once all your data is in a common data store and can be manipulated as such, it opens up a world of new possibilities. The change will be long and slow; no need to kid about that. It will take years for all the 3rd party programs (and even Microsoft's own apps) to catch up and start taking full advantage of it. It's the same situation Plug & Play was in back in 1995; it sorta worked sometimes, but you couldn't really take full advantage of it. But here in 2002, you really can expect to grab a piece of hardware and slap it in your box without hassles. It took some time, but it eventually paid off.
But... are you having trouble, as I did, thinking of ways to make use of this common data store? Part of that comes from the fact that we've been conditioned and trained to think of data storage in terms of files; it's hard to shift gears... to think outside of the "filesystem" box so to speak.
For one thing, I could see someone emailing me a project. Not some word documents, an excel spreadsheet, and a database zipped into a ZIP file; they just email me the project. When I get it, and open the message, the project opens up presenting me with the various documents (linked to the database of phone numbers for example), and a little yellow stickynote window that has the project leader's actual email text. I didn't have to deal with unzipping the data, rearranging it, then opening the documents separately. Since the "rows" are linked, they open and act as a unit until I tell them to do otherwise.
It gets better though... imagine if I could run a query such as "SELECT f.*, s.filename FROM Folder1 f INNER JOIN folder2 s ON f.datetime = s.datetime"
It can get even more useful because you now have full SQL syntax available to you for manipulating the filesystem, with queries that are lightning fast. Throw in some Stored Procedures, Functions, Views, etc and I can see real possibilities.
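For what it's worth, here is a rough sketch of that kind of query, using Python's built-in sqlite3 module as a stand-in for the hypothetical database-backed filesystem. The table names (folder1, folder2), the column names (filename, modified) and the sample rows are all made up for illustration; nothing here reflects how Microsoft would actually expose it.

```python
import sqlite3


def build_demo_store():
    """Create an in-memory database where each 'folder' is a table (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    for table in ("folder1", "folder2"):
        conn.execute(f"CREATE TABLE {table} (filename TEXT, modified TEXT)")
    conn.executemany("INSERT INTO folder1 VALUES (?, ?)", [
        ("report.doc", "2002-03-01 09:00"),
        ("budget.xls", "2002-03-02 14:30"),
    ])
    conn.executemany("INSERT INTO folder2 VALUES (?, ?)", [
        ("report_backup.doc", "2002-03-01 09:00"),
        ("notes.txt", "2002-03-05 11:15"),
    ])
    return conn


def files_with_matching_timestamps(conn):
    """Pair up files from the two folders that share the same timestamp."""
    query = """
        SELECT f.filename, s.filename
        FROM folder1 AS f
        INNER JOIN folder2 AS s ON f.modified = s.modified
        ORDER BY f.filename, s.filename
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(files_with_matching_timestamps(build_demo_store()))
```

The join is the whole point: with an ordinary filesystem you would have to walk both directories and compare timestamps yourself.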
Natural != (nontoxic || beneficial)
And IBM's AS/400s have been doing this for years. Not to mention BeOS. ReiserFS is of particular interest because it will allow attaching arbitrary objects to any node. The only problem is that we have five next-generation file systems duking it out, so generic Linux will most likely not see the benefits for some time, as nobody will want to program specifically for a filesystem that reaches only a fraction of the user base. It sure would be nice, though, not to have to see .nautilus-metafile.xml strewn about my file system. Sigh! What would be nice is generic system calls for filesystem metadata that would write out .nautilus-metafile's for FSes that don't support metadata, and node metadata for FSes that do. Of course we would need a standard format, but it would be instantly useful.
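A minimal sketch of what such a generic call could look like, assuming Python on a Unix-like box. It tries native extended attributes first (os.setxattr and os.getxattr, which Python only provides on Linux) and falls back to a hidden sidecar file when the platform or filesystem refuses. The function names set_meta and get_meta and the ".name.meta.json" sidecar naming are my own inventions, not any existing standard.

```python
import json
import os

XATTR_PREFIX = "user."  # Linux user-namespace extended attributes


def _sidecar_path(path):
    """Hidden JSON file next to the target (hypothetical naming scheme)."""
    directory, name = os.path.split(path)
    return os.path.join(directory, "." + name + ".meta.json")


def _read_sidecar(path):
    try:
        with open(_sidecar_path(path), encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}


def set_meta(path, key, value):
    """Store metadata natively if possible, else in the sidecar file."""
    try:
        os.setxattr(path, XATTR_PREFIX + key, value.encode("utf-8"))
        return "xattr"
    except (AttributeError, OSError):
        # No xattr support on this platform or filesystem: use the sidecar.
        meta = _read_sidecar(path)
        meta[key] = value
        with open(_sidecar_path(path), "w", encoding="utf-8") as fh:
            json.dump(meta, fh)
        return "sidecar"


def get_meta(path, key):
    """Read metadata from the native attribute, falling back to the sidecar."""
    try:
        return os.getxattr(path, XATTR_PREFIX + key).decode("utf-8")
    except (AttributeError, OSError):
        return _read_sidecar(path).get(key)
```

Applications would only ever call set_meta and get_meta, so they would not need to care which of the competing filesystems is underneath.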
Why fuck with the file system? Why not
just use a database in the first place?
There doesn't seem a reason to have file
system semantics for this sort of thing.
Especially when there are so many database
tools in place.
Unfortunately, Microsoft has exactly the wrong platform to implement these ideas on. The whole motivation behind this kind of thing is to simplify the software. Microsoft needs to be backwards compatible with 20+ years of cruft, and they have an abysmal record for writing clean, simple APIs.
This will probably end up being just another popular software engineering idea that ends up being superseded by new business plans later on. It will become yet another ossified layer in the lower sediment of their future OSes (see DCOM, etc.).
The submission form is acting weird so here is the link again: http://www.pbs.org/cringely/pulpit/pulpit20010802.html.
This New File system sounds to me like something similar.
Everything is a file, says Unix.
But that was 30 years ago. Perhaps it's time to extend the Unix doctrine: Everything is a file and a directory.
Why? Metadata.
Today's file formats store information about the file inside the file (think ID3 tags) or abuse file attributes (such as the filename (.html)).
With files as both file and directory, there would be no need for that. Imagine: you store information about the authors of a file inside metadata attributes. There would be simple ways to search for this information, so one could easily pick up all "draft" files inside a directory (ls -al *../status=draft, maybe)
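To make the idea concrete, here is a toy in-memory model in Python. The Node class, the attrs dictionary and the find_by_attr helper are purely hypothetical names for illustration; no real filesystem API is implied.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A file that also acts as a directory of metadata attributes."""
    name: str
    data: bytes = b""
    attrs: dict = field(default_factory=dict)


def find_by_attr(nodes, key, value):
    """Return the names of nodes whose attribute `key` equals `value`."""
    return [n.name for n in nodes if n.attrs.get(key) == value]


nodes = [
    Node("intro.txt", b"...", {"status": "draft", "author": "p"}),
    Node("design.txt", b"...", {"status": "final", "author": "p"}),
    Node("todo.txt", b"...", {"status": "draft"}),
]

# Roughly what the imagined "ls *../status=draft" would list.
drafts = find_by_attr(nodes, "status", "draft")
```

Here drafts picks up intro.txt and todo.txt without opening either file's contents.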
p.
NTFS is a very solid filesystem and seems to recover from problems well when something bad does happen. The only complaints I have are slow searches and reports. It takes a LONG time to find a file on a big volume, or to try and do reports on file system usage. A good database system should speed that up tremendously.
The idea of having to rewrite the apps is interesting though. That tells me this is at least 5 years off, and longer before it would be used on a wide scale. But I guess that makes sense: would you be the first shop to put your big fileserver on a new filesystem like that? Not me.
"In the process, the plan could boost Microsoft's high-profile .Net Web services plan and pave the way to enter new markets for document management and portal software, while simultaneously dealing a blow to competitors."
OK I know FAT is antiquated, but NTFS is modern. In fact I recall it was announced at some point 3-4 years ago that OFS wasn't necessary because all the relevant features were being merged into NTFS? Maybe that was an internal announcement, one of the annual "we are finally merging our data stores" emails the top Microsoft brass would send out to the troops.
Anyway I don't see why this would make Windows less likely to break or easier to fix, or what it has to do with .NET...why does that kind of marketing fluff have to be included in a pretty reasonable article (and the sidebar is very nice)?
- adam
So they are incorporating the work of Hans Reiser~! Great idea MS! Perhaps slip in some DRM, maybe some NSA features, as long as they are continuing to appropriate everyone else's ideas.
No, they can't be doing that because as Hans Reiser claims, you can't do that kind of filesystem without modifying the NT kernel[1].
This idea has been around Microsoft since 1994 at least... possibly earlier. It was the whole idea behind the Cairo project.
Simon
[1] Personally, I think he's on crack with that statement, but hey, if he wants to go ahead and sue Microsoft (he claimed as much on the AM-Info mailing list) to get his filesystem working on Windows because he doesn't understand how to write a filesystem driver for Windows, then he can go ahead.
Coming soon - pyrogyra
It is funny, we accuse Microsoft of using other people's ideas - but are we really any better? How much of Open Source development is really just reimplementations of other people's ideas?
When I read this article, I immediately had two thoughts:
Thought 1: "You know, they're right" Current file systems are outdated and are not really serving the needs of modern applications. Take for example, Microsoft Outlook (and Outlook Express). The programming teams for these pieces of software were forced to implement a "filesystem within a file" in order to achieve their design goals (I believe the files are called DBX files). Or take for instance, the Windows Registry, or, even better, the Gnome registry, GConf. Why do programmers have to implement dozens of different abstract filesystems in order to achieve their design goals? Simple, the present filesystems are not sufficient.
Thought 2: "Another way of attacking the Free Software Movement." By creating a new filesystem, Microsoft achieves many goals. First, they make Linux filesystem developers start from scratch again. I mean, the NTFS driver isn't even done, and this means we would have to start over. It gets even worse: From the sound of this article, it seems that OFS would be fundamentally incompatible with our conception of a filesystem today (possibly including features such as resource branches, GUID tags, and other metadata forks, ad nauseam). This would make it difficult to write a usable Linux driver for OFS. And finally, to top it off, my gut tells me that the POSIX file access calls would _not_ be sufficient to access such a rich filesystem. The introduction of a new, richer file access API by Microsoft would make writing cross-platform software much more difficult.
Microsoft can kill two birds with one stone here.
Ben
If you take a look at the XP interface, it feels (to me at least) a lot like a candied up BeOS -- a lot of the icons have a similar look, there's the grouped taskbar items a la the BeOS tracker, etc. And seeing as BeOS has been around for years, it makes a lot more sense that the Microsoft engineers would have been able to start reimplementing ideas like this by this point.
And now we start seeing articles like this one, and it becomes clear that just as the XP interface has started to resemble BeOS, the XP native filesystem is starting to resemble BFS. This isn't the first time in recent months that we've seen reports of this -- not long ago there were articles saying that MS wanted to ditch Access and its Jet engine (or whatever it runs now), and turn the SQL Server engine into the core of the next generation filesystem. This is of course exactly what Be wanted to do, but couldn't due to performance constraints, so they went with the scaled back object oriented system instead. Hey look at that, now we hear that Microsoft is also going with an OO-FS instead of a full SQL-FS.
Microsoft already ran Be out of the market, and are rightfully getting sued now for doing so. I wonder if Be would be willing to use this increasingly familiar evolution for Windows as evidence that Microsoft wanted to eliminate their strongest OS competition while ripping off all their good ideas. As much as it's vindicating to see that BeOS's best features will live on in new versions of Windows, I'd rather have the chance to see the original around today...
DO NOT LEAVE IT IS NOT REAL
It's called BeOS...
It's kinda intriguing how a company with a decent, fast, stable, innovative product like the BeOS (which, by the way, goes belly-up forever on March 15, 2002), and which has filed an anti-trust suit against Microsoft, suddenly has one of its most publicized features (the SQL-like query engine and attributes built into the file system) being pronounced by the idiotic mass-media mavens as a new, innovative Microsoft idea... My gawd - don't we even wait until the company's bones have started to rot before we begin stealing?
"We've been working hard on the next file system for years [since the early 1990's], and -- not that we've made the progress that we've wanted to -- we're at it again," Ballmer said.
While the Cairo project eventually resulted in Microsoft's Windows 2000 operating system, the file system work was abandoned because of complexity, market forces and internal bickering. "It never went away. We just had other things that needed to be done," Jim Allchin, the group vice president in charge of Windows development, told News.com.
Those other things most likely included battling "Netscape and Java and the challenge of the Internet and the Department of Justice," Gartner Group analyst David Smith said--issues that continue to persist today.
<snip>
The more important reasons for the renewed development effort, however, are strategic. If the plan succeeds, it will give Microsoft a huge technological advantage over the competition by making its products more attractive to buyers and giving large companies another reason to install Windows-based servers.
So if they hadn't been trying so hard to kill off Netscape, they would have had the time to spend on creating this. Something that seems to offer actual advantages to the user, and that would be "a huge technological advantage over the competition by making its products more attractive to buyers."
I wonder how many other genuine advances have been put on hold in the name of destroying someone else first.
Nope, no sig
Seriously, though: how are you going to search the content of one of those files, anyway? AFAIK, searching images for content is very rudimentary (try Google's Image Search feature, which is the best thing out there but still pretty bad), and searching audio or video....forget about it. The only moderately successful approach I've seen is the metadata that Fasttrack clients (KaZaA, Grokster, and formerly Morpheus) track, but I'm pretty sure all that has to be entered in by hand, and it's usually wildly inaccurate.
If you want text to be easily searchable, you're best off sticking with plain text. For binaries, the best scheme for now is probably some sort of embedded metadata scheme like ID3 tags for MP3s, but ultimately, that metadata has to be added manually (although you could store such metadata in a database like CDDB to automate metadata creation when ripping CDs, for example).
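As a concrete example of embedded metadata, here is a small Python sketch that reads an ID3v1 tag, the fixed 128-byte trailer at the end of an MP3 that starts with the bytes "TAG". It works on raw bytes rather than a real file, ignores the newer ID3v2 format entirely, and the helper names (read_id3v1, make_id3v1) are just mine for the demo.

```python
def read_id3v1(data):
    """Parse an ID3v1 tag from the last 128 bytes of `data`, or return None."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are fixed width, padded with NUL bytes (or spaces).
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre_id": tag[127],
    }


def make_id3v1(title, artist, album, year):
    """Build a 128-byte ID3v1 tag for demo purposes (genre byte 255 = unset)."""
    def pad(s, width):
        return s.encode("latin-1")[:width].ljust(width, b"\x00")

    return (b"TAG" + pad(title, 30) + pad(artist, 30) + pad(album, 30)
            + pad(year, 4) + pad("", 30) + bytes([255]))


# Fake "audio" bytes followed by a tag, standing in for a real MP3 file.
fake_mp3 = b"\xff\xfb" + b"\x00" * 200 + make_id3v1("Song", "Band", "Album", "2002")
info = read_id3v1(fake_mp3)
```

Somebody still had to type "Song" and "Band" in at some point, which is exactly the limitation described above.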
"It take 9 months to bear a child, no matter how many women you assign to the job."
Aside from the fact that MS is evil and doesn't innovate, it is good to see MS borrowing from other companies and academic research to improve and fix Windows. Though it scares me a bit to think they are using code from SQL Server. I don't consider MS SQL Server an enterprise-quality RDBMS, but maybe it has improved since SQL Server 7.x.
Faster Searches: If someone thinks that their search needs to take 10 seconds less, what is wrong with the current Indexing Service that is in win2k/xp?
More Comprehensive Searches: Why not have the current search program be able to read more file formats than just simple text files? There is no reason to force a database on the system.
Windows is less likely to break: Why would I believe that? That has been said about every version of Windows. I believe that with a newer OS, there will just be a newer set of bugs for MS to [hide from public]/[deal with]. Programmers make mistakes. It has also been shown that you can't test non-trivial code for absolute correctness. Windows will always break in one way or another, just like any other piece of code, except that we must continue to rely on MS for the fixes.
Indiscriminant Rants
I share your concerns about interoperability and the upgrade cycle. However, this move is a good thing. All the major OSes out there need to think about file systems as more than just a filename and data. This is because humans are capable of doing more than that, and we shouldn't be limited by what computers used to be able to do.
For example, think of a boring filing cabinet. While you can just create dividers and add more filing cabinets, humans do much more than that. For example, at my doctor's office, they color code each folder. In addition, they apply stickers to the tab on the folders for certain indicators (last update of the file by year, insurance type, which doctor I usually see, since there are several there, etc). As a result, simply by scanning the shelves, they can tell a number of things about the files without having to pull them out and open them up.
Same thing about the FS. It would be nice to be able to tell something about the file without having to issue the open call. That's a good thing. Currently, most apps limit themselves to one hint (the extension). What's wrong with more?
The things you pointed out in your message, while valid questions, are mostly elementary engineering problems. For example, the two file/different extension problem can be solved a number of ways (MacOS already has to deal with this condition, for example).
In the future, it might be better to have RFCs for a new, standard FTP or whatever that allows a metadata section as part of the DATA transfer. This wouldn't be too hard either (HTTP already could get away with this, since you can define whatever headers, more or less, that you want).
The real concern is interoperability. I can imagine a "compatibility mode" for network aware services, like file sharing or FTP/Web, that present file names to the remote user in the old filename.extension format. That's actually almost trivial.
Namesys is already working on reiserfs (which does something similar). BeOS had something similar, too. NTFS already ran into part of the problem (which stream do you want). We shouldn't hold back because it might require some compatibility for a short period...
Realize that I'm a big skeptic when it comes to Microsoft... I'm worried that they won't do anything to help non-Windows interop. But as an idea, I'm all for the updated FS. To borrow your own phrasing, a file is a file, but I want more hints to help user applications like searching through files (how do you know a file contains an ID3 tag?)
But I like this concept.
Sujal
politics, food, music, life: FatMixx
You may have misunderstood. They were planning on spending the entire month of February, shortest month of the year, focused on bug fixes. It's March now. They're done focusing. Of course, we're all anxiously waiting to see the results of this focus.
JOhn.
Let's see...Office uses "filesystems within files" named "OLE"...Microsoft is re-vamping the filesystem for Windows...the new Windows FS standard is named "OFS"...see a connection? Hint: if all your major apps are already effectively implementing their own filesystem in userspace (read: slowly), why not move those capabilities into the kernelspace drivers?
Personally, I think this is the right direction to go; just like Reiser's papers on "next-generation" versions of ReiserFS (with keyword and metadata searching built-in), along with the extensive work being done on VFS abstractions for almost every OS. Hell, you could go back to Be's attribute indexing on BeFS, or even the old MacOS resource fork.
Basic UNIX filesystem trees are far from the last word in this area; if they were, no one would need MySQL to support all their webapps. (Actually, it would be interesting to get the MySQL guys involved in some filesystem design work; I'm sure they would have an interesting perspective to offer.)
This is only the public unveiling of a technology that has been under development for some time, probably as part of the Cairo project. Our first glimpse of it was actually in the first Halloween Memo of 1998, where it was referred to as 'Storage+'.
Eric's summary from the relevant section:
"I'm told by a former Microserf that the references to "Storage+" here and in the executive summary are much more significant than they seem. MS's plan for the next few years is to move to an integrated file/data/storage system based upon Exchange, completely replacing the current FAT and NTFS file systems. They are absolutely planning on one monolithic structure, called "megaserver", as their next strategic infrastructure. The lock-in effect of this would be immense if they succeed. "
The only time I've LOST files off of any FAT16/32/NTFS drive was back in the FAT16 days. I made the youthful mistake of getting DOS 6.2 from a BBS (at 14.4 those days... thank goodness.. well, compared to a 2400). I then proceeded to install a DoubleSpace drive. Then, I got tired of it, removed the driver from config.sys, and deleted the DoubleSpace file. Well, after random files started disappearing from our computer, I called Gateway 2000 and their tech informed me that MS had something going on "behind the scenes" that would seriously screw up if you deleted the compressed drive file. Well, this was my father's computer (with all of his law firm's documents on it), so I had to spend the ENTIRE night copying all of his files to 250 1.2MB 5.25" floppies. Then I had to fdisk, format, and re-install DOS/Win. Needless to say, I have NEVER messed with drive compression since.
Sorry for the long post. Crazy flashback. It was one of those moments where you are infinitely wiser immediately afterward. Now that I look back on it, the only reason I'm so comfortable with computers is that I broke them so often. Eventually there was no one else to call, no more books to read, and no INTERNET (well, shell dialup doesn't count). The fear of a 260 lb. 6 foot 5 father whose law firm is on the line is an amazing motivator.
.. because we can no longer trust what we see on the desktop. You would be able to hide the true identity of a file from the user, like renaming 'evil_virus.exe' to 'naked.gif', giving it the standard icon of a .gif file, but keeping the meta-information ("windows executable"). No user will check the meta-information before clicking on this file.
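One possible mitigation, sketched in Python: have the shell compare what the name claims against what the content actually is, and warn on a mismatch. Here the leading "magic" bytes stand in for the stored meta-information ("MZ" for Windows executables, "GIF87a"/"GIF89a" for GIFs). The two lookup tables and the looks_disguised function are hypothetical and far from complete.

```python
import os

# Leading bytes that identify a file's real type (tiny illustrative subset).
MAGIC_SIGNATURES = {
    b"MZ": "windows executable",
    b"GIF87a": "gif image",
    b"GIF89a": "gif image",
}

# What a user would assume from the file name alone.
EXTENSION_TYPES = {
    ".exe": "windows executable",
    ".gif": "gif image",
}


def sniff_type(header):
    """Guess the real type from the first bytes of the file, or None if unknown."""
    for magic, kind in MAGIC_SIGNATURES.items():
        if header.startswith(magic):
            return kind
    return None


def looks_disguised(filename, header):
    """True when the extension claims one type but the content is another."""
    ext = os.path.splitext(filename)[1].lower()
    claimed = EXTENSION_TYPES.get(ext)
    actual = sniff_type(header)
    return claimed is not None and actual is not None and claimed != actual
```

With this, looks_disguised("naked.gif", b"MZ\x90\x00") comes back True, so the UI could refuse to show the .gif icon or at least pop up a warning before running it.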