'Storage' to Replace Traditional Filesystems?

Windows? by Iron+Monkey543 · 2003-09-05 00:53 · Score: 1, Informative

I thought the current Windows file system was already database-based? Not that I know anything on this matter, I just thought I read it somewhere. Can someone enlighten me, please?

Re:Windows? by henbane · 2003-09-05 00:54 · Score: 1, Informative

Longhorn will be database based.
Why don't these people just put some effort in reiserFS?
Re:Windows? by Zocalo · 2003-09-05 01:18 · Score: 5, Insightful

Not quite, NTFS is a traditional file table with some bells and whistles, but it's not a "database" in the sense meant here(1). The next version of Windows, "Longhorn", is supposed to introduce a new file system called WinFS that will use a version of SQL Server as its backend. Whether they will actually deliver or not is another matter, since we were promised this in 1995 with Cairo and Taligent (remember them?), and now that Longhorn appears to have been pushed back...
There are also issues with gaining acceptance for the change in the way things work. This kind of thing has not really been done on a large scale in the wild before, on any OS, so whether people will be willing to accept the security and reliability issues that may ensue is another matter. For example, what are the implications of a compromise in the database engine? MS is planning on using SQL, so if things go awry and it becomes possible to maliciously inject raw SQL to the filesystem interface... Oops. On the other hand, the benefits for data retrieval are *huge*. Imagine being able to find any audio files on your entire system by Justin Timberlake or Britney Spears and delete them all in one go by searching on the tag fields! ;)
(1) Technically, all filesystems are databases, it's just that current ones are a collection of flatfile database tables that can point to each other, generally in a hierarchical manner. When people say "database" in the same sentence as "filesystem" they usually mean "relational database". As an aside however, high end databases usually forgo the need for a file system and provide the ability to write their tables directly to disk on a dedicated partition.
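
To make the injection worry concrete, here is a minimal sketch. It assumes Python's sqlite3 as a stand-in for whatever engine a database filesystem would really use, and the `files` table, its columns and the helper names are made up for illustration. It contrasts a delete-by-tag built by string concatenation with one that binds the tag as a parameter.

```python
import sqlite3

def make_index():
    # Hypothetical tag index: one row per file, keyed by path.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE files (path TEXT PRIMARY KEY, artist TEXT)")
    db.executemany("INSERT INTO files VALUES (?, ?)", [
        ("/music/a.mp3", "Justin Timberlake"),
        ("/music/b.mp3", "Britney Spears"),
        ("/music/c.mp3", "Shakira"),
    ])
    return db

def delete_unsafe(db, artist):
    # String concatenation: the tag value becomes part of the SQL text.
    db.execute("DELETE FROM files WHERE artist = '" + artist + "'")

def delete_safe(db, artist):
    # Bound parameter: the tag value is never parsed as SQL.
    db.execute("DELETE FROM files WHERE artist = ?", (artist,))

def remaining(db):
    return [row[0] for row in db.execute("SELECT path FROM files ORDER BY path")]

hostile_tag = "x' OR '1'='1"

db = make_index()
delete_unsafe(db, hostile_tag)
wiped = remaining(db)          # every row matched the injected OR clause

db = make_index()
delete_safe(db, hostile_tag)
untouched = remaining(db)      # no artist is literally called that, nothing deleted
```

With the concatenated version a hostile tag value empties the whole index; with the bound parameter it is just an odd artist name that matches nothing.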

--
UNIX? They're not even circumcised! Savages!
Re:Windows? by Anonymous Coward · 2003-09-05 01:27 · Score: 3, Funny

SELECT * FROM videos WHERE name LIKE '%porn%'
Re:Windows? by rwven · 2003-09-05 01:34 · Score: 1, Interesting

ah, reiserfs... i use it for everything. i would totally have to agree with you on that.

from what i hear though, database driven file systems are quite slow... can someone clarify that for me?
Re:Windows? by Mysticalfruit · 2003-09-05 01:38 · Score: 1

The problem I see with this is the problem that plagues all databases... garbage in, garbage out.

If I had to go through and fix all the ID3 tags on my MP3s so that my filesystem would give me adequate searching facilities, Longhorn would be long since released...

In the same realm, I have close to 150 gigs' worth of TV shows archived in DivX format. The prospect of going through all of them and putting correct tags in sounds daunting...

However, I do suspect that any robust interface would take a look at the tags, and if they are empty attempt to parse the filename.
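
A rough sketch of that fallback, assuming a made-up naming convention of the form "Show Name - S01E02 - Episode Title.avi". The pattern, the field names and `guess_tags` are illustrative, not anything a real filesystem ships with. Existing non-empty tags win; the filename only fills the blanks.

```python
import re

# Hypothetical convention: "Show Name - S01E02 - Episode Title.ext"
PATTERN = re.compile(
    r"^(?P<show>.+?) - S(?P<season>\d+)E(?P<episode>\d+)(?: - (?P<title>.+))?$"
)

def guess_tags(filename, tags=None):
    """Return a copy of `tags` with empty fields filled in from the filename."""
    tags = dict(tags or {})
    stem = filename.rsplit(".", 1)[0]        # drop the extension, if any
    match = PATTERN.match(stem)
    if match:
        for key, value in match.groupdict().items():
            # Only fill fields that are missing or empty.
            if value is not None and not tags.get(key):
                tags[key] = value
    return tags
```

For example, `guess_tags("Firefly - S01E02 - The Train Job.avi")` yields show, season, episode and title, while a file that does not follow the convention comes back with its tags unchanged.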

--
Yes Francis, the world has gone crazy.
Re:Windows? by Anonymous Coward · 2003-09-05 01:48 · Score: 0

Actually, the BeOS prereleases many years ago had a database instead of a traditional filesystem. But computers were slower in those days and the technology was not there yet: a forced reboot meant many minutes of checking the database integrity. So in the actual releases it was replaced by a database-like journaling filesystem (the most robust journaling filesystem I have encountered so far, I might add).
Re:Windows? by Zocalo · 2003-09-05 02:22 · Score: 2, Insightful

However, I do suspect that any robust interface would take a look at the tags, and if they are empty attempt to parse the filename.
Actually, I was just thinking about this problem, and you know what would make a *really* easy solution and is readily available already? P2P! Think about it; a new file arrives on the system by whatever means, so the file system has zero idea about its nature beyond what's available from the file. We probably know the type of file from its header, extension or whatever other "file" command type trick was required. We also know its size, any tag type information that may be present, the filename, and we could maybe calculate a checksum too. So we fire off a P2P query with what we have and what we want to know, then wait for responses.
Sure, you will probably get responses that conflict, so some kind of progressive weighting and elimination system is required. If you search on Kazaa and look at the meta info returned, it's fairly easy to see what is correct and what is not; automating this analysis is the next step. There is also the probability of CDDB type services springing up to act as the "Supernodes" of such a system, or as dedicated standalone services.
Of course, you probably wouldn't want the OS doing this for you automatically. Imagine the fun and games that would ensue if you started getting Bill G. sending out P2P queries to fill in the meta tag blanks on a document about "increasing revenue through tweaking our licensing strategy again"! ;)
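
One way the weighting and elimination could look, as a sketch only: each peer response carries a trust weight, values are normalised, and a field is accepted only if one value holds a clear majority of the weight. The weights, the `min_share` threshold and the `pick_metadata` name are all assumptions for illustration; no real P2P protocol is being described.

```python
from collections import defaultdict

def pick_metadata(responses, min_share=0.5):
    """responses: list of (weight, {field: value}) tuples from peers.

    Returns {field: value} for fields where one normalised value carries
    more than `min_share` of the total weight; contested fields are dropped.
    """
    votes = defaultdict(lambda: defaultdict(float))
    for weight, fields in responses:
        for field, value in fields.items():
            votes[field][value.strip().lower()] += weight

    result = {}
    for field, candidates in votes.items():
        total = sum(candidates.values())
        best, score = max(candidates.items(), key=lambda kv: kv[1])
        if total > 0 and score / total > min_share:
            result[field] = best
    return result

answers = [
    (1.0, {"artist": "Shakira", "title": "Whenever"}),
    (1.0, {"artist": "shakira ", "title": "Wherever"}),
    (0.5, {"artist": "Britney"}),
]
agreed = pick_metadata(answers)
```

Here the artist is settled because two of the three peers agree once case and whitespace are ignored, while the title stays blank because the vote is split evenly.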

--
UNIX? They're not even circumcised! Savages!
Re:Windows? by laird · 2003-09-05 02:33 · Score: 1

The idea isn't that you'd manually put in metadata for files, but that applications would use filesystem metadata instead of internal data stores. So your photo organizer would store title, date/timestamp, etc., as filesystem metadata so that it could be used outside the photo organizer. (Note: there's tons of metadata captured automatically by digital cameras; it's just not accessible outside of a photo organizer app.) Or, an example that the Be folks used, your email program could store each mail message as a file with metadata for date, sender, subject, etc. So you could see all email sent to you by "bob" by opening a window in your file browser and telling it to show you things of type "email" with sender "bob". And as more email came in (an asynchronous process that just writes files to the filesystem) it'd immediately appear in the window. Amazingly cool.

--
Enable 3D printed prosthetics!
Re:Windows? by naph · 2003-09-05 02:48 · Score: 1

i believe that actually they have scaled back their plans a great deal. WinFS is going to use NTFS as its main system of file storage; WinFS is just another layer on the top that uses SQL Server to manage file attributes. It's not as ambitious as they originally planned.
ReiserFS 4 is worth a serious look if you're interested in file system development. there was a great article over on kuro5hin about it a few weeks ago.

--
"if i'd known it was harmless, i'd have killed it myself"
Re:Windows? by Anonymous Coward · 2003-09-05 02:57 · Score: 0

So ... if this Linux idea is "innovative" ... and Microsoft have been working on it since 1995 ... then Microsoft are innovative.

Haha Slashdot implied Microsoft are innovative.

Eat my goal.
Re:Windows? by fyonn · 2003-09-05 03:04 · Score: 1

you know Zee, I remember discussing the viability and usefulness of database driven filesystems about a year or two ago at our last co and you swore blind that they were a waste of resources with very little practical application :)

I have to admit, I can see some usefulness of this kind of approach depending on how it's done. the ability to keep queries on the fs around as "virtual directories" for example could be very useful. I like the idea of:

mount_sql /var/maillogs 'select files from "logfiles" where program = "exim" or program = "courier-imap"'

or even the ability to create raw files from the realtime merge of several files. I know that we can effectively do a lot of this now, but I do think it holds a lot of promise as a new FS paradigm, esp if one adds the ability to extract meta data from the files themselves, the way that reiser4 was promising us. one could have standard filesystem plugins to cater for all the different filetypes you have.

ls 'select files where type="audio" and type="video" and artist="shakira"'

should bring up all your mp3's and avi's of the Colombian singer. of course, this all relies on good metadata, but isn't that what we rely on l33t kazaa rippers for?

dave

PS. yeah, I'm out of practise with my sql statements, so shoot me, you know what I mean :)
Re:Windows? by cens0r · 2003-09-05 04:17 · Score: 1

ls 'select files where type="audio" and type="video" and artist="shakira"'

not to nitpick but this would be better:

ls 'select files where (type="audio" or type="video") and artist="shakira"'
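
The difference is easy to check against a toy table. This sketch uses Python's sqlite3, with an invented `files` table and sample rows, since the `ls 'select ...'` syntax above is hypothetical. It runs the original condition, the corrected one, and the variant that forgets the parentheses.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (name TEXT, type TEXT, artist TEXT)")
db.executemany("INSERT INTO files VALUES (?, ?, ?)", [
    ("song.mp3", "audio", "shakira"),
    ("clip.avi", "video", "shakira"),
    ("other.mp3", "audio", "someone else"),
    ("notes.txt", "text", "shakira"),
])

def query(where):
    # `where` is a fixed string written in this example, not user input.
    sql = "SELECT name FROM files WHERE " + where + " ORDER BY name"
    return [row[0] for row in db.execute(sql)]

# A single row can never have two different types at once: always empty.
original = query("type = 'audio' AND type = 'video' AND artist = 'shakira'")

# The corrected form: either type, but only this artist.
corrected = query("(type = 'audio' OR type = 'video') AND artist = 'shakira'")

# Without parentheses AND binds tighter than OR, so every audio file matches.
unparenthesised = query("type = 'audio' OR type = 'video' AND artist = 'shakira'")
```

The original returns nothing, the corrected form returns the two Shakira media files, and the unparenthesised form also drags in audio by other artists.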

--
Jack Valenti and Orrin Hatch will be first up against the wall when the revolution comes.
Re:Windows? by Anonymous Coward · 2003-09-05 05:33 · Score: 0

I know you were trying to be funny, but....

How about

select * from videos where genre = 'porn'

Or even less funny, but useful;

select * from videos where genre = 'scifi'

select * from videos where name like 'firefly%'
Re:Windows? by netsharc · 2003-09-05 05:59 · Score: 2, Insightful

It's innovative because it's an idea implemented on Linux, whereas when it's to be implemented on Windows it's a lousy idea (well, lousy because of 3rd party compatibility nightmares).

--
What time is it/will be over there? Check with my iPhone app!
Re:Windows? by Hast · 2003-09-05 06:39 · Score: 1

You can use Bitcollider instead. It allows you to make lookups to a database using the hash of a file. (Several different hashes are available.) You can also make comments on the file regarding quality and descriptions, which are then available online.
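
The general idea of looking a file up by its content hash can be sketched like this. It does not use Bitcollider's actual hashes or service; `file_digest`, `lookup` and the in-memory `catalogue` dict are stand-ins invented for illustration, with SHA-1 from Python's hashlib as the default algorithm.

```python
import hashlib

def file_digest(path, algorithm="sha1", chunk_size=65536):
    """Hash a file's contents in chunks so large files need little memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def lookup(path, catalogue):
    """Return the metadata stored under this file's hash, or None.

    `catalogue` is a plain dict standing in for a remote database.
    """
    return catalogue.get(file_digest(path))
```

Because the key depends only on the bytes, renaming or moving the file does not break the lookup, though any change to the contents does.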
Re:Windows? by Anonymous Coward · 2003-09-05 07:13 · Score: 0

Imagine being able to find any audio files on your entire system by Justin Timberlake or Britney Spears and delete them all in one go by searching on the tag fields!

Yes, I have this problem all the time.
Re:Windows? by leandrod · 2003-09-05 07:30 · Score: 1

> Microsoft have been working on it since 1995

Before that. MS Access was intended to be a development platform for Jet, and MS Windows would eventually gain a Jet-based filesystem. Only Jet was considered harmful, so MSDE was substituted and the whole thing delayed a decade or so.

--
Leandro Guimarães Faria Corcete DUTRA
DA, DBA, SysAdmin, Data Modeller
GNU Project, Debian GNU/Lin
Re:Windows? by Anonymous Coward · 2003-09-05 07:37 · Score: 0

>RieserFS 4 is worth a serious look if your interested in file system
>development. there was a great article over on kuro5hin [kuro5hin.org]
>about it a few weeks ago.
>
>
Did they mention just how totally worthless and useless filesystems like ReiserFS 4, 'Storage' and the crap Microsoft is working on will be in the *REAL WORLD*?

Sorry, I just don't give a shit about any of the crap that's coming out of "the filesystem development community". It's basically all a bunch of bullshit that causes nothing but problems when you try accessing the data contained in these lame-assed concepts 5 to 10 years later. Isn't it interesting just how "the filesystem development community" keeps trying to inflict these idiotic filesystems upon the world. It's like they are actively trying to create systems in which people have to jump through all sorts of hoops and other garbage to access their data.

Spare me from this bullshit, please!
Re:Windows? by __past__ · 2003-09-05 08:03 · Score: 2, Funny
Why don't these people just put some effort in reiserFS?
- Because some people value their data
- Because some people think "free software" doesn't mean "software you are free to modify as long as it doesn't interfere with Hans Reiser's business interests"
--
Programming can be fun again. Film at 11.
Re:Windows? by leandrod · 2003-09-05 08:53 · Score: 1

> When people say "database" in the same sentence as "filesystem" they usually mean "relational database".

They may think "relational", but usually they actually mean only SQL. Unfortunately SQL is just a corruption of the relational model...

--
Leandro Guimarães Faria Corcete DUTRA
DA, DBA, SysAdmin, Data Modeller
GNU Project, Debian GNU/Lin
Re:Windows? by fyonn · 2003-09-05 13:30 · Score: 1

err yeah, what he said :)

dave
Re:Windows? by koh · 2003-09-08 01:34 · Score: 1

I don't like to nitpick either, but this would be even better :

ls 'select files where (type="audio" or type="video") and artist!="shakira"'

Just my 2 cents ;)

--
Karma cannot be described by words alone.

i think by Tirel · 2003-09-05 00:55 · Score: 2, Interesting

it's better for programs to abstract data like that, the fs should only provide access to the medium, nothing else.

Re:i think by Anonymous Coward · 2003-09-05 01:06 · Score: 0

Yeah man, I mean totally far out...
It's just like the drugs man, they give you access to the medium man, it's like nothing else
Re:i think by eric76 · 2003-09-05 01:09 · Score: 1

I agree completely.

On the surface, using a database type file system where files are just objects stored in the database along with other things seems like a great idea.

But I think that the result will probably be less resilient to damage and result in an increased possibility of losing your data or finding it corrupted.
Re:i think by laird · 2003-09-05 01:39 · Score: 4, Interesting

I disagree, strongly. Files are an artifact of a bunch of bad implementation decisions when stripping Multics down to produce UNIX. What programmers want to be able to do is manipulate data structures and store them persistently. What files force you to do is waste tons of time writing code to take your data structures and write them out as sequences of bytes and read them back in.

One OS that solved this nicely was NewtonOS. If you wanted to manipulate persistently stored data you opened a "soup" that contained objects. So if you wanted to, say, set up an appointment with someone for lunch, you could find the person in the address book "soup" and then create an entry in the databook "soup" recording the appointment, which would immediately appear in all other apps that dealt with appointments (because apps accessed the same data structures, and were notified of changes so that they could update). So your data was not trapped in a particular application's proprietary format, and users weren't forced to learn the artificial concept of a "file" but instead could think about "my appointments" or "my address book".

If you haven't tried it, don't knock it. As a developer, and as a user, it was wonderful -- much more straightforward than "files" and "directories".

--
Enable 3D printed prosthetics!
Re:i think by micromoog · 2003-09-05 01:51 · Score: 1

One OS that solved this nicely was NewtonOS. If you wanted to manipulate persistently stored data you opened a "soup" that contained objects. So if you wanted to, say, set up an appointment with someone for lunch, you could find the person in the address book "soup" and then create an entry in the databook "soup" recording the appointment, which would immediately appear in all other apps that dealt with appointments (because apps accessed the same data structures, and were notified of changes so that they could update). So your data was not trapped in a particular application's proprietary format, and users weren't forced to learn the artificial concept of a "file" but instead could think about "my appointments" or "my address book".
Sounds similar to PalmOS, too.
Re:i think by metalhed77 · 2003-09-05 02:07 · Score: 1

how is this different from having two directories

addressbook and addressbook/appointments

--
Photos.
Re:i think by glwtta · 2003-09-05 02:20 · Score: 3, Insightful

and users weren't forced to learn the artificial concept of a "file"
Um, artificial as they may be, these so-called "files" have been around for some time, in fact long before computers. Users can quite intuitively understand the concepts of "file" and "folder." I really think you are trying to make the difference seem greater than it actually is. (on the user side, that is)

--
sic transit gloria mundi
Re:i think by jeremyp · 2003-09-05 02:56 · Score: 1

Except that the computing meaning of the term "file" is different from the general usage meaning of the term file. Out there in real-space a file is a collection of documents relating to one subject, pretty much synonymous with what in Unix is a directory. e.g. my personnel file is probably a folder or section in a filing cabinet which contains a number of documents such as my contract of employment and absentee record etc.

--
All I want is a secure system where it's easy to do anything I want. Is that too much to ask ~~ Randall Munroe
Re:i think by mangu · 2003-09-05 02:57 · Score: 2, Insightful

What programmers want to be able to do is manipulate data structures and store them persistently

What programmers want is to be able to manipulate data. Period. What's so good about unix is that everything is a "file"; for instance, you can manipulate data coming from a sound card with exactly the same code you use to manipulate a sound file. You can't do this with this so-called "storage".
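
A small sketch of that point: the function below only asks for something with a `read()` method, so it neither knows nor cares whether the bytes come from a file on disk, an in-memory buffer, or a device node. The function name and the 16-bit little-endian sample format are assumptions for illustration; on a system with OSS-style audio one would expect `open("/dev/dsp", "rb")` to work as the stream too, but that is not exercised here.

```python
import io
import struct

def peak_amplitude(stream, chunk_bytes=4096):
    """Largest absolute signed 16-bit little-endian sample read from `stream`."""
    peak = 0
    pending = b""                      # carries an odd trailing byte between reads
    while True:
        chunk = stream.read(chunk_bytes)
        if not chunk:                  # end of file, or the device stopped
            break
        data = pending + chunk
        usable = len(data) - (len(data) % 2)
        if usable:
            for (sample,) in struct.iter_unpack("<h", data[:usable]):
                peak = max(peak, abs(sample))
        pending = data[usable:]
    return peak

raw = struct.pack("<4h", 100, -2000, 30, 7)
from_memory = peak_amplitude(io.BytesIO(raw))
```

The same call works unchanged on `open("recording.raw", "rb")`, which is the whole attraction of the byte-stream interface.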
Re:i think by tomkins · 2003-09-05 03:04 · Score: 0

I disagree, strongly. Files are a benefit of some good implementation decisions when stripping Multics down to produce UNIX. What programmers want to be able to do is manipulate data and store it persistently. What files let you do is take your data structures and write them out as sequences of bytes and read them back in.

One OS that solved this nicely was UNIX. If you wanted to manipulate persistently stored data you opened a "file" that contained data. So if you wanted to, say, set up an appointment with someone for lunch, you could find the person in the address book "file" and then create an entry in the databook "file" recording the appointment, which would immediately appear in all other apps that dealt with appointments (because apps accessed the same files, and were notified of changes so that they could update). So your data was not trapped in a particular application's proprietary format, and users weren't forced to learn the artificial concept of a "soup" but instead could think about "my appointments" or "my address book".

If you haven't tried it, don't knock it. As a developer, and as a user, it was wonderful -- much more straightforward than "soups" and "sandwiches".
Re:i think by edwdig · 2003-09-05 03:17 · Score: 1

Another interesting approach was Virtual Memory files in GEOS. It was coded for the 8086, so the functionality of an MMU had to be emulated in software. Here's basically how it worked. You'd allocate a block of memory within the VM file. You'd be given a block handle, which you'd lock when you wanted to use it. When done, you'd mark the block dirty and unlock it. Dirty blocks would periodically be written into the file. If you wanted to make a linked list of blocks, rather than storing pointers, you'd store the block handles, as they would be persistent every time the file was opened. The system added very little overhead, and meant that the data you were working on in memory would be synced to disk.

If this was written for a 386, you'd be able to eliminate the need for marking things dirty, and probably for locking and unlocking blocks - you could just use the CPU's segmentation features to automatically do that, although it would result in a lower limit of how many blocks you could have.

There were some nice advantages to this. When a block was first marked dirty, the OS would back it up within the VM file. When a save operation was done on the file, all the backup blocks would be removed. The result of this was all applications using VM files for documents would automatically get an auto-save feature, with the ability to revert to the last time the user chose to save.

Documents would open and save practically instantly due to this design. I remember working on multi-megabyte files on my 286 with 1 meg of RAM without any noticeable speed difference over a small file.
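
A toy model of the scheme described above, to show how lock / mark dirty / unlock, background flushing, save and revert fit together. This is not GEOS's API: the `VMFile` class, its method names and the use of dicts in place of a real file are all invented for illustration.

```python
class VMFile:
    """In-memory toy; `disk` stands in for the blocks stored in the VM file."""

    def __init__(self):
        self.disk = {}       # handle -> bytes currently written to the file
        self.memory = {}     # handle -> bytearray working copy
        self.backup = {}     # handle -> contents at last save (None = did not exist)
        self.dirty = set()
        self.next_handle = 1

    def alloc(self, size):
        handle = self.next_handle
        self.next_handle += 1
        self.memory[handle] = bytearray(size)
        self.mark_dirty(handle)
        return handle

    def lock(self, handle):
        # Load the block on demand and hand back the working copy.
        if handle not in self.memory:
            self.memory[handle] = bytearray(self.disk[handle])
        return self.memory[handle]

    def mark_dirty(self, handle):
        # The first time a block is dirtied after a save, keep its saved state.
        if handle not in self.backup:
            self.backup[handle] = self.disk.get(handle)
        self.dirty.add(handle)

    def unlock(self, handle):
        pass  # a real system could now evict or move the block

    def flush(self):
        # The periodic write-back of dirty blocks.
        for handle in self.dirty:
            self.disk[handle] = bytes(self.memory[handle])
        self.dirty.clear()

    def save(self):
        self.flush()
        self.backup.clear()   # the current state becomes the new baseline

    def revert(self):
        # Put back every block as it was at the last save.
        for handle, old in self.backup.items():
            if old is None:
                self.disk.pop(handle, None)
            else:
                self.disk[handle] = old
        self.memory.clear()
        self.dirty.clear()
        self.backup.clear()
```

Blocks refer to each other by handle rather than by pointer, so a linked list stored this way survives being closed and reopened, and revert falls out of the backup copies for free.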
Re:i think by Anonymous Coward · 2003-09-05 03:21 · Score: 0

Not quite.

There's a thing called a "file folder", remember? It's a "folder" that holds "files", which are the little pieces of paper you care about.

So "folders" contain "files", which are things you store stuff on.

Seems simple enough.

Now I need to learn SQL to find stuff, and if I'm not looking for something in particular, I just do a *.pdf search and wade through the mess? Why don't I just slam my head into a brick wall until I die from trauma? That seems more convenient.
Re:i think by ahfoo · 2003-09-05 04:19 · Score: 1

That part of it made me think of fair use. I know, one track mind here. But I think it's quite relevant.
When you present the metaphor of files and folders to a judge in a court of law, it tends to oversimplify what's really going on to the detriment of fair use. A completely different file system would provide a concrete example that complicates this metaphor and I think it could only help because the oversimplification of how data operates in a computer is a huge part of how bullshit anti-technology laws get passed and upheld by the courts.
But I was a bit put off by the part about cutting edge techniques from the linguistics community. I suppose I have a chip on my shoulder when it comes to linguistics. Linguistics, sociology and psychology are all rather unfortunate fields of study in my opinion. I think Adorno and Benjamin are with me on that point, but that's another story.
Re:i think by spitzak · 2003-09-05 04:20 · Score: 3, Interesting

"Files" are not a bad idea. It is nice to have an interface of commands that is limited in size and easily serialized (ie open/read/write/seek). If Unix had instead mmap'd files in it's original design there would probably not be transparent access to network file systems or many of the other things we take for granted today. So the design of files was actually a huge win.

1. The primary problem is implementation. Filesystems today are designed to store small numbers of very large files (ie more than 1K in size). Anybody who wants to store "objects" that are smaller than about 1K in size (like if you are implementing a "registry", for instance) is forced to write or use a database program, with needless complexity, to force all this data into a single file, so that it can be stored efficiently. What we need is a design where tiny files (like 4 bytes) can be stored efficiently.

Supposedly ReiserFS addresses this, but it is not clear if it does the necessary level of compression: ideally if you had 100 files with the same 50-byte name and 1 byte stored in them, all those names would be in the same 50 bytes on the disk.

Sadly NOBODY seems to be trying this, and people keep spouting "attributes" and "registry" and "config file". Those are all work-arounds for poor file systems.

All files must have the capability of being a "folder" and having subfiles. Any time anybody says "attributes" this should mean this sort of subfile.

2. The other problem is the blinders so people believe the "filename" is some sort of user-friendly data. This leads to brain-dead ideas like "case independence" and "wide characters" and the fact that certain bytes like "/" and zero are disallowed. This requires programs to cook data in strange ways to use it as indexes into the filesystem. This used to be true of the *data* in old systems, and we know now how horrid that was (only a rudimentary piece of that old stupidity remains in Windows text/binary distinction but I hope newer Windows systems will move that out of the kernel).

The filesystem should identify files with a counted length of bytes, just like the data in the file. In fact "name" should be a subfile of any file, and you rename it by writing a new "name". I don't think this can be solved without fixing existing filesystems.

(for "user friendly" names some form of quoting is going to be necessary. Since Windows has made "\" useless I would use that for quoting. "\0" is a null, "\\" is a backslash, "\/" is a forward slash. Just "/" itself indicates a break between hierarchy levels. For semi-Windows compatibility you can also make just "\" followed by an unassigned code also mean a break between hierarchy levels.)

3. The other thing that is needed (but could be done atop existing implementations) is to change the model of files. They should be "atomic" in that when you open a file for writing, you get an empty file, and this is invisible to any other program. The file only appears at the moment you close it, and only to programs that then open it for reading (programs with the same name already open continue to see the old file). Current files where you can replace a block in the middle are a special case that only a few programs use, and support can be operating-system dependent (and while you are at it, try making it so you can insert or delete data and not just overwrite).

4. As for "database" this can all be done with symbolic links (which can be implemented atop any file system which efficiently stores identical small blocks of data).
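
Point 3 can be approximated on existing systems with the write-to-a-temporary-file-then-rename idiom. The sketch below is one common way to do it in Python; the `atomic_write` name is made up, and it is an illustration of the idea rather than the filesystem-level change being proposed.

```python
import os
import tempfile

def atomic_write(path, data):
    """Replace `path` with `data` so readers see the old or the new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so the final rename
    # stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())     # push the bytes to disk before publishing
        os.replace(tmp_path, path)     # the new file appears in a single step
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

On POSIX systems a program that already has the old file open keeps reading the old contents after the rename, which matches the behaviour asked for above. The half-written file is still visible under its temporary name, though, so this is only an approximation of a file that is invisible until closed.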
Re:i think by AchilleTalon · 2003-09-05 04:56 · Score: 1

Well, don't you think files are rather water waiting for someone to add some other ingredients to make the soup?
Files are objects that can support any kind of datastructure you wish them to support if you write the appropriate interfaces.
So, as I see it, it's not files vs soup (or any other kind of structure). It's: Hey, here is another datastructure, do you find it useful?

--
Achille Talon
Hop!
Re:i think by Anonymous Coward · 2003-09-05 05:04 · Score: 0

"file" is a collective noun meaning an orderly succession. Think "rank and file". The use of the term in computing was never particularly congruent with real-world files, and in my experience has only ever confused new users.

"file" is also a verb meaning approximately "to store in an orderly manner for later retrieval", the meaning derived over a short time when people started storing their documents in "files". But in the real world, one thing it never meant is one single document. That's a novel usage created out of thin air at PARC, and probably out of a misunderstanding similar to yours.

"file folders" don't store files; they store documents *in* files. In modern business usage they (the folders) essentially are files ("get me the Krishnamurthi file"), which is what the OP was correctly pointing out.
Re:i think by NuShrike · 2003-09-05 05:09 · Score: 1

A lot of your issues have been solved IMO with HFS and resource forks. I'm not sure if this is your "attributes", but it's been done.
Re:i think by Anonymous Coward · 2003-09-05 05:12 · Score: 0

Get the chip off your shoulder. The social sciences you describe have provided many valuable insights, and continue to. Is there a lot I don't like in them? Sure, but that's the growing process of a relatively young science. I'm not particularly sure I agree with the Copenhagen interpretation either, but I don't assume physics is an "unfortunate field of study".

The reason so few people object to fields like mathematics (including mathematical models in theoretical physics - dimensions in string theory, say), is that they're 100% abstraction with no actual connection to the real world. The social sciences piss people off because they're that much harder to disconnect from reality. That means they're doing their job.
Re:i think by iamacat · 2003-09-05 05:18 · Score: 1

Databases are space-efficient for small objects, have primary keys ("filename" as a counted array of bytes) and certainly support atomic updates. They can also be stored in a raw disk partition if you dislike double overhead. On the other hand, they support atomic updates on a larger scale than writing a single attribute, and fast arbitrary queries (whereas you would have to search thousands of files to build your symlinks every time the user asks). So why reinvent the wheel?
Re:i think by spitzak · 2003-09-05 05:52 · Score: 1

The main difference is that I think the interface that opens these "resources" MUST be the same interface that opens files. IE if you open the file foo with open("folder/foo") then you must be able to open the resource "bar" in "foo" with open("folder/foo/bar"), and NOT with open_resource(fd,"bar").

I should have added "resource" to the list of evil words like "attribute" and "registry". There should be no distinction of these from files either.

Also I realized I left something off about the "relational database" as symbolic links. There need to be "backwards links". In theory these could be maintained at user level, but filesystem support would be ideal and probably more efficient. If "L/A" is a symbolic link to "B" then there is a file under B that is a symbolic link to "L". There also need to be queries that list files that satisfy several link queries at once. Otherwise you just have a hierarchical database. My important point was that all files in existence must be identified by a hierarchical name that is simply derived from one of the database queries that returns the same one.
Re:i think by ahfoo · 2003-09-05 06:38 · Score: 1

Yeah, I'm aware that there are two sides to everything, even --ugh, I hate to even write the phrase-- social sciences. My dear Ma majored in Linguistics at Berkeley in the 60s, so we've argued this for decades now.
I do get the point that both good things and bad things can come from the same origins. But I can't get over the emotions I have towards things that touch my personal life deeply. These academic institutions that call themselves the social sciences are the underpinnings of the current penal system. This is not something that can be casually ignored by someone who has witnessed lives destroyed, literally terminated, by the abuses of that system.
And if that's not bad enough, how about generative grammar! Jesus freakin' Christ. You have to have your head so far up your ass to even consider that as a useful topic and people are still, to this day, writing their theses on that crap. I know because I proofread them every week.
How about psychotherapy? The social sciences have some serious problems and there have been so many years to address them, but I just don't see it. It's just business as usual.
I admit I have a chip on my shoulder here. I think I'm in good company historically on that. But I know there are good sides too. I can think of sex therapy, especially in the 70s, as having been an important and useful field of study, but if you look at what happened institutionally, that kind of research got almost totally ignored once the 80s rolled around. I think if you take the big picture on the institutional role of the social sciences it's bad news. It could be different, but my argument is that the fact is --it's not.
I would argue that both the social sciences and the hard sciences should be de-emphasized in favor of creative arts which, I believe, is the real source of technological and social advance.
Re:i think by outcast36 · 2003-09-05 06:51 · Score: 1

no soup for you

--
Technology Consulting & Free Downloads
Re:i think by soft_guy · 2003-09-05 07:12 · Score: 1

I used to write Newton software for a living. My favorite part was telling other engineers I worked with "The Newton doesn't really have the concept of a file."

--
Avoid Missing Ball for High Score
Re:i think by rwise2112 · 2003-09-05 07:26 · Score: 1

Filesystems today are designed to store small numbers of very large files (ie more than 1K in size). Anybody who wants to store "objects" that are smaller than about 1K in size (like if you are implementing a "registry", for instance) is forced to write or use a database program, with needless complexity, to force all this data into a single file, so that it can be stored efficiently. What we need is a design where tiny files (like 4 bytes) can be stored efficiently

Later versions of Stacker, back in the DOS days when hard drives were small, actually did this. It compressed the whole volume into one file and kept an index to the locations of each file for extraction. Zero slack space.

--

"For every expert, there is an equal and opposite expert"
Re:i think by Anonymous Coward · 2003-09-05 07:40 · Score: 1, Insightful

Except Unix never decided on a standard representation of these data objects. That led to all sorts of bizarre and flaky program problems when the files were just so, and a very low level of integration between different programs.

Any attempt to standardize file data objects (recently, XML) inevitably devolved into a discussion of "MY Favorite Format" and "Look Ma, I wrote another parser!" and "Awk skilz make my dick look bigger" and "At least it's not The Registry!".

Thus, we must declare Unix's approach a failure: Not so much on the technical level and the pure genius of its simplicity, but a failure on the social level in that we gave Unix 20 years to play with its file streams and it did not come back with the applications that people wanted to use.
Re:i think by AntiOrganic · 2003-09-05 08:26 · Score: 1

What, exactly, is keeping you from mounting different filesystems at once?
Re:i think by spitzak · 2003-09-05 09:09 · Score: 1

If stacker modified DOS so that programs that opened and read/wrote files and listed directories would work without change and see these files, then this is the type of solution I am talking about.

The exact method is unimportant. What is important is that the interface for "small files" and for "attributes" and "resources" and whatever be *EXACTLY THE SAME INTERFACE* as for files. Ie open() is used to get all pieces of data.
Re:i think by cpeterso · 2003-09-05 09:11 · Score: 1

If Unix had instead mmap'd files in its original design there would probably not be transparent access to network file systems or many of the other things we take for granted today.

Why would mmap'd files prevent transparent network access? mmap'd files are just a shortcut to random file I/O. Plus there are systems that support distributed virtual memory.
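
To make the "shortcut to random file I/O" point concrete, here is a small sketch using Python's standard `mmap` module on a throwaway temp file (the file contents and offsets are arbitrary). The same three bytes are fetched once with seek-and-read and once by slicing the mapping.

```python
import mmap
import os
import tempfile

# Create a small scratch file to work on.
fd, path = tempfile.mkstemp()
os.write(fd, b"0123456789")
os.close(fd)

# Classic random I/O: seek to an offset, then read.
with open(path, "rb") as f:
    f.seek(4)
    via_seek = f.read(3)

# Memory-mapped: the file looks like a byte array, so a slice replaces seek+read.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as m:
        via_mmap = m[4:7]
        m[0:2] = b"AB"   # a write through the mapping lands in the file
        m.flush()

with open(path, "rb") as f:
    after = f.read()
os.remove(path)
```

Nothing here says how the pages get filled; whether the backing store is a local disk or a remote server is up to the OS.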

--
cpeterso
Re:i think by alext · 2003-09-05 10:35 · Score: 1

Right. A very mainstream business OS - Stratus VOS - had memory-mapped files that worked over the network (i.e. mapping to a remote file). This was in keeping with the remotable-everything philosophy (files, terminals, tape drives, queues etc.) that I believe was shared by the Apollo Domain system, also of mid 80s vintage.
Re:i think by spitzak · 2003-09-05 11:42 · Score: 1

Copy-on-read memory mapping may have worked. However I would expect at the time they would have done shared read/write memory pages and this would have been a nightmare to emulate on a networked file system.
Re:i think by laird · 2003-09-08 05:50 · Score: 1

Several big differences:

First, I deal with objects as objects, not as text strings. That is, I don't write code to parse a text file, or re-write the text file if data has changed. That eliminates tons of persistence code that I usually have to write. So not only do I write way less code to do the same work, but the application runs faster (no wasted time serializing/deserializing all of the data).

Second, I deal with the information as a structured database (object database, not relational), rather than as a bitstream. So, for example, I can say 'give me all appointments for next Wednesday' and get back objects. And I can update one object and since it's persistent it's automatically stored. I don't need to write code to manage 'dirty bits', or append to files, or whatever. Again, my app is simpler, and runs faster.

Thirdly, in a PDA it makes more sense. If I were using files, I'd have to have two copies of my data (one in the 'filesystem' and one loaded into the 'application'). In the NewtonOS, all data was stored in persistent memory (SRAM, Flash RAM) so I could work with it directly without having to 'load' any files into 'memory'.

Fourth, there's a consistency of access. All address book entries are objects of a particular type, so any application can manipulate the address book. Since everything is an object, there are no proprietary file formats to parse, etc. (though of course you could implement your own private classes if you want to). The result is that applications largely manipulate shared data stores.

Yes, you could implement all of this over 'files' but at that point you aren't using files any more, you're using a high level object-oriented API (which is how things should be) that happens to use files for persistence. That's a good thing, but shouldn't that sort of capability be provided by the operating system so that everything can interoperate, rather than having a custom, incompatible persistent layer for every application on your computer?

--
Enable 3D printed prosthetics!
Re:i think by laird · 2003-09-08 11:50 · Score: 1

Worked just fine on the Apollo workstations, back in the 80's. Really, it's just the same as keeping database records consistent when multiple applications are accessing the database at once.

--
Enable 3D printed prosthetics!
Re:i think by captaineo · 2003-09-10 10:44 · Score: 1

Atomic file transactions would be great. I think this is in the plan for Reiserfs v4. It would eliminate SO many little race conditions here and there... (one of which being the program-crashes-while-writing problem, where you end up with a non-zero but otherwise corrupted file)

AFAIK the only things you can count on being atomic right now on Unix are rename() (within the same mount) and small write()s. (theoretically open(O_EXCL) and link() too, but they break on NFS I think). Actually, a Unix-compatible network filesystem that supports atomicity guarantees would be awesome (we've gotten so used to the horrors of NFS it's hard to imagine life without it now...)
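
The usual workaround for the program-crashes-while-writing problem leans on exactly that rename() guarantee: write a temp file, then rename it over the target. A minimal sketch in Python follows; the helper name `atomic_write` is made up for this example, and it assumes the temp file and the target live on the same mount.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write bytes so readers see either the old content or the new, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one mount.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # push the data to disk before renaming
        os.replace(tmp_path, path)   # rename() on POSIX: atomic within one filesystem
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A crash before the rename leaves the old file untouched plus a stray temp file; a crash after it leaves the complete new file. This covers one file at a time only, which is why real multi-file transactions would still be welcome.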

Windows' filesystem by mic256 · 2003-09-05 00:56 · Score: 3, Informative

I think Longhorn will be the first Windows with a database filesystem. It will probably be based on SQL Server

Re:Windows' filesystem by Serapth · 2003-09-05 01:00 · Score: 2, Interesting

Yes, from what I have read, that is true. MS plans to use SQL server 2k3 as the underlying technology for the file system for Longhorn. What I just don't get though... if SQL is going to be used as the file system... then every Longhorn PC in a sense either needs to have SQL (or MSDE) or needs access to a SQL server, which seems unlikely as you would bottleneck on the network speed.

What then happens to SQL as a MS product? If it's built in to every OS, why then would anyone buy it? I've seen MS build other people's apps into their products, but never seen them do it to their own. Are they actually going to kill off a profit centre?
Re:Windows' filesystem by cyclist1200 · 2003-09-05 01:03 · Score: 3, Informative

The filesystem will be based on SQL Server 2003, but it won't be a fully functional version of SQL Server.
Re:Windows' filesystem by lurvdrum · 2003-09-05 01:09 · Score: 3, Interesting

Who owns the patent on this type of filesystem implementation - there must be one? Microsoft, IBM, Seth...SCO?
Re:Windows' filesystem by jsse · 2003-09-05 01:19 · Score: 1

It will probably be based on SQL Server

Oh great! Next time a user calls and asks where's his excel sheet he saved yesterday, instead of teaching him to use 'Find' over phone I can just tell him: "It's not lost, it's just invisible." :)
Re:Windows' filesystem by Zocalo · 2003-09-05 01:25 · Score: 4, Interesting

Good guesses. Replace "SCO" with "Apple" and you probably have the right triumvirate. All three were working on this in 1995 or so - Microsoft was going it alone with "Cairo" (should have been Win2K) and IBM/Apple were working together on "Taligent"/"Pink". Neither project saw the light of day, although whether this was because of the system requirements or a marketing decision based on the paradigm shift is a matter of opinion.
The idea was probably stolen from Xerox Parc in the first place, of course.

--
UNIX? They're not even circumcised! Savages!
Re:Windows' filesystem by tolan-b · 2003-09-05 01:27 · Score: 1

Didn't they say that it'll now just have metadata indexing from the SQL db rather than actually using it for storage as they were originally going to? For the next Win iteration at least.
Re:Windows' filesystem by FatRatBastard · 2003-09-05 01:41 · Score: 1

If they do there actually may be some prior art. I actually remember back in the 80s the guys at ICD (of SpartaDOS fame) were planning on implementing a DB engine into SpartaDOS X. It never made it in, but I know they did some work on the implementation.

Speaking of which, is there anyone who has a copy of SpartaDOS X that they'd like to unload? I'm trying to rebuild my 8bit Atari set up and it's the last bit that I need (just picked up the MIO).
Re:Windows' filesystem by uwbbjai · 2003-09-05 01:47 · Score: 1

If the filesystem uses a database backend, what will happen if the database screws up? Does it mean I'll lose my files? It'll also render my good'ol dos boot disk useless.
Re:Windows' filesystem by guile*fr · 2003-09-05 01:51 · Score: 1

nothing you wouldn't have suffered if your filesystem screwed up.
Re:Windows' filesystem by thermostat42 · 2003-09-05 01:53 · Score: 1

1995? Hmmm IBM's OS/400 native filesystem has been a database (now nominally a variant of DB2) for decades.

--
no comment
Re:Windows' filesystem by Pfhreakaz0id · 2003-09-05 01:56 · Score: 5, Informative

My guess is it will be something like the MSDE engine. So it will be limited. For those who don't know, MSDE is just an embedded, single-user version of the SQL engine. I worked on an app once that used it for laptop users who were offline from the network and would have a copy of the database to search and enter orders in, which would auto-replicate with the master SQL server when it got back on the LAN. It was pretty neat.

--
DO NOT DISTURB THE SE
Re:Windows' filesystem by ReelOddeeo · 2003-09-05 02:02 · Score: 2, Insightful

I think Longhorn will be the first Windows with a database filesystem. It will probably be based on SQL Server

First, about being first. Microsoft will have the First GUI. Microsoft will have the First internet web browser. Microsoft will have the first 32-bit clean API. Back in 1982, some big fat PC magazine (not Byte, but one with PC in the name) said that MS-DOS 2.0 would be the First OS to have a hierarchical filesystem! I think I could go on and on, but I trust my point is clear regarding Microsoft having the first database filesystem which they most certainly do not. (Can you say BeOS.)

I think Longhorn will be the first Windows with a database filesystem. It will probably be based on SQL Server

Second, Microsoft wants their database based fileserver to be reliable. So maybe it will be secretly based on MySQL. :-) Ooops, wrong license. I meant PostgreSQL.

--

Those who would give up liberty in exchange for security and DRM should switch to Microsoft Palladium!
Re:Windows' filesystem by andy+landy · 2003-09-05 02:29 · Score: 1

I've seen MS build other people's apps into their products, but never seen them do it to their own. Are they actually going to kill off a profit centre?
I've seen them do it. It's called Internet Explorer in Windows '98. Okay, so it's not exactly as if IE was a major revenue earner (They give it away), but they could still cripple anything they put into their consumer OS - Look at the Remote Desktop stuff in Win XP Pro, it's the same technology used in Windows 2000 Advanced Server to use as a Terminal Server, just licensed to be far less useful, and priced accordingly.

--
perl -e 'print "Just another Perl newbie\n";'
Re:Windows' filesystem by Overly+Critical+Guy · 2003-09-05 02:34 · Score: 1

WinFS will be much different from the limited BeOS filesystem. Read up on it.

--
"Sufferin' succotash."
Re:Windows' filesystem by Zocalo · 2003-09-05 02:46 · Score: 3, Insightful

Yeah, but as I mentioned in an earlier post, *all* filesystems are databases of some type, it's just a matter of context. Generally, when someone says a "database filesystem" today, what they actually mean is "a relational database driven, virtual filesystem providing an infinite variety of views onto a soup of metadata". I think I prefer the former and leaving the rest up to inference, but I'm sure that when these new products finally ship the marketroids are going to think otherwise.
I do deserve my wrists slapping though... I'd completely forgotten about BeOS! For shame!

--
UNIX? They're not even circumcised! Savages!
Re:Windows' filesystem by Thing+1 · 2003-09-05 02:48 · Score: 1

Can you say BeOS.

You are absolutely correct; however, I believe the grandparent post was saying that Windows has never had this functionality until Longhorn:
I think Longhorn will be the first Windows with a database filesystem.
(It's fun to selectively bold.)

--
I feel fantastic, and I'm still alive.
Re:Windows' filesystem by simonecaldana · 2003-09-05 03:17 · Score: 5, Funny

> The filesystem will be based on SQL Server 2003, but it won't be a fully functional version of SQL Server.

you mean it will be a standard version of SQL server? :)
Re:Windows' filesystem by duliano · 2003-09-05 03:30 · Score: 1

I think a SQL based FS would be fine for an application that you are mentioning but what about more write intensive based needs like graphics and video. Could you imagine downloading your home video over a Firewire and rendering it? I don't think that having the intermediate layer of an MSDE would work in those cases. I also wonder how this would impact our SAMBA shares on our linux boxes?
Re:Windows' filesystem by SatanicPuppy · 2003-09-05 03:38 · Score: 1

Sure, all file systems are database driven. But, in most cases, it's just a flat table with a couple of linear searches. Like a library.

The benefits of switching this system to one that is SQL based elude me. I love SQL. I love manipulating data. That's its strength; relating things that aren't, at first blush, related. How the hell is this going to help my harddrive? I'm not going to want queries viewing my programs in new and interesting ways.

The only use I can see for this is for some kind of networked, non-private, system, where you would want to be able to find things not normally available to you, on someone else's computer.

--
ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.
Re:Windows' filesystem by deque_alpha · 2003-09-05 03:38 · Score: 2, Informative

uhh... he said first Windows with db filesystem, not first OS. Read more carefully before you go on crusade.
Re:Windows' filesystem by Anonymous Coward · 2003-09-05 03:53 · Score: 0

The Longhorn file system isn't going to be implemented -as- a database. It will have a wrapper that is like a database that will interact with the filesystem.
Re:Windows' filesystem by CrowScape · 2003-09-05 04:10 · Score: 1

What? You mean Windows is not synonymous with OS?

--
common sense: noun
What those who are ignorant of the subject matter think; usually wrong.
Re:Windows' filesystem by capnjack41 · 2003-09-05 04:19 · Score: 1

I can't wait until my mom calls me complaining about how she can't email pictures to my sister "while the transaction server is in firehose mode"
Re:Windows' filesystem by Pfhreakaz0id · 2003-09-05 05:18 · Score: 1

I wondered too, about things like video capture and streaming that were pretty dependent on HD write speed.

SAMBA is an interesting one... not being much of 'nix guy... I would assume there would have to be an emulation layer for that in SAMBA, like it emulates all sorts of other stuff.

--
DO NOT DISTURB THE SE
Re:Windows' filesystem by TCM · 2003-09-05 05:31 · Score: 1

I think [...] will probably [...] Score: 5, Informative ...

--
Of course it runs NetBSD. BTC: 1NT7QvbetmANwaMzhpVL6
Re:Windows' filesystem by Anonymous Coward · 2003-09-05 05:36 · Score: 0

Xerox DocuShare software, which sounds very similar to WinFS, uses the MSDE engine for its file storage. Xerox analysts have stated a limit of approximately 400,000 objects.

This may be sufficient for Docushare in which you selectively add documents into the system (don't want Docushare to be filled with Temporary Internet Files, System Files, etc.), but I have Win2000 Servers with 700,000+ files.

For Home versions of Windows, WinFS probably will use MSDE. For Server versions, it will have to be able to handle an enormous/unlimited amount of files - aka SQL Server.

Xerox's (I am not affiliated) Palo Alto Research Center (PARC) is known "As the Birthplace of technologies such as laser printing, Ethernet, the graphical user interface..."
You can read more about Xerox in their Fact Book
Re:Windows' filesystem by TheCrazyFinn · 2003-09-05 05:45 · Score: 1

Try Palm.

This is BeFS. Which is now owned by Palm.

--
"You've got an invalid haircut" -Warren Zevon - Life'll Kill Ya
Re:Windows' filesystem by TheCrazyFinn · 2003-09-05 05:47 · Score: 1

And Newton Too (which is of course an Apple tech.)

--
"You've got an invalid haircut" -Warren Zevon - Life'll Kill Ya
Re:Windows' filesystem by KJACK98 · 2003-09-05 06:10 · Score: 1

This is actually very important technology that we in the open source community must bring to the light of day ASAP. The reason is that Microsoft will most likely try to patent these types of ideas for their next generation OS, so we must remain one step ahead of them and have plenty of prior art to protect ourselves. Going into the future, litigation will be the only weapon Microsoft will have against us.
Re:Windows' filesystem by pooh666 · 2003-09-05 07:56 · Score: 1

Can you say OS/400? Not even remotely close with BeOS.
Re:Windows' filesystem by ParisTG · 2003-09-05 09:46 · Score: 1

I do deserve my wrists slapping though... I'd completely forgotten about BeOS! For shame!
It's ok. So has everyone else :).
Re:Windows' filesystem by duliano · 2003-09-05 13:59 · Score: 2, Interesting

Also, although I agree with you regarding your filesystem to database comparison, running an RDBMS engine, even a stripped-down one, creates yet another abstraction layer between the user request and the hardware. I know from using SAP on high volume transaction systems that even Oracle's dbwr's (or even db_ioslaves) can get backlogged during periods of high write activity. I think I am going to take a "wait and see" approach with this type of filesystem.
Re:Windows' filesystem by duliano · 2003-09-05 14:05 · Score: 1

I wonder if you could use a SQL command like "Select * from (hd01 or hd02..) where filetype = '.xls'" :)

It could mean some additional job security :)
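
A rough sketch of what such a query could look like, using Python's built-in `sqlite3` as a stand-in. The `files` table, its columns, and the sample rows are all invented for illustration; nothing here reflects how WinFS or Storage actually lays out its schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical catalogue: one row per file, with the volume it lives on.
conn.execute("CREATE TABLE files (volume TEXT, name TEXT, filetype TEXT)")
conn.executemany("INSERT INTO files VALUES (?, ?, ?)", [
    ("hd01", "budget.xls", ".xls"),
    ("hd01", "notes.txt", ".txt"),
    ("hd02", "sales.xls", ".xls"),
])

# "All spreadsheets on either drive" as a single query.
rows = conn.execute(
    "SELECT volume, name FROM files "
    "WHERE volume IN ('hd01', 'hd02') AND filetype = ? "
    "ORDER BY volume, name",
    (".xls",),
).fetchall()
```

With the sample rows above, `rows` holds the two spreadsheets and skips the text file.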
Re:Windows' filesystem by cyclist1200 · 2003-09-07 13:39 · Score: 1

Actually I was thinking of the Developer's Edition! :)

Re:Ahead of the game. by Serapth · 2003-09-05 00:56 · Score: 1, Informative

What an incredibly dumb thing to say... this is exactly what Longhorn is going to do, and they announced it well over a year ago!

Re:Won't compile :( by Tirel · 2003-09-05 00:57 · Score: 1

storage-item.c:7:44: libpq-fe.h: No such file or directory
storage-item.c:8:28: libpq/libpq-fs.h: No such file or directory

sir, you dun have libpq!

so is everyone copying BeOS by Anonymous Coward · 2003-09-05 00:57 · Score: 4, Interesting

It's really sad that there was a perfectly good implementation of a database file system, but the company wasn't able to topple a monopoly and got squashed. MS really should have just bought BeOS and ported everything over to it. They could have just called it LongHorn and released it this year instead of waiting until 2006.

Re:so is everyone copying BeOS by Anonymous Coward · 2003-09-05 01:21 · Score: 0

Eugenia? Is that you?
Re:so is everyone copying BeOS by Anonymous Coward · 2003-09-05 01:28 · Score: 0

So you're saying that Palm could do this? IIRC Palm now owns Be
Re:so is everyone copying BeOS by diamondc · 2003-09-05 02:05 · Score: 1

do you really think BeOS was the first company/person to think of a filesystem as a database? They just wrote ONE implementation of that idea. So it's not copying (and how can Storage copy BeOS's filesystem when there is no src available for BeOS...?)

--
"I keep looking in the want-ads under 'revolutionary' but there don't seem to be any listings.. "
Re:so is everyone copying BeOS by Twylite · 2003-09-05 02:13 · Score: 4, Insightful
Summary of developments:
- BeOS has a good idea
- Microsoft announces a breakthrough in file system technology (around 1996), nothing happens
- newdocms announced on Slashdot in January 2003. Integrates with KDE, so no-one cares
- Microsoft announces WinFS plans for Longhorn. Slashdot decides that Microsoft sucks.
- Initial release of Haystack from MIT. Screenshot has XP interface so no-one gives a toss
- WinFS is reviewed, Slashdot has a flame war about file system layout, and concludes that MS sucks and a database file system is a stupid idea anyway and no-one wants one
- YADFS (Yet Another Database File System) announced calling itself "Storage". Integrates with GNOME. FLOSS community bows and worships the superiority, leadership and sheer innovativeness of the application.
--
i-name =twylite [http://public.xdi.org/=twylite], see idcommons.net
Re:so is everyone copying BeOS by Anonymous Coward · 2003-09-05 02:57 · Score: 0

Eugenia? Is that you?
Re:so is everyone copying BeOS by jilles · 2003-09-05 03:00 · Score: 4, Interesting

Be-os deserves some credit for merging meta data with a file system. However, a real database goes a few steps further in terms of the ability to query, to do replication, remote data access etc.

Essentially, the Be-OS filesystem, while much richer than other filesystems, is still a filesystem. This Storage thing is a full blown SQL database in the first place.

Essentially a normal filesystem is a hierarchical database whereas modern databases are relational or object databases. Relational databases have proven themselves for storing complex data over the past few years.

Some scenarios to give you a clue as to why the distinction matters: you can set up a database trigger to track changes in wordprocessor documents (i.e. automatically update some table with version info whenever you click save); you can involve external databases when doing a query on your own database (e.g. the imdb example in the Storage proposal, a tv guide); emulate a hierarchical file system by associating directory attributes with an object; emulate multiple orthogonal hierarchical filesystems; integrate security policies and encryption into the database (could also be used for DRM, I know this is a sensitive topic); make the objects themselves database records (e.g. contact information); use report generators and queries to dynamically generate complex documents (e.g. software documentation, financial overviews, etc.). Use special purpose software to browse specific types of information (e.g. a picture album, movie library or an old fashioned filebrowser).
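
The trigger scenario is easy to sketch. The example below uses Python's built-in `sqlite3` rather than PostgreSQL so that it runs anywhere, and the `documents` and `versions` tables are invented for illustration; a real Storage schema would look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (id INTEGER PRIMARY KEY, name TEXT, body TEXT);
CREATE TABLE versions (doc_id INTEGER, old_body TEXT,
                       saved_at TEXT DEFAULT CURRENT_TIMESTAMP);
-- Every time a document's body is updated ("save"), keep the previous text.
CREATE TRIGGER track_changes AFTER UPDATE OF body ON documents
BEGIN
    INSERT INTO versions (doc_id, old_body) VALUES (OLD.id, OLD.body);
END;
""")

conn.execute("INSERT INTO documents (name, body) VALUES ('letter', 'draft 1')")
conn.execute("UPDATE documents SET body = 'draft 2' WHERE name = 'letter'")
conn.execute("UPDATE documents SET body = 'draft 3' WHERE name = 'letter'")

# The application never touched the versions table; the trigger filled it.
history = [row[0] for row in conn.execute(
    "SELECT old_body FROM versions ORDER BY rowid")]
```

The word processor only ever issues the UPDATE; the version history is a side effect the database maintains on its own.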

--

Jilles
Re:so is everyone copying BeOS by the_truk_stop · 2003-09-05 04:15 · Score: 1

community bows and worships the superiority, leadership and sheer innovativeness of the application
Actually, while true in some cases, I think the community is more interested in the fact that there will actually be a product associated with this announcement. Has Microsoft come up with a truly ground-breaking database file system? No, but that doesn't stop them from talking about their vision(s).
Besides, I for one am not interested in having a database filesystem. I'd like application-layer functionality that's there as an option if I want it, but I'm against an all-out replacement to my exquisitely organized folders. ;)
Re:so is everyone copying BeOS by FooBarWidget · 2003-09-05 05:45 · Score: 2, Insightful

"WinFS is reviewed, Slashdot [slashdot.org] has a flame war about file system layout, and concludes that MS sucks and a database file system is a stupid idea anyway and no-one wants one"

Wrong! Slashdot concluded that WinFS will make computing soooo much easier that it will blow the competition out of the sky and that if Linux doesn't catch up fast it will die off.

Open your eyes people. Slashdot is not an anti-MS site anymore!!!
Re:so is everyone copying BeOS by Anonymous Coward · 2003-09-05 05:49 · Score: 0

Slashdot has become a Mac OS X advocacy site. This means it's anti-Microsoft AND anti-Linux.
Re:so is everyone copying BeOS by gears5665 · 2003-09-05 06:31 · Score: 1

Well, if I took the time to care about MS, I'd probably be anti-MS. But it really has no effect on my life, so maybe you're right. Oh, Wait, there is the news that I can't avoid hearing about another microsoft virus/trojan/worm/flaw terrorizing people and threatening to bring down the world's infrastructure, but I just shake my head and wonder what the fuss is about.

Being keen on databases, this SQL filesystem is interesting to me, though.
Re:so is everyone copying BeOS by Anonymous Coward · 2003-09-05 07:33 · Score: 0

Haystack is a copy of Lndbase.

Helen's dBase.
Re:so is everyone copying BeOS by leandrod · 2003-09-05 07:46 · Score: 2, Informative

> BeOS has a good idea

No!
When Codd created the relational model, there wasn't the current Unix filesystem idea... the relational model was always intended to store data, and files are data.
System R, SQL and DB2 prototype, was intended to be the basis for IBM FS.
IBM realised this in OS/400, which being proprietary hasn't the influence it deserves.
MS also wanted Jet to be the building block of its OSs since its inception, that is, sometime before MS Access release.
> newdocms announced on Slashdot in January 2003

Sorry, but NewDocMS is based on SQLite, which is typeless and but a library... simply not good enough to be attractive. Storage is based on PostgreSQL, the real thing, and aims high.

--
Leandro Guimarães Faria Corcete DUTRA
DA, DBA, SysAdmin, Data Modeller
GNU Project, Debian GNU/Lin
Re:so is everyone copying BeOS by the_greywolf · 2003-09-05 09:03 · Score: 1

actually, the BeFS driver had a built-in query system that was mostly based on boolean logic. i heard they considered implementing an SQL query system, but it was too complex for their timetable.

--
grey wolf
LET FORTRAN DIE!
Re:so is everyone copying BeOS by the_greywolf · 2003-09-05 09:05 · Score: 1

"Practical File System Design with the Be File System" by Dominic Giampaolo.

it has source.

--
grey wolf
LET FORTRAN DIE!
Re:so is everyone copying BeOS by Anonymous Coward · 2003-09-05 12:34 · Score: 0

Or maybe there's good reason for all of that and you're just missing everything because you're too focused on maintaining evidence of a bias.

The rest of us know there's a bias on /. and we can see past the trolls/flames. You don't have to point it out. Now go learn a little more about filesystems/databases and marketing to figure out why "Storage" is getting better critiques than the previous technologies you mentioned.

Finally something new to play with! by Trigun · 2003-09-05 00:58 · Score: 4, Interesting

Hopefully they plan on extending this to the networked environment, allowing multiple domain/realm file permissions, authentication, and encryption.

Anything to replace NIS and its bastard stepchildren.

Re:Finally something new to play with! by squiggleslash · 2003-09-05 01:48 · Score: 1

Well, there's already Netinfo, though it's always felt a little too like a Registry for my tastes. But it has virtually nothing in common with NIS, except achieving the same goals, and doing it better.

--
You are not alone. This is not normal. None of this is normal.
Re:Finally something new to play with! by evilviper · 2003-09-05 02:29 · Score: 1

Anything to replace NIS and its bastard stepchildren.

You consider kerberos a bastard step-child of NIS?

--
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
Re:Finally something new to play with! by Trigun · 2003-09-05 02:39 · Score: 1

No, Kerberos is a great tool, but the underlying filesystem does not allow for multiple network file permissions. Getting multiple realms working with each other is still a bit of a kludge.

What I want to see is a filesystem without permission mapping that is strongly tied to the authentication scheme, allowing for several realms/domains to be mapped to the local filesystem.

Replacement for ls by Anonymous Coward · 2003-09-05 00:58 · Score: 3, Funny

SELECT * FROM MY_FILES

Re:Replacement for ls by Mwongozi · 2003-09-05 01:50 · Score: 4, Funny

SELECT * FROM Users WHERE Clue > 0
0 rows returned
Ah, humour.
Re:Replacement for ls by sharkey · 2003-09-05 02:36 · Score: 4, Funny

SELECT * FROM MY_FILES
WHERE TYPE = 'video/x-mpeg'
AND TITLE IS LIKE ('*tit*, *blonde*)
ORDER BY PERV_RANK

--

--
"Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.

Hmm by Anonymous Coward · 2003-09-05 00:59 · Score: 0

Does this mean that Linux will finally get rid of its insanely cryptic and esoteric Filesystem Hierarchy Standard?

Let's hope!

Re:Hmm by UnuMondo · 2003-09-05 01:03 · Score: 4, Informative

No, because doing away with the root filesystem, user stuff in /home, config files in /etc, and so forth would break a number of Unix standards, and Linux's big advantage of being able to run many Unix apps (if you compile from source) would disappear. Storage will apparently be an interface to the existing real filesystem. Joe User won't know the difference.

--
GPG Key ID: 8C444E97 Fingerprint: E7BA D851 9714 8D97 C4F9 1777 8168 6913 8C44 4E97
Re:Hmm by orb_fan · 2003-09-05 01:22 · Score: 2, Interesting

Not true.

This isn't a replacement filesystem, just a document-storage system - you won't be able to access your documents easily from a shell.

It is a good starting point though, once working, the next step would be to compile it into the kernel, so that you can create Storage partitions, etc. and be able to do something like:
cd music by U2
Now that would be cool
Re:Hmm by lawpoop · 2003-09-05 01:53 · Score: 1

With a fully transparent RBD filesystem, you can emulate those stupidly-named unix root folders.
Making reference to /usr/sbin is just another query to the RBD filesystem, which will return the correct files when properly setup.
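
As a toy illustration of a path lookup being "just another query", here is a sketch using Python's built-in `sqlite3`. The `objects` table, its `directory` attribute, and the helper `list_directory` are all hypothetical; this only shows the shape of the idea, not any real implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Each object carries a 'directory' attribute instead of living in a real folder.
conn.execute("CREATE TABLE objects (name TEXT, directory TEXT)")
conn.executemany("INSERT INTO objects VALUES (?, ?)", [
    ("sshd", "/usr/sbin"),
    ("cron", "/usr/sbin"),
    ("ls", "/bin"),
])


def list_directory(path):
    # "ls path" becomes a query on the directory attribute.
    cur = conn.execute(
        "SELECT name FROM objects WHERE directory = ? ORDER BY name", (path,))
    return [row[0] for row in cur]
```

Here `list_directory("/usr/sbin")` returns the two daemons, and a friendlier view for desktop users could be a second attribute queried the same way.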

--
Computers are useless. They can only give you answers.
-- Pablo Picasso
Re:Hmm by Anonymous Coward · 2003-09-05 03:10 · Score: 0

But will Slammer derivatives have a good time with all of everyone's data!
Re:Hmm by BohKnower · 2003-09-05 04:48 · Score: 1

With this system you can use any Filesystem Hierarchy you want and still have POSIX compliance.
Although I think the Filesystem Hierarchy Standard is very useful for power users (short directory names, each file in its place) it's unreadable for desktop users.

I can see it now. by Sphere1952 · 2003-09-05 00:59 · Score: 1, Funny

Command: please run current microsoft worm.
>

--
Big Brother Bush is doubleplus ungood.

Re:I can see it now. by p3d0 · 2003-09-05 01:05 · Score: 1

Huh?

--
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
Re:I can see it now. by popeyethesailor · 2003-09-05 01:55 · Score: 1

Nope. Just enable multi-master replication ;)
Re:I can see it now. by caluml · 2003-09-05 02:35 · Score: 1

From your sig: My comments do not reflect the opinions of my employer. You never know, maybe they do.

--
Get your own free personal location tracker

Re:Won't compile :( by Anonymous Coward · 2003-09-05 00:59 · Score: 0

If you don't understand what it means you probably shouldn't be playing with alpha-level software. It's not finding the PostgreSQL include files.

Re:Ahead of the game. by Anonymous Coward · 2003-09-05 01:00 · Score: 0

well, in fact, Microsoft keeps announcing that feature as part of the 'next release' since what... early '92?

Screenshots ? by makapuf · 2003-09-05 01:01 · Score: 1

I know, I'm the first to look for screenshots, but antialiased filesystems are a bit too much, maybe.

I think it's a good thing to separate filesystem and document storage. Better than VFS: either it's a plain fs (simple == good for admin), or a sophisticated document retrieval architecture with richer semantics than a tree (or a graph, if you count links).

And then, do not let GUI apps show you the filesystem, only the storage system.

Obvious advantages by tsetem · 2003-09-05 01:01 · Score: 5, Interesting

There are lots of advantages to this kind of system, especially if interfaces are written for other OSes (Windows, Solaris, OSX)

Networked file system. No more NFS/SMB hacks. Everyone accesses the data in a common way, and can access the same data
Integrated mime-types. No more relying on file extensions and other hacks. The mime-type (and subsequent viewer) is right there in the query
Integrated version control. Have and keep a history of all of your files as they were managed and maintained through their lives, as well as a history of who modified them. If this aspect could be enhanced with branching & merging, would that make other CM systems (CVS, ClearCase) obsolete?

Of course it's only wishful thinking. I'd be nervous to see exactly how this integrates with other "legacy" applications. I can also see there being performance penalties, since you are now querying a database rather than looking at a simple file structure...

Re:Obvious advantages by dabadab · 2003-09-05 01:07 · Score: 5, Insightful

"Integrated mime-types. No more relying on file extensions and other hacks. The mime-type (and subsequent viewer) is right there in the query"

And how does that metadata get into the db? Oh, right, it will rely on file extensions and other hacks :)

--
Real life is overrated.
Re:Obvious advantages by azaroth42 · 2003-09-05 01:08 · Score: 4, Insightful

Obvious disadvantages:

SQL is slow compared to things like BerkeleyDB

We already have journaled file systems that can save metadata (though not user defined, I think)

Your database becomes corrupt, you lose everything.

Sorry, give me something that gives me back my data -fast-. If I want to do selects for files, I'll use locate and xargs.

--Azaroth
Re:Obvious advantages by Anonymous Coward · 2003-09-05 01:25 · Score: 1, Funny

You are obviously too stupid to understand the technology being covered in this thread. Please refrain from further posting.
Re:Obvious advantages by noselasd · 2003-09-05 01:28 · Score: 1

Sure, ext3 in the 2.5 Linux kernel allows you to store whatever metadata you need, and name it whatever you want. (Well, there might be size limits though. SGI XFS shouldn't have size limitations, iirc.)
Re:Obvious advantages by BigGerman · 2003-09-05 01:33 · Score: 1

what you are describing does not have to be backed by a database.
If the "common way" is described right (by a set of XML schema files, maybe?), it does not matter where the data is actually stored. Location transparency is not just network transparency, it is also independence from the actual storage mechanics.
What I envision is that _anything_ _anywhere_ can be described and accessed in some uniform way (MIME types, etc.) whether it is stored on a giant corporate Oracle database or a cellphone.
It is possible because the logic involved in manipulating the storage is the same (find, describe, map handler application to the MIME type, etc.) regardless of storage origin.
Has WWW / HTTP not taught us anything?
Hey, this is cool, I think I'll start a project like that.
Re:Obvious advantages by Anonymous Coward · 2003-09-05 01:36 · Score: 0

We already have journaled file systems that can save metadata (though not user defined, I think)

BeOS's BFS and AtheOS/Syllable AFS can do this. XFS on Linux supports this but there is no standard in the Linux VFS to support the extended functionality. There are probably others.
Re:Obvious advantages by Anonymous Coward · 2003-09-05 01:43 · Score: 3, Insightful

XFS limits user-defined extended attributes to 32 KB. Big, but not unlimited.

Also, extended attributes are fundamentally broken because they're stored in the inode. They do not survive, for example, a copy operation. Worse, they do not survive an open/save cycle in most cases, because most programs do not write to open files. Instead, they open a new file under a temporary name, write the data into it, close it, unlink the original file, then rename the temporary file to the original file's name. That way the data is safe if the program or computer fails during the save operation. This creates a new inode for the file data, however, which means extended attributes go bye-bye.

Extended attributes are not the answer. I'm not sure exactly what the question is, but I'm sure extattr are not the answer.
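
The save cycle described above (write a temporary file, then rename it over the original) can be sketched in a few lines. This is only an illustration of the general pattern, not any particular program's code; the helper names `atomic_save` and `inode_changes_on_save` are made up for the example. It shows that the file name ends up pointing at a different inode, which is why anything attached to the old inode is left behind unless the program copies it explicitly.

```python
import os
import tempfile


def atomic_save(path, data):
    """Save by writing a temporary file and renaming it over the original."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(data)
    # After this call the name refers to the temporary file's inode;
    # the original inode (and whatever was attached to it) is unlinked.
    os.replace(tmp_path, path)


def inode_changes_on_save():
    """Return True if a save through atomic_save() gives the name a new inode."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "letter.txt")
        with open(path, "wb") as f:
            f.write(b"first draft")
        before = os.stat(path).st_ino
        atomic_save(path, b"second draft")
        after = os.stat(path).st_ino
        return before != after
```

A program that wanted to preserve attributes across this kind of save would have to read them from the old file and set them again on the new one before the rename.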
Re:Obvious advantages by Fandarg · 2003-09-05 01:50 · Score: 1

And don't forget you have to vacuum postgresql databases regularly.

http://www.postgresql.org/docs/7.3/interactive/routine-vacuuming.html
Re:Obvious advantages by Anonymous Coward · 2003-09-05 02:01 · Score: 0

The problems you highlight are really issues with POSIX and the fact that, as yet, there is no standardised way to handle access to extended attribute data. If there were, there is no reason why the VFS and standard file and shell tools could not handle extended attributes in a more elegant manner.
Re:Obvious advantages by Sunracer · 2003-09-05 02:17 · Score: 2, Informative

But there are no file extensions to rely on in the first place. When a file is first created, it will be given a MIME type when it's put to the DB. And from there, the metadata will be transferred when retrieved/copied/whatever.

--
"The Internet, of course, is more than just a place to find pictures of people having sex with dogs." - Time Magazine
Re:Obvious advantages by Anonymous Coward · 2003-09-05 02:17 · Score: 0

What I want from desktop:

1) No open/save files (in the menu or elsewhere). That's right: no file dialogs. I do not care if the document data is in memory or on the disk. The system should care about it and save chunks of data to update the disk. No difference between an opened and a closed document.

2) No "START" menu and no starting applications. So how do you write a new letter? Click on the "Love letter" template (somewhere in your workspace hierarchy) and the application is started (you can right-click if you have/want a choice of "applications"). You have to give the document a name, so no "New document 4".

3) No closing documents. If you stop working on one, just remove it from your field of view; the system itself will cache (close) it and free resources if you do not work on that document for a while.

4) Task bar == history of last opened documents (URL locations, ...). Because from the user's point of view there is no difference between an opened document (running application) and a file just stored on the disk, the task bar represents the most recent documents of interest to the user. These are usually loaded in RAM (but the user does not need to know/care about it); the system takes care of effective caching.

5) No mail clients etc. Mail should behave and look as if it were integrated with your filesystem: you just see a directory "Mail" etc. The same files (like sent or received documents) can also be seen in other directories (i.e. symlinks), e.g. "Legal documents from my attorney" etc.

Roman
Re:Obvious advantages by Anonymous Coward · 2003-09-05 02:23 · Score: 0

And if you download a file with no metadata?
Re:Obvious advantages by evilviper · 2003-09-05 02:34 · Score: 1

And how does that metadata get into the db?

User input, greping the actual file, hashes and net lookups (like CDs), etc.

Oh, right, it will rely on file extensions and other hacks

No, it won't. At the very least, it doesn't need to, even if it is poorly implemented that way initially...

--
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
Re:Obvious advantages by ryanvm · 2003-09-05 02:41 · Score: 3, Interesting

You don't happen to be familiar with the Mac's old "fork" filesystem, do you? Metadata was kept in a separate file (or fork). It made downloading or transferring files with non-Macs a bitch.
Re:Obvious advantages by LWATCDR · 2003-09-05 02:43 · Score: 1

Could you have a rollback feature with this type of file system?
If you did not commit the changes to the file before you did a close, would it roll back to the old version?
That could be very handy and solve many issues. Posix would have to be extended to handle transactions at the file level but it could be done.
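
As a rough sketch of the commit/rollback idea, here is what it looks like when the "file" is just a row in a database. This uses SQLite through Python's standard sqlite3 module purely for illustration; the `files` table and the `demo_rollback` name are assumptions for the example, and nothing here reflects how Storage or a transactional POSIX extension would actually work.

```python
import sqlite3


def demo_rollback():
    """Edit a stored 'file', abandon the edit, and return what is left."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (name TEXT PRIMARY KEY, content TEXT)")
    conn.execute("INSERT INTO files VALUES ('notes.txt', 'old version')")
    conn.commit()  # the old version is now durable

    # An edit that is never committed...
    conn.execute(
        "UPDATE files SET content = 'half-finished edit' WHERE name = 'notes.txt'"
    )
    # ...and a "close" without commit, modelled here as an explicit rollback.
    conn.rollback()

    row = conn.execute(
        "SELECT content FROM files WHERE name = 'notes.txt'"
    ).fetchone()
    conn.close()
    return row[0]
```

The database does the work of keeping the old version around until the commit; the open question in the comment above is how to expose that through file-level calls.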

--
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
Re:Obvious advantages by laird · 2003-09-05 02:43 · Score: 4, Interesting

"And how does that meta data gets to the db? Oh, right, it will rely on file extensions and other hacks :)"

Like it has in MacOS for 20 years -- when applications write files, they tell the OS the filetype. The only time MacOS looks at extensions is if it's dealing with files transferred from operating systems that don't have relevant metadata. Unfortunately, that would be nearly every other OS. :-) But if Linux started transferring filetype metadata that would be a nice step in the right direction.

--
Enable 3D printed prosthetics!
Re:Obvious advantages by jeti · 2003-09-05 03:09 · Score: 3, Informative

You are aware that almost all internet protocols transfer a MIME-type with each file?
Re:Obvious advantages by golgotha007 · 2003-09-05 03:22 · Score: 1

try to look at it this way:

what Virtual Folders are to Evolution, is what Storage is to the filesystem.

no more organizing everything into folders; shit can lay all over the place.. physical placement doesn't matter anymore.
Re:Obvious advantages by nutshell42 · 2003-09-05 03:25 · Score: 1

You are aware that almost all useable browsers have an option to ignore the MIME-type and to rely on the extension because the MIME-type is rubbish as often as not?

--
Don't think of it as a flame---it's more like an argument that does 3d6 fire damage
Re:Obvious advantages by Christopher+B.+Brown · 2003-09-05 03:33 · Score: 1

And how does that metadata get into the db? Oh, right, it will rely on file extensions and other hacks :)

Perhaps in "Version 0.1."
It makes a lot of sense to put a fair bit of analysis into this, in the longer term, particularly if there is some intent to draw in some "full text search" information.
Thus, rather than assuming that .mp3 indicates MP3 data, it would be sensible to use /usr/bin/file to ascertain that the file signature is "MP3 file with ID3 version 2.3.0 tag", and then use an MP3-specific tool to pull out the ID3 tags and store them in the database.
Similarly, if file finds that a file is a PDF of some version number, it should store that version number. If it finds a GIF with some particular resolution, or TIF tags, it should store that.
Heuristics will go a long way here; if there is heuristic logic for handling MP3s, PDFs, common graphics formats, HTML, DocBook SGML/XML, documents generated using Word, Excel, PowerPoint, OpenOffice.org, that can draw a LOT of useful metadata into the database.
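
A minimal sketch of that kind of signature check, in the spirit of what file(1) does but not its actual rule set: it looks only at the leading bytes and recognises just three formats. The function name `sniff` and the returned dictionary keys are invented for this example.

```python
def sniff(data):
    """Guess a file type from its leading bytes; returns a dict of metadata."""
    if data[:3] == b"ID3" and len(data) >= 5:
        # An ID3v2 tag starts with "ID3", then a major and a revision byte.
        version = "2.%d.%d" % (data[3], data[4])
        return {"type": "audio/mpeg", "id3_version": version}
    if data[:5] == b"%PDF-" and len(data) >= 8:
        # A PDF header looks like "%PDF-1.4".
        return {
            "type": "application/pdf",
            "pdf_version": data[5:8].decode("ascii", "replace"),
        }
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return {"type": "image/gif", "gif_version": data[3:6].decode("ascii")}
    # Unknown signature: fall back to a generic type.
    return {"type": "application/octet-stream"}
```

Whatever this returns could be stored as columns next to the file, and a format-specific tool could then be run to pull out the richer tags.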

--
If you're not part of the solution, you're part of the precipitate.
Re:Obvious advantages by Christopher+B.+Brown · 2003-09-05 03:38 · Score: 1
- SQL is only slower if you have written your applications in a manner that uses it badly.
  That means that you have written an application which submits vast numbers of "network model" (not unlike CODASYL) queries to navigate through the data.
- Journalling of filesystems is no panacea; it is about as easy to lose a filesystem as it is to lose a database.
  If Mongol hordes ravage your server room, putting sword to the disk arrays, it makes no difference whether you are structuring your data using filesystems or using databases.
- If your filesystem becomes corrupt, you also lose everything.
  The only answer to this involves doing BACKUPS.
--
If you're not part of the solution, you're part of the precipitate.
Re:Obvious advantages by Anonymous Coward · 2003-09-05 03:49 · Score: 0

Now there's an autovacuum daemon on the way. It works now, but it's not yet integrated into the main source tree.

http://developer.postgresql.org/docs/pgsql/contrib/pg_autovacuum/
Re:Obvious advantages by tetsuji · 2003-09-05 03:53 · Score: 1

Lots of different companies have tried this. Heck, Oracle has their OFS database-based filesystem, and it's just not all that exciting.
The most interesting thing I've seen in terms of filesystem technology lately is Reiser 4, which has been discussed here before. It supports extensible metadata and filters but doesn't bog you down like a database can or break backwards compatibility.

--
nuke the moon
Re:Obvious advantages by evilviper · 2003-09-05 04:02 · Score: 1

WTF? Either you replied to the wrong comment, or I just completely missed your point...

--
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
Re:Obvious advantages by Anonymous Coward · 2003-09-05 04:23 · Score: 0

Almost all Internet protocols? Let's see:

HTTP: Yes
MIME-encoded attachments to email: Yes
UUEncoded attachments to email: No
SMTP: No
POP3: No
IMAP4: No
FTP: No
TCP: No
UDP: No
IP: No
IRC: No

None of this addresses the /real/ problem, which is that, in order to be transmitted with a MIME-type, you need some way of having the MIME-type to begin with.

Take Apache as an example. It supplies the MIME-type with every file served. By default, Apache discovers the MIME-type by looking at the file extension. That's right, it uses file extensions. How else can it supply the MIME-type? Well things like PHP have to supply the MIME-type to Apache. You know how PHP gets the MIME-type? By default, it's hard-coded as text/html in php.ini. What other options are there? Oh yes, Apache can use another way of determining the MIME-type. It can use "MIME magic" - it guesses from the first few bytes of the file. Hack.
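
To make the extension-based lookup concrete, here is the same idea using Python's standard mimetypes module rather than Apache itself. It is only an analogy for the default behaviour described above; the wrapper name `type_from_extension` is made up for the example.

```python
import mimetypes


def type_from_extension(filename):
    """Return the MIME type implied by the file name alone, or None."""
    # guess_type() never opens the file; it only looks at the name.
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type
```

A file named index.html comes back as text/html whatever it actually contains, and a file with no extension gets no answer at all, which is the weakness being pointed out.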
Re:Obvious advantages by stefanlasiewski · 2003-09-05 04:27 · Score: 1

File extensions are also frequently rubbish, especially in the Unix world when there are frequently no file extensions.

--
"Can of worms? The can is open... the worms are everywhere."
Re:Obvious advantages by Anonymous Coward · 2003-09-05 05:09 · Score: 0

You are aware that almost all internet protocols transfer a MIME-type with each file?

You are aware that these MIME types represent a "best guess" by the sending application (web server or mail client), most often based on... the extension?

And thus the guessing problem is just moved to another host, not solved... Sadly, even "magic"-based MIME type guessing won't "fix" the problem to a level that allows normal users to never have to call for help in figuring out a filetype.
Re:Obvious advantages by mark_space2001 · 2003-09-05 05:23 · Score: 1

Weeeeell, not quite. The old Mac HFS kept "resources" in the resource fork, and data (regular binary streams, basically) in the data fork. Resources were things like menus, code chunks, icons, bit maps, etc. All the stuff that Windows does now but DOS didn't do then.
Resource forks are an ugly hack. Look at Apple's current implementation of a filesystem on Unix. If you use the cp command to copy a file, you only get the data fork. You have to use the "ditto" command to actually copy the whole file. Ugh.
Alan Cox has a note about this on the kernel web site. He says that if anyone wants to implement "multiple streams" on the Linux file system, that's fine, just do it at the user level (i.e., make a library that apps can link to). Leave the file as one single stream to applications like cp, ftp, etc. Can you imagine trying to explain to a nontechnical person why copying a file through, say, an HTTP browser only gets half the file because HTTP doesn't know what the "ditto" command does? The mind boggles.
Anyway, my point when I started this was going to be that file types in HFS were never stored in the resource fork. They're 4 byte strings that can have mnemonic names like 'TEXT', 'PICT' or 'GIFf' and they were always stored in the HFS equivalent of an inode. That way when doing a directory listing, the file type was right there with the file name and the OS could get right at it. If the file type was stored elsewhere, the OS would have had to fetch an additional disk block just to look at the file type. Since the original Mac shipped with slow 3.5 inch drives, that would have been bad. (No HD's back then, remember?)
The original Apple DOS did the same thing, storing the file type with the file name for easy lookup. In a lot of ways HFS built on DOS and improved it greatly, but some things (like resource forks) just were bad ideas.
There are a couple of unused, reserved fields in the ext2 and ext3 inodes right now, one 2 bytes, and one 4 bytes. If it were me, I'd hoark those for file type info, and point them at a table of MIME types in /etc/file-mime-types.conf. But that's just me.
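
A small sketch of what a 4-byte type code in a fixed-size field might look like: the code is packed into one 32-bit integer (the size of the larger spare field mentioned above) and looked up in a table. Everything here is hypothetical, including the table contents, the MIME names chosen for the old Mac codes, and the function names; it is not the HFS on-disk format or any real ext2/ext3 layout.

```python
import struct

# Hypothetical lookup table, standing in for a file such as
# /etc/file-mime-types.conf.
TYPE_TABLE = {
    b"TEXT": "text/plain",
    b"PICT": "image/x-pict",
    b"GIFf": "image/gif",
}


def pack_type(code):
    """Pack a 4-byte type code into one unsigned 32-bit integer."""
    if len(code) != 4:
        raise ValueError("type code must be exactly 4 bytes")
    return struct.unpack(">I", code)[0]


def unpack_type(value):
    """Turn the 32-bit integer back into its 4-byte code."""
    return struct.pack(">I", value)


def mime_for(value):
    """Look up the MIME type for a packed code, with a generic fallback."""
    return TYPE_TABLE.get(unpack_type(value), "application/octet-stream")
```

Because the value is a fixed four bytes, it can sit next to the file name in a directory entry or inode and be read without touching another disk block.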
Re:Obvious advantages by Anonymous Coward · 2003-09-05 05:54 · Score: 0

Or the OS/2 HPFS that allowed you to store extended attributes with a file. These also didn't transfer well when sharing data with non-OS/2 systems.
SOM 3 was supposedly going to allow developers to access files by querying the extended attribute data much in the same way that this "new" filesystem will.
Extended attributes also suffered from the problem that someone else brought up here. Who's going to take the time to update them?
Re:Obvious advantages by Mr.+McGibby · 2003-09-05 06:18 · Score: 1

Which is about 5% of the whole world, which is made up mostly of Windows users. So not as frequently as you seem to think.

And guess what, you're wrong about file extensions in UNIX land. People use them, and they use them often. They may not follow the 3-character Windows standard, but they do use them, and often.

--
Mad Software: Rantings on Developing So
Re:Obvious advantages by curious.corn · 2003-09-05 07:16 · Score: 1

XML. Attaching an object to an email would require a serializing function to encapsulate it inside an XML wrapper. Interoperating with the rest of the world would mean using a MIME type around the plain object to keep Outlook users happy (or appending the correct extension to the file). Obviously an FTP downloader should use either the MIME type or the extension to match it to one in the local store, but it could be done.

--
I wonder who is the instigator of all the stupid things I do - Altan
Re:Obvious advantages by Alan · 2003-09-05 07:31 · Score: 1

Point 1 sounds a lot like how the OS/2 desktop tried to do it. Good ideas though, I'm not sure about all of it, but this sort of thinking is what innovation is made of. Feel free to submit patches to the gnome/kde maintainers :)
Re:Obvious advantages by smallpaul · 2003-09-05 07:39 · Score: 3, Insightful

The file should be self-describing. It should have a header saying its type. You can never trust intermediary software to properly keep data and metadata together. The problem isn't just other operating systems. It is file formats like ZIP and protocols like FTP. Plus there is a problem that the file type a user gives a file on their computer may be just a means of triggering a bit of software (e.g. change a JSP file to HTML so it launches your HTML editor). But the intrinsic type of the file should not be corrupted by these user preferences.
Re:Obvious advantages by Chelloveck · 2003-09-05 08:33 · Score: 2, Informative

Amen, brother! You just can't rely on metadata stored separately from the file itself. If I ZIP a file, or transfer it via XMODEM, or copy it onto an obsolete FAT-formatted floppy, that file should retain all it needs to be usable.

Some metadata is bound to be lost, such as its modification time or even its filename. If you can afford to lose this sort of metadata, then go ahead and store it separately. But if the file can't afford to lose this stuff you'd better make sure it's part of the data, not just the metadata. It'd better transfer intact when I send the file serially or copy it to and from a legacy filesystem.

--
Chelloveck
I give up on debugging. From now on, SIGSEGV is a feature.
Re:Obvious advantages by Tony-A · 2003-09-05 09:09 · Score: 1

By default, Apache discovers the MIME-type by looking at the file extension. That's right, it uses file extensions
And with very little configuration Apache uses different sets of file extensions in different contexts (folders).

Hack.
Precisely. Some hacks are better than others, but short of the metaphysical equivalent of the philosopher's stone, methinks that's all there is to be had.
Re:Obvious advantages by DonGar · 2003-09-05 09:28 · Score: 1

Other people are commenting on the fact that this isn't always true, and that it's usually based on file extensions on the other end.

And it's true that most applications only pay lip service to the MIME/Types.

However, the hardest part is already done. The MIME/Types ARE there, and not following them is generally considered 'the wrong thing', even if people don't get very excited about it.

That means that the infrastructure is there, just not polished. It's a LOT easier to convince people to polish up and properly support something they already have than it is to add new infrastructure.

--
plus-good, double-plus-good
Re:Obvious advantages by slycrel · 2003-09-05 12:11 · Score: 1

Unfortunately MacOS X takes a step backwards here and actually DOES interface with the extensions much more than it should. It's generally much better than the all or nothing approach windows takes, but still is lacking. There was a general outcry when OS X was first released about this sort of thing, but (per usual) it didn't change much of anything. Someone made a huge interesting critique/comparison, let me see if I can find the link....

Hmm. Well, here are a couple of OK ones, but not what I was looking for. Gotta jet though, so here they are:

http://siracusa.home.mindspring.com/john/articles/proposal.html
http://www.latext.com/pm/comments/22_0_1_0_C/
Re:Obvious advantages by Tablizer · 2003-09-05 16:34 · Score: 1

The file should be self-describing. It should have a header saying its type.

But that would require every file writer and reader to follow a certain convention. I am not sure the burden belongs to such applications. If you can enforce such a convention, then you can enforce the alternatives just as well.

It is a matter of sticking with SOME convention, not so much where the meta info resides.

A "separate area" for metadata is a separate area regardless of how or where it goes.

--
Table-ized A.I.
Re:Obvious advantages by wampus · 2003-09-05 17:34 · Score: 1

The placement is only physical because you are so used to thinking of it as such. Do you know where those bits are really sitting on the platter?
This is a completely different approach to data storage. Rather than having things laid out with set "locations" like this:

/
/home/
/home/user
/home/user/music/mp3/
/home/user/music/mp3/some_band

you set the boundaries on the fly:

Music by some_band:
Blah Blah by Some_Band (mp3)
Foo Bar by Some_Band (ogg vorbis)
Other Track by Some_Band (Windows Media)

Things I downloaded from gnutella last night:
Blah Blah by Some_Band (mp3)
Irritating Song by Flavor of the Week (mp3)
Star Wars Kid Video Number 1827 (mpeg 4)

Just because you divide it up one way doesn't mean you have to divide it up that way forever. There is no inherent grouping of files.

I'm sure that the soup has to stop somewhere, though. I MUCH prefer having my standard filesystem in case I need to recover the system. I don't want a database hiccup to lose half the system libraries.

This type of system would also make it hard to use the command line. How do you set a path? Multiple files of the same name? I've gotten really fond of just popping up a terminal and using a for loop to batch things together.

I guess something like "for i in "music files i tagged" ; do decode $i ; done" would work, but it would seem that a GUI is really a necessity here.
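
The "boundaries on the fly" idea above amounts to running queries over per-file metadata. Here is a small sketch using SQLite from Python's standard library. The schema, the `source` column and the function names are assumptions made up for this example (the sample rows reuse the listings above, with invented sources), and this is not how Storage itself is implemented.

```python
import sqlite3

SAMPLE_FILES = [
    ("Blah Blah", "Some_Band", "mp3", "gnutella"),
    ("Foo Bar", "Some_Band", "ogg vorbis", "cd rip"),
    ("Other Track", "Some_Band", "Windows Media", "cd rip"),
    ("Irritating Song", "Flavor of the Week", "mp3", "gnutella"),
    ("Star Wars Kid Video Number 1827", "unknown", "mpeg 4", "gnutella"),
]


def build_store():
    """Create an in-memory metadata table holding the sample files."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files (id INTEGER PRIMARY KEY, title TEXT,"
        " artist TEXT, format TEXT, source TEXT)"
    )
    conn.executemany(
        "INSERT INTO files (title, artist, format, source) VALUES (?, ?, ?, ?)",
        SAMPLE_FILES,
    )
    conn.commit()
    return conn


def by_artist(conn, artist):
    """Virtual folder: every title by one artist, whatever the format."""
    rows = conn.execute(
        "SELECT title FROM files WHERE artist = ? ORDER BY id", (artist,)
    )
    return [title for (title,) in rows]


def by_source(conn, source):
    """Virtual folder: every title that came from one source."""
    rows = conn.execute(
        "SELECT title FROM files WHERE source = ? ORDER BY id", (source,)
    )
    return [title for (title,) in rows]
```

The same file shows up in both "folders" without being copied or linked, and a command-line loop could iterate over the result of either query instead of over a path.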
Re:Obvious advantages by fieldmethods · 2003-09-06 10:00 · Score: 1

And now some underinformed braindumping, feel free to rip me a new one, perhaps some light will shine in through the orifice:
Isn't it the case that distinct file types are pretty easy to distinguish automatically? After all, the "file" command works.
When you try to distinguish character sets, for instance, you build statistical models of the sets you wish to distinguish, and then compare the unknown text to those.
Couldn't the same be done for any file type? When the application writes the file, the "filesystem" (or whatever it should be called) stamps that new file with the model that it matched -- if it looked like an mpeg, it's an mpeg. If it can't figure it out, it tells you, and you would have to take the step of telling it yourself.
But wouldn't that be better than specifying filetypes in the millions of tiny drop-down menus like we do now?
Re:Obvious advantages by laird · 2003-09-08 06:00 · Score: 1

While I agree with your pragmatic point that you should keep info that you need in the contents of files rather than just in metadata, because metadata-oblivious operating systems could lose it, it would be a shame if that issue kept operating systems from implementing richer metadata. Luckily, given that MS is adding 'rich metadata' to their next OS, perhaps we'll evolve past the limitations of the 70's... :-)

--
Enable 3D printed prosthetics!

Re:Mexican prostitutes by Anonymous Coward · 2003-09-05 01:01 · Score: 0

But can they solve this

oh wait... by Anonymous Coward · 2003-09-05 01:02 · Score: 0

... this sounds familiar. maybe newdocms ??

Re:Ahead of the game. by kubla2000 · 2003-09-05 01:02 · Score: 2, Insightful

Yeah, and as Longdong gets pushed back and delayed and delayed and pushed back and postponed and delayed, it'll be last to market but microsoft will still have been the first to announce it. I guess that's more innovative than they've been in the past when they'd simply wait for someone to do something interesting before buying them out.

It's not enough to say. One has to do. Microsoft has proved many times over that it often makes grand announcements only to provide something far more watered down by the time they get to market.

We'll see what their DB-based file system really is when (and if) it gets here.

How does the metadata get into the database? by farnz · 2003-09-05 01:02 · Score: 5, Insightful

My major concern with all these database type filesystems is that the gains are always shown as things like, "Find all films directed by Steven Spielberg", and yet this is not information that the computer can necessarily gather for itself.

Outside of a work environment, I've rarely encountered anyone who keeps consistent, useful filenames, let alone metadata indexes; it seems to me that people will skimp on the metadata, and thus limit the usefulness to metadata that the computer can collect automatically ("All movies that last under 90 minutes"). It's like CD collections, or books; libraries have nicely catalogued and ordered collections. Private individuals don't; they have roughly ordered collections on the shelf, and don't bother keeping them in any better order. I suspect the same will happen with these metadata systems; people won't do the work needed to make them truly useful.

--
I appear to have a blog. Odd.

Re:How does the metadata get into the database? by Trigun · 2003-09-05 01:07 · Score: 1

then we just make certain that all the work is done up front. Sure you'll have some of the metadata stating that your files were 3nc0d3d by N3o, but if you don't like that, you can change it. White-market files will have the proper metadata.
Re:How does the metadata get into the database? by henbane · 2003-09-05 01:07 · Score: 4, Funny

"It's like CD collections, or books; libraries have nicely catalogued and ordered collections. Private individuals don't; they have roughly ordered collections on the shelf, and don't bother keeping them in any better order"
Call yourself a geek? How can you possibly put something on a shelf without first checking to see that it's in a proper place, observing the subtle cross-reference system that backs up the obvious system? Man, I hate it when people move my stuff.
Re:How does the metadata get into the database? by rusty0101 · 2003-09-05 01:16 · Score: 1

you mean you don't order the books by the color of their binding? How unstylish of you.

-Rusty

--
You never know...
Re:How does the metadata get into the database? by Tom · 2003-09-05 01:21 · Score: 4, Insightful

That's why we have community products. For music, CDDB works pretty good and is a working real-life example.

Other metadata is automatically inserted. When you install OpenOffice, it asks for your name and inserts that as the author into any new documents you create, for example.

Sure, the metadata on my personal machine will never be comparable to what a library could do. But it doesn't have to be - it has to be useful for me, not - like the library - for thousands of people with very different interests and approaches.

--
Assorted stuff I do sometimes: Lemuria.org
Re:How does the metadata get into the database? by neillewis · 2003-09-05 01:25 · Score: 1

My collection of MP3s never had any data beyond a vague [Artist - Title.mp3] file naming convention, but with ID tags, CDDB & iTunes decent gui to store and catalog them, I'm gradually getting there, and while I don't have to see the file I can always get to them if needs be. I shudder to recall some of the dodgy alternatives like RealJukebox. So my requirements would be:

A standard crossplatform way to transfer metadata.
(With XML becoming the default data format, this could be either held in a file or findable/queriable in a standard way from the source site/email/Server.)

An intelligent ui which stops maintenance becoming a chore.
Re:How does the metadata get into the database? by noselasd · 2003-09-05 01:25 · Score: 2, Insightful

Right. Individuals hate having their stuff messed with. So files are kept in a mess, but _you_ know where they are. At least until you use Storage, which will mess them up.
Re:How does the metadata get into the database? by Phoenix_SEC · 2003-09-05 01:28 · Score: 1

Unfortunately, the real trick is to get application specific data.

When cataloging information, you need to have things specific to the type of file and/or arena it is used.

For instance, the catalog for a movie file would need to contain: Film Title, Actors/Actresses, Director, Production Companies, Rendering Companies (if applicable), FX Companies... While none of that applies to, say, a Word document.

I work for a digital storage company, and we run into the same issues. Our storage system keeps track of no (as in zero) metadata. Then, it is up to applications to track their own (e.g., the storage system just sits there; if you want to use medical images, there is a medical-specific database that tracks data for that type of image, the video server has its own database with its information, and they all relate back to the main server).

Well, late for a meeting, would elaborate more..

Phoenix
Re:How does the metadata get into the database? by zero_offset · 2003-09-05 01:30 · Score: 1

Outside of a work environment, I've rarely encountered anyone who keeps consistent, useful filenames, let alone metadata indexes
And if you end up in a large-enough work environment, the PHBs will implement fantastic new cost-saving processes like CMM and RUP which will then require you to use completely obtuse filenames like F0285913SDD.doc -- but wait, the obtuseness is standardized... oooo! aaahhh!

--
Slashdot quality declines as the number of hot grits posts decreases. - Provolt's Law, Apr-09-2005
Re:How does the metadata get into the database? by Anonymous Coward · 2003-09-05 01:35 · Score: 4, Insightful

There are two kinds of metadata: intrinsic and extrinsic. Intrinsic metadata, as the name would imply, is information that's contained entirely within the file.

Some intrinsic metadata can be extracted with automation. For example, it's pretty trivial to examine a TIFF and tell you that it's X by Y pixels in a given color space. It's harder, but still possible, to tell you that the TIFF is predominantly red and green. It's impossible for the computer to tell you that it's a picture of a barn.
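
As an illustration of the "pretty trivial" end of that range, here is a sketch that reads pixel dimensions straight out of an image header. It uses GIF rather than TIFF, purely because the GIF header is simpler to show in a few lines; the function name `gif_dimensions` is made up for the example.

```python
import struct


def gif_dimensions(data):
    """Return (width, height) read from the first ten bytes of a GIF file."""
    if len(data) < 10 or data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF header")
    # The logical screen size follows the 6-byte signature as two
    # little-endian unsigned 16-bit integers.
    width, height = struct.unpack("<HH", data[6:10])
    return width, height
```

Telling that the same image is "predominantly red and green" would mean decoding every pixel, and telling that it shows a barn is beyond anything in the header.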

The same is true of extrinsic metadata: some of it can be extracted automatically, but not much. An example of extrinsic metadata would be a relationship. The computer can tell you that main.c and somefunction.c are both C language source code files, but it may or may not be able to tell you that they're both part of the same program. If the two files are explicitly related to each other through a makefile or some such, then the computer can know that they're related. But consider the collection of JPEGs I just copied to my home directory from my digital camera. A dozen of those pictures were taken in Fiji. The computer cannot know this unless I tell it, nor can it know which pictures were taken in Fiji and which were taken in my back yard last Tuesday. Thanks to my camera, the computer can know what aperture and focal length were used for each picture. In theory, if my camera had a GPS receiver in it, the computer might even be able to tell me where, on earth, the camera was when it snapped a given photo. But these are just automatic methods of telling the computer about the pictures. They're conceptually no different from sitting down and typing the information in. The point remains that the computer can figure out some things on its own, but it cannot know most things unless it is told.

You don't have to strain your imagination to think about this stuff. Consider your MP3 collection. Your computer can tell you that a given MP3 is 6:15 long, and that it's 192 kbps, and that it's stereo. It can't know that it's "Treefingers" by Radiohead unless somebody tells it first.

So you're basically right: automatically extracted metadata is marginally useful, but the really useful stuff has to be manually entered. And generally speaking, even in business environments, that sort of information simply never makes it into the database. It exists exclusively in people's heads.

That's the biggest challenge of digital asset management--which is, incidentally, essentially what we're talking about here. The biggest challenge is how to take information that people have in their heads and store it in some structured, persistent form. That form might be a three-ring binder or a card catalog or an Oracle database; the challenge is the same even if the technology is different.

Bottom line: this technology is really neat, and has limited applications in which it's very useful. But it is not generally useful, nor does it have widespread applications.
Re:How does the metadata get into the database? by kfg · 2003-09-05 01:38 · Score: 1

"It's like CD collections, or books; libraries have nicely catalogued and ordered collections. Private individuals don't; they have roughly ordered collections on the shelf, and don't bother keeping them in any better order."

Private individuals don't keep collections of hundreds of thousands of books that at any given time could be in virtually anyone's hands.

Private individuals have a thousand or two books with which they have a personal "relationship" and a database of what book can be found where in a data storage device called a "brain."

I like my brain. I'm told it's a very good brain ( which might color my opinion on this matter). I expect I'll keep using my brain for what my brain is best at and only rely on my computer for those things it does better than my brain. This might even serve to keep my brain running at something approaching optimum performance for longer than is considered the norm.

And to tell you the truth I really don't see this sort of thing being of any use to the sort of person who already just dumps every frickin' file they have on their desktop.

"Ok, now all you have to do is build your relationships."

Yeah, right Sparky. Blow me.

You either organize your files or you don't and if you don't. . .you don't.

KFG
Re:How does the metadata get into the database? by squiggleslash · 2003-09-05 01:53 · Score: 1

I completely agree.
Right now the only advantage I can think of is that this system will enable the sneaking in of file metadata (not content metadata, I just mean MIME type, opener, printer, etc.) through the back door, something that the GNOME, KDE, and Linux groups have been resisting with a passion.
What we could really do with is an indexing service that every program that creates files subscribes to. A fast, usable, "search engine" would be pretty handy and I can see how it would end up being used in preference to manually navigating through a hierarchy. A SQL based thing, OTOH, with an attempt to make every query 100% accurate, seems doomed to me.

--
You are not alone. This is not normal. None of this is normal.
Re:How does the metadata get into the database? by selderrr · 2003-09-05 01:54 · Score: 4, Insightful

Very true and insightful !

Another argument to prove you right is the "rate this song" option in iTunes. With that feature, one can assign 1 to 5 stars to a song so that later, you can quickly select your favourites. Such a system is flawed in 2 ways :

- I have yet to encounter anyone who uses it exhaustively. Most folks rate a few dozen songs. I have a library with 9000 mp3s and sure as hell I'm not going to spend a whole week rating them.
- I have yet to encounter someone who uses it consistently. Today I might consider a Chris Isaak song worth 4 stars since I'm in a depressed mood and it's raining outside. Tomorrow the weather might be beautiful and I mod him down to 2 stars cause he's a bloody negative wanker.

This is of course iTunes specific, but it shows that the assignment of metadata is far far far more complex than the methods to search/organize the stuff, which is what the "Storage" software above is about.

As an extra complication : consider that my metadata might not match someone else's. For instance if I were to label a mail "message from my brother", the same content would be "message from my son" for my father !

The fact that metadata based filesystems are not on our desktop is perhaps due more to the fact that it's not a valid solution for data on desktop computers. Maybe, just maybe this is not due to MS squashing Be, as someone else was karmawhoring above.

--
When will I end this grieving ? When will my future begin ?
Re:How does the metadata get into the database? by Anonymous Coward · 2003-09-05 02:09 · Score: 0

This functionality already exists in Windows 2000 and XP. Just turn on the Indexing Service.
Re:How does the metadata get into the database? by evilviper · 2003-09-05 02:27 · Score: 1

this is not information that the computer can necessarily gather for itself.

What's the big deal? Every time you save a file you type in a filename like "Lord of the Rings.divx", and with MP3s, most people either type-in an ID3 tag, or have it autofetched from the net.

Having a little field that says "Director" which you may or may not type into, isn't the end of the world. At the very least, you would only have to give it the filename, and it could find the details via imdb.org.

Outside of a work environment, I've rarely encountered anyone who keeps consistent, useful filenames, let alone metadata indexes

That's mainly because there's really no easy way to do it. It's a hell of a lot of work to type in a long filename, with details, while a "smart" system like this would make it much easier, and much less work. This metadata would be input more easily than a filename, so there is a big difference. At the very least, they can just search via data or filetype and still find everything... It's not lost.

Private individuals don't; they have roughly ordered collections on the shelf, and don't bother keeping them in any better order.

That thing sitting in front of you is called a computer. It can be used to replace many tedious jobs that humans previously had to do. It's not out of the question for your computer to replace librarians for the most part.

As I mentioned, computers already look-up the data for CDs, and transfer that into MP3 ID3 tags without you having to think about it for a second.

people won't do the work needed to make them truly useful.

The great thing about them, is that it isn't nearly as much work as our current filesystems. After all, what would be the point of this if it was more work? It will be less work once it is working at all.

--
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
Re:How does the metadata get into the database? by Nightpaw · 2003-09-05 02:30 · Score: 1

That is a very premium mode of arrangement.
Re:How does the metadata get into the database? by Anonymous Coward · 2003-09-05 02:35 · Score: 0

MP3s aren't a problem; I can write a simple Perl script which will find all songs by artist X. But the original poster was talking about movies...
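A rough Python equivalent of such a script, as a sketch only. It assumes the files carry ID3v1 tags (the fixed 128-byte block at the very end of the file) and ignores ID3v2 tags, which sit at the start of the file. The function names are hypothetical.

```python
import os

def read_id3v1(path):
    """Return a dict of ID3v1 fields, or None if the file has no ID3v1 tag."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() < 128:
            return None
        f.seek(-128, os.SEEK_END)
        tag = f.read(128)
    if tag[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are fixed-width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
    }

def songs_by_artist(root, artist):
    """Walk root and return the sorted paths of .mp3 files tagged with artist."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(".mp3"):
                path = os.path.join(dirpath, name)
                tag = read_id3v1(path)
                if tag and tag["artist"].lower() == artist.lower():
                    hits.append(path)
    return sorted(hits)
```

This only works because somebody typed the artist into the tag at some point. A movie file usually has no comparable block to read.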
Re:How does the metadata get into the database? by Chemicalscum · 2003-09-05 02:46 · Score: 1

"Consider your MP3 collection. Your computer can tell you that a given MP3 is 6:15 long, and that it's 192 kbps, and that it's stereo. It can't know that it's "Treefingers" by Radiohead unless somebody tells it first."
The MP3 Format contains metadata for artist and title. Next time you play an MP3 in XMMS look at the playlist - artist and title information is displayed there (even when there is no current connection to the internet).
I guess Storage accesses this metadata.
Re:How does the metadata get into the database? by ryanvm · 2003-09-05 02:54 · Score: 2, Insightful

Good points, but you're just talking about subjective metadata. The usefulness of which is certainly debatable. But what about factual metadata? Consider a downloaded movie that may have fields like: Year, Director, Rating, Genre, Studio, Cast, etc.

Granted the end user is not going to be likely to maintain this information, but that doesn't really matter. The end user is also not likely the source of the material in the first place. I contend that metadata is more useful for material that the user has downloaded or purchased. That data SHOULD have accurate metadata and could be extremely useful.

Consider the following meta-search: ALL MOVIE FILES WITH YEAR>2007 AND RATING>X AND GENRE=SILENT AND CAST='BRITNEY SPEARS'
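For illustration, that meta-search could be written as an ordinary SQL query. The sketch below uses Python's built-in sqlite3 module with an invented two-table schema (movie and movie_cast); it is not how Storage or WinFS actually stores anything. Treating RATING as a numeric score and X as a caller-supplied threshold is also an assumption.

```python
import sqlite3

def build_catalog(rows):
    """rows: iterable of (path, year, rating, genre, [actors]). Returns an in-memory DB."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE movie (path TEXT PRIMARY KEY, year INTEGER, rating REAL, genre TEXT)")
    db.execute("CREATE TABLE movie_cast (path TEXT, actor TEXT)")
    for path, year, rating, genre, cast in rows:
        db.execute("INSERT INTO movie VALUES (?, ?, ?, ?)", (path, year, rating, genre))
        db.executemany("INSERT INTO movie_cast VALUES (?, ?)", [(path, actor) for actor in cast])
    return db

def meta_search(db, min_year, min_rating, genre, actor):
    """ALL MOVIE FILES WITH YEAR>min_year AND RATING>min_rating AND GENRE=genre AND CAST=actor."""
    sql = """
        SELECT m.path
        FROM movie AS m
        JOIN movie_cast AS c ON c.path = m.path
        WHERE m.year > ? AND m.rating > ? AND m.genre = ? AND c.actor = ?
        ORDER BY m.path
    """
    # Values are bound as parameters, never pasted into the SQL string.
    return [row[0] for row in db.execute(sql, (min_year, min_rating, genre, actor))]
```

The query is the easy half. It is only as good as the rows someone (ideally the publisher) filled in.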
Re:How does the metadata get into the database? by farnz · 2003-09-05 02:58 · Score: 1

Except that most people don't type in enough information for the computer to uniquely identify a given piece of information. Case in point: what's the appropriate metadata for "Harry Potter.txt"? Is it a review of a movie, the text of one of the books, some fanfic, a letter to a friend? Or to take your movie example ("just look it up at imdb.org"), what's "Harry Potter.avi"?
Yes, you don't have to fill in all the information, but unless you bother filling it in, the utility of a rich metadata filesystem is greatly reduced, since a lot of the fancy searches return incomplete result sets.
And I doubt they'll be less work than our current filesystems; someone has to get the metadata in there, and none of the projects I've seen address this issue. They're all about making use of the metadata that's been put in there. Entering it in the first place is usually a matter of filling in a lot of boxes, which I doubt people will bother with (I certainly wouldn't).

--
I appear to have a blog. Odd.
Re:How does the metadata get into the database? by selderrr · 2003-09-05 03:20 · Score: 1

you're right. For movies, such a FileSystem would be cool. In any case better than what we have now. The same holds for music of course. But that's why i have iTunes.... I can imagine similar software (iFlix ?) for movie archiving exists or will be released in the future...

IMHO, for music & movies (and probably more) you need a special application to view your content. In which case it seems far more logical to incorporate the metadata stuff in that app, rather than in the FS where it would become a redundant weight that will hardly be used.

How many other categories can u come up with ? If I quickly inspect my own homedir, I have a bunch of mp3s (viewed in iTunes) 5 or 6 iMovie made videos and my iPhoto library (which has metadata support that I can't force myself to use simply because I don't want to spend the time) and my mail. All these are categorized by specific apps
All the rest are mainly .doc, .xls, .cpp files that don't need categorisation.

Honestly : I see little advantages in a specialised storage system for those 50-100 files that I won't categorize anyway, since I can find them visually in my 3-level folder hierarchy.

--
When will I end this grieving ? When will my future begin ?
Re:How does the metadata get into the database? by sporktoast · 2003-09-05 03:22 · Score: 1

you mean you don't order the books by the color of their binding? How unstylish of you.
Organizing by color works better with CDs than books. Their uniform size makes for a crisp aesthetic, as well as a more efficient use of space. It was very striking and never failed to impress friends.
With books, the wide variations in sizes contributed to a cluttered look that almost completely countered the chromatic harmony. It also wastes a lot of space in accommodating the few large books on each shelf.
We kept our CDs that way for almost 4 years, but my wife eventually nixed it for being too difficult.
-Sporktoast

--
In a related story, the IRS has recently ruled that the cost of Windows upgrades can NOT be deducted as a gambling loss.
Re:How does the metadata get into the database? by Anonymous Coward · 2003-09-05 03:26 · Score: 0

Yes, but you're completely missing the point. That metadata had to come from somewhere. There's no metadata fairy. Somebody had to type it in at some point. And in most cases, it's not worth the effort of typing in all the metadata. CDDB is the exception that proves the rule. Nobody wanted to type in all their own metadata, so they distributed the job among untold millions. So now pretty much every CD is "CDDB'd." If you'd had to type in all the metadata for your own CD collection, you never would have bothered. (Unless you're obsessive or something; most people aren't.)
Re:How does the metadata get into the database? by Nodatadj · 2003-09-05 03:32 · Score: 1

I always get the feeling that no-one uses metadata systems such as the iTunes star rating because really, there's not much advantage to spending the time to do it.

But look at the assignment of Track/Album titles, most people that I know who use iTunes use it constantly, even filling in the details when they're not online or if the CD they've just ripped is unknown.
Re:How does the metadata get into the database? by Anonymous Coward · 2003-09-05 03:35 · Score: 0

Windows Media Player rates songs automatically and builds playlists based on your listening patterns. Of course you will ignore this useful feature because Microsoft invented it and when someone adds this feature to a Linux app it will get headlines on slashdot as an innovative new feature.
Re:How does the metadata get into the database? by ReallyBigNumber · 2003-09-05 03:41 · Score: 1

Call yourself a geek? You misspelled "anal".
Re:How does the metadata get into the database? by alext · 2003-09-05 03:43 · Score: 1

this is not information that the computer can necessarily gather for itself

Not necessarily, but systems like Autonomy's automatic categorization search can do a pretty good job.

I see several posts here point out the difficulty of maintaining category information (erroneously referred to as metadata) manually so it seems clear that progress in automatic classification is needed to complement more sophisticated storage structures.
Re:How does the metadata get into the database? by Christopher+B.+Brown · 2003-09-05 03:45 · Score: 1

It means that you have to have some (likely heuristic) process that rummages around for metadata.
For instance, those MP3 files are quite likely to have ID3 tags assigned to them, indicating the artist, album, track number, and possibly even the name of the song.
It is easy enough to look at a file using /usr/bin/file and say: That looks like an MP3 file! Why don't I extract ID3 tags by running id3 -l on it?
There is quite a lot of information that can be gotten by running stat and file, and if you have further actions to take for some set of document formats (MP3, OggVorbis, PNG, GIF, JPEG, MS Office formats, OpenOffice.org formats, HTML, and some others), you can probably get a significant amount of useful metadata in a totally automated manner.
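A small Python sketch of that kind of rummaging, using only the standard library. Note one difference from /usr/bin/file: mimetypes.guess_type guesses from the file extension, not from the file's contents. The EXTRACTORS table and the harvest function are hypothetical names; a real setup would register an ID3 reader, an image header reader and so on.

```python
import mimetypes
import os
from datetime import datetime, timezone

def basic_metadata(path):
    """Collect metadata that needs no human input: size, mtime, guessed type."""
    st = os.stat(path)
    mime, _encoding = mimetypes.guess_type(path)
    return {
        "name": os.path.basename(path),
        "size": st.st_size,
        "modified": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc).isoformat(),
        "type": mime or "unknown",
    }

# Hypothetical per-type extractors: MIME type -> function(path) returning extra fields.
EXTRACTORS = {}

def harvest(path):
    """Basic stat-style metadata, plus whatever a registered extractor can add."""
    meta = basic_metadata(path)
    extractor = EXTRACTORS.get(meta["type"])
    if extractor is not None:
        meta.update(extractor(path))
    return meta
```

Formats with no registered extractor still get the basic fields, so nothing is lost by supporting only a handful of types at first.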

--
If you're not part of the solution, you're part of the precipitate.
Re:How does the metadata get into the database? by hoggoth · 2003-09-05 03:48 · Score: 2, Funny

> Organizing by color works better with CDs than books

Geez, do you have to count how many times you wash your hands? Do you make sure your right turns always equal your left turns?

Try this:
Leave a big cardboard box on the floor. Dump all your CDs into it. When you want to listen to music, fish around and enjoy the surprise at what you find.
Enjoy the extra free time you have now that you are not obsessively organizing your collection.

--
- For the complete works of Shakespeare: cat /dev/random (may take some time)
Re:How does the metadata get into the database? by evilviper · 2003-09-05 03:57 · Score: 1

And I doubt they'll be less work than our current filesystems

See my other post on how much hassle current filesystems are: http://developers.slashdot.org/comments.pl?sid=77333&cid=6878856

Frankly, I don't think it would be possible for anything to be MORE work than current filesystems.

--
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
Re:How does the metadata get into the database? by Anonymous Coward · 2003-09-05 04:10 · Score: 0

I've got to try that. I always look for them by color, so why not sort them that way.
Re:How does the metadata get into the database? by chigaze · 2003-09-05 04:43 · Score: 1

Just to be the exception, I use the iTunes rating system extensively. When I have time I rate the songs as they are ripped; otherwise I have a smart playlist that returns only unrated songs. I also have playlists that return only songs rated 5, only songs rated over 4, only Blues songs over 4, etc. When I'm listening to a particular mix, if I feel the rating of a song needs to be adjusted I adjust it on the spot.

The key to all this is an app called Synergy that allows me to map hotkeys for rating the currently playing songs, among other things. The ratings are applied as I listen and require no extra time on my part to apply.

I think this would be the key to any metadata system: it has to be fairly transparent and has to be useful. I like listening to mixes and I've utilized the metadata in iTunes to facilitate that.
Re:How does the metadata get into the database? by gr8_phk · 2003-09-05 05:03 · Score: 2, Interesting

"I've rarely encounter anyone who keeps consistent, useful filenames, let alone metadata indexes".
If it allowed natural language interaction with the machine, people might just provide more information. Since it begs for a voice interface, why not have the machine ask a few questions about a document while you're editing/viewing it? When a new file comes in via email with no metadata, the machine says "what's this all about?". You'll naturally describe it using words similar to those you'll use to retrieve it later. Sounds fantastic if all this can really be made to work.
Re:How does the metadata get into the database? by vonFinkelstien · 2003-09-05 05:10 · Score: 1

The song rating is only one of the many meta-data tags iTunes uses. Genre is much more useful than little stars for creating smart playlists.
For example, I have a play list that looks for folk songs that aren't Swedish, Czech, Hungarian (i.e., American roots music for my library).
I have another that has all Jazz under a certain length (to exclude whole albums from my pirating days), excluding vocals and big band.
I spent a week entering meta-data into iTunes, because I find it very useful. I also make sure that CDDB has good entries for my CDs. If they don't I spend the time to enter all the data and submit it to CDDB.
Re:How does the metadata get into the database? by DukeyToo · 2003-09-05 05:16 · Score: 1

Metadata is all related, and that is where the potential is. If your PC knew:
1) That you travelled to Fiji last week
2) You uploaded some pictures taken last week

...then it is reasonable to assume that you took the pictures in Fiji. It is not a certainty; perhaps you left your camera with your brother to take pictures of his cat.

So to be of optimal use, there should be some AI at work to derive the obvious (although sometimes incorrect), thus making the best use of what metadata there is.
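It does not take much AI to get the Fiji case. A minimal Python sketch, assuming the machine already has a list of trips (say, from a calendar) as (start, end, place) tuples; the function name and the "inferred"/"unknown" labels are invented for the example.

```python
from datetime import date

def guess_location(taken_on, trips):
    """Guess where a photo was taken from a list of (start, end, place) trips.

    Returns (place, "inferred") when exactly one trip covers the date,
    otherwise (None, "unknown"). The result is a guess, never a certainty.
    """
    candidates = [place for start, end, place in trips if start <= taken_on <= end]
    if len(candidates) == 1:
        return candidates[0], "inferred"
    return None, "unknown"
```

Keeping the "inferred" marker next to the value matters: the camera may have been with your brother and his cat that week, so a derived value should stay distinguishable from one you typed in.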

I do not know that a db filesystem really helps in this regard, except that perhaps it will allow metadata and data to be better structured, which is a step in the right direction.

--
Most writers regard truth as their most valuable possession, and therefore are most economical in its use - Mark Twain
Re:How does the metadata get into the database? by Anonymous Coward · 2003-09-05 05:31 · Score: 0

Indeed. The factual metadata on the file (sent by selderrr, sent by selderrr's father) must be related to your own factual metadata (I am selderrr, this is my father). Then it can be interpreted and displayed by the software as appropriate for the individual or group looking at it.

Ideally, the software itself would deduce that you are related from the fact that your emails contain "dad" and "son", talk about family, and you both have the same last name, but that is a little ways off.

As for getting the factual metadata in the first place, isn't that what the internet is best at? Why not write a Google interface for your filing system? It could go out and get the information to fill in the factual metadata by using the vast cache of knowledge.

You know, that's a good idea.

1. Use Google
2. ???
3. Profit!

Anonymous Cowards of the world rejoice! We will be rich!
Re:How does the metadata get into the database? by Anonymous Coward · 2003-09-05 05:35 · Score: 0

there should be some AI at work

You know what I think? I think there should be a minimum IQ requirement to post on Slashdot.

I don't give a flying fuck about artificial intelligence. I'm looking for some real intelligence.
Re:How does the metadata get into the database? by pmz · 2003-09-05 06:14 · Score: 1

I suspect the same will happen with these metadata systems; people won't do the work needed to make them truly useful.

Neither will small businesses, and big businesses will go overboard, where employees spend more time managing metadata than real data.

For example, who uses Word file templates whose "author" is some middle manager who quit five years ago? Metadata often becomes stale almost immediately, unfortunately.

Perhaps if we had a single global database of every man, woman, and child with their life history encoded into a digital format, we could link into a single primary key that will never go stale. Oh, the convenience is very much worth the Orwellian information dictatorship that ensues, stripping every man, woman, and child of their humanity. Yippee!

--
Healthcare article at Kuro5hin
Re:How does the metadata get into the database? by pmz · 2003-09-05 06:19 · Score: 1

...it has to be useful for me, not - like the library - for thousands of people with very different interests and approaches.

Based on the recent articles about how MS Word metadata and undo data lives on in future revisions of the file, your metadata might just be very valuable (and revealing) to future employers, salespeople, etc.

This is why I always give Windows and Word a false identity for their metadata fields.

--
Healthcare article at Kuro5hin
Re:How does the metadata get into the database? by selderrr · 2003-09-05 06:36 · Score: 1

umpf... okay, I'll bite

Windows Media Player rates songs automatically and builds playlists based on your listening patterns
dude, have u used it ????????
I tried it several times, and god-knows-why, but the ratings WMP was assigning were total bollocks. One of the first indications the system is flawed is that it had ratings for songs before I had played them (don't ask me how or why. I never figured it out either). Second, it seems to fuck up regularly with multiple versions of the same song (I used to be a Pink Floyd fanatic, and have 14 bootleg versions of 'CwtAE'. WMP seems incapable of distinguishing them). And finally Media Player is a freaking shit app. I can not play a movie while a CD is running. I could live with it if it would simply pause the CD, but no : the CD window closes and the movie starts. After that, it's up to me to remember what song the CD was at. Sorry pal, but a system that can't even do that does not get my confidence in rating song quality.

--
When will I end this grieving ? When will my future begin ?
Re:How does the metadata get into the database? by T-Ranger · 2003-09-05 06:53 · Score: 1

Technical manuals have a long and proud history of being in weird, bright colours.
So in the geek world, ordered by colour isn't necessarily all that bad.
Re:How does the metadata get into the database? by Effexor · 2003-09-05 07:40 · Score: 1

First off, metadata is not useful at all if there is nowhere to store the metadata in the first place.
A large amount of metadata could be assigned by the application. When was it created, what user created it, what application, who sent it to you... and other data would actually be supplied by the users, even ones who might not be very computer savvy. I am willing to bet that you name your files. You type in a subject for emails. Your 9000 mp3's are most likely arranged in some form of directory structure which is you the user classifying your files.
User metadata could be as simple as typing in a subject field for a spreadsheet, or dragging an unclassified text document into the 'Really Important Files For Work' virtual folder.
In fact with virtual folders representing stored queries, it would not be all that different as far as most users are concerned, except that with more of the logic being done for them, they might be surprised to find that their files really are where they look for them.
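To show what "virtual folders representing stored queries" could look like, here is a minimal Python sketch. A folder is just a name plus a saved predicate, and dragging a file into it only adds a label to the file. All names here (make_folder, folder_contents, drop_into, the "labels" field) are hypothetical and not taken from Storage or WinFS.

```python
def make_folder(name, predicate):
    """A virtual folder is a stored query: a name plus a test applied to each file."""
    return {"name": name, "predicate": predicate}

def folder_contents(folder, files):
    """Evaluate the stored query against the current set of file records."""
    return [f["name"] for f in files if folder["predicate"](f)]

def drop_into(file_record, label):
    """'Dragging' a file into a label-based folder just tags the file."""
    file_record["labels"].add(label)
```

A file can show up in several virtual folders at once, and it moves in or out as its metadata changes without anything being copied.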
The main reason it isn't on your desktop is not because it wouldn't have advantages, or could not be implemented with lazy or stupid users. It is that it would require a lot of changes from the ground up.
And why do you want your brother's mail to you to say it's from your uncle?

--
As the air to a bird or the sea to a fish, so is contempt to the contemptible -W.B.
Re:How does the metadata get into the database? by Anonymous Coward · 2003-09-05 07:45 · Score: 0

> Outside of a work environment, I've rarely encounter anyone who keeps consistent, useful filenames, let alone metadata indexes

This is why you don't build them if you don't have to. When was the last time you had to enter the track information on a CD you popped into your drive? Usually it's in cddb. Nobody wants to give convenience to the RIAA's legal arm, so it's rarely done for mp3's. Amazon web services for your books, etc etc...
Re:How does the metadata get into the database? by rgmoore · 2003-09-05 08:02 · Score: 1

At the very least, you would only have to give it the filename, and it could find the details via imdb.org.

And, in fact, the system is capable of doing exactly this, which is mentioned in this article about the system:
For example, importing a movie into the information store breaks the movie's internal metadata apart to determine the date, author, and title, as well as the type, length, width, height, etc. This information is then used to leverage additional information from the Internet Movie Database (imdb.org) about the director, actors, etc.

It seems pretty obvious that you could do the same general kind of thing for any type of file for which there's likely to be a searchable record on the internet. That actually covers a lot of possibilities; you just need to know who's keeping the information and how to get at it.

--
There's no point in questioning authority if you aren't going to listen to the answers.
Re:How does the metadata get into the database? by leandrod · 2003-09-05 08:03 · Score: 1

> A SQL based thing, OTOH, with an attempt to make every query 100% accurate, seems doomed to me.

How can a query be useful if it is not accurate?
How is SQL worse than anything else, save a relational system?

--
Leandro Guimarães Faria Corcete DUTRA
DA, DBA, SysAdmin, Data Modeller
GNU Project, Debian GNU/Lin
Re:How does the metadata get into the database? by alext · 2003-09-05 08:13 · Score: 1

I'll pass quickly over your dubious terminology and get straight to repudiating your conclusion:

There are a number of successful products that automatically infer ("extract") categorization information ("meta-data") from unstructured data and are certainly more than "marginally useful".

A structured information storage system for Linux is something to be welcomed rather than sneered at. It will immediately improve the accessibility and coherence of information now haphazardly stored in dozens of different semi-structured forms, and, later, in conjunction with the kind of automated tools referred to above, has the prospect of growing into a valuable information processing system.
Re:How does the metadata get into the database? by squiggleslash · 2003-09-05 08:17 · Score: 1

Is Google accurate?
The problem with focussing on an accurate database is that it takes huge amounts of maintenance, and cannot be done automatically. A system that just has to be "good enough" and is entirely automatic has clear advantages over a system that is perfect but ultimately discourages use.

--
You are not alone. This is not normal. None of this is normal.
Re:How does the metadata get into the database? by Alan · 2003-09-05 08:21 · Score: 1

Good ideas, but in these days of spammers googlebombing and whatnot I'd rather not have the directors or titles of my movies be "wild and crazy romps with horny bitches at www.foobarjizz.com" :)
Re:How does the metadata get into the database? by leandrod · 2003-09-05 08:31 · Score: 1

> This functionality already exists in Windows 2000 and XP. Just turn on the Indexing Service.

Indexing is already a standard feature. POSIX systems have locate, Gnome has Medusa, the Mac OS indexing is ancestral...
Using a quasi-relational system is about enabling more powerful interaction with the data, not only indexes.

--
Leandro Guimarães Faria Corcete DUTRA
DA, DBA, SysAdmin, Data Modeller
GNU Project, Debian GNU/Lin
Re:How does the metadata get into the database? by Anonymous Coward · 2003-09-05 09:32 · Score: 0

As an extra complication : consider that my metadata might not match someone elses. For instance if I were to label a mail "message from my brother", the same content would be "message from my son" for my father !

Yes, but it's still the same person. The metadata stored would be "from: Bob" (lets say your brother is Bob). Elsewhere (in an address book database, presumably) you specify that Bob is your brother. If you search for "message from my brother", it resolves to "message from Bob". Likewise, your father's query for "message from my son" resolves into "message from Bob OR selderrr".
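That resolution step is easy to sketch in Python. The address book contents below are invented to match the example in this thread (Bob as the brother, "father" standing in for the father's account), and the function names are hypothetical.

```python
# Per-user address books: relation -> set of names. Example data only.
ADDRESS_BOOKS = {
    "selderrr": {"brother": {"Bob"}},
    "father": {"son": {"Bob", "selderrr"}},
}

def resolve(owner, relation, books=ADDRESS_BOOKS):
    """Turn 'my brother' into the set of concrete sender names for this owner."""
    return books.get(owner, {}).get(relation, set())

def messages_from(owner, relation, messages, books=ADDRESS_BOOKS):
    """Messages store the plain fact (from: Bob); the relation is resolved at query time."""
    senders = resolve(owner, relation, books)
    return [m for m in messages if m["from"] in senders]
```

The stored metadata stays the same for everyone. Only the lookup differs per user.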
Re:How does the metadata get into the database? by dr.badass · 2003-09-05 09:43 · Score: 1

Not to poop on Synergy or anything, but it's really not necessary. You can click and hold on iTunes' icon to get its dock menu, which allows you to change the rating. It takes about a second and doesn't take you out of the current app. Unless you absolutely can't take your hands off the keyboard, it's just as unobtrusive, and you save yourself $5.

--
Don't become a regular here -- you will become retarded.
Re:How does the metadata get into the database? by leandrod · 2003-09-05 10:09 · Score: 1

> Is Google accurate?

Yes, within its constraints.
In databases we have the Closed World Assumption. The database is a logical system full of predicates; it thus represents a set of assertions. Whatever is there is assumed to be true; whatever isn't is assumed to be false.
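A toy Python illustration of the Closed World Assumption, with invented facts. The only point is that a missing assertion answers False rather than "unknown".

```python
# Each stored tuple is an assertion: (predicate, subject, value). Example data only.
FACTS = {
    ("author", "report.doc", "alice"),
    ("type", "song.mp3", "audio"),
}

def holds(predicate, subject, value, facts=FACTS):
    """Closed World Assumption: a fact that is not stored is treated as false."""
    return (predicate, subject, value) in facts
```

So a query for files authored by bob returns nothing for report.doc even if bob really co-wrote it: the answer is accurate with respect to what the database asserts, which is the constraint meant above.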
> The problem with focussing on an accurate database is that it takes huge amounts of maintenance, and cannot be done automatically.

Yet metadata culled from automatic processes -- naming of Office documents, tags in audio, text and video files, keywords from text files -- is usually good enough. What SQL is bringing is a little bit more structure, and thus power, depending on the data model implemented.
So what's the problem, exactly? How would SQL, or Storage, "discourage use"? I'd much rather give five or so keywords to a new document than find its place in the hierarchy.

--
Leandro Guimarães Faria Corcete DUTRA
DA, DBA, SysAdmin, Data Modeller
GNU Project, Debian GNU/Lin
Re:How does the metadata get into the database? by Simon · 2003-09-05 10:10 · Score: 1

...and most so. There are truckloads of useful metadata out there that the computer could capture. For example:
* I get an email with a file from joe. I save the file to disk. My email program could associate the saved file with the email message and thread, so that later I could ask the computer for 'that file from joe', or 'that file from joe@citizen.org'. Keywords could also be taken from the email and associated with the file.
* If I'm working on two files (say spreadsheets, for example) and I cut and paste bits of info between them, the computer could easily infer some kind of relation or loose association between the two files.
* I grab a file from the web and save it. The computer could easily associate the file with where it came from and the page it is from. Hell, the computer could even check back at the web site and tell me if the original has been updated.
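
To make the first and third bullets concrete, here is a minimal sketch in Python with sqlite3. The `record_save` hook, the `provenance` table and the sample paths are assumptions made up for this example; a real mail client or browser would have to call something like it on every save.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# One row per saved file: where it lives, how it arrived, and from whom/where.
db.execute("CREATE TABLE provenance (path TEXT, kind TEXT, source TEXT)")

def record_save(path, kind, source):
    """Hypothetical hook a mail client or browser would call on save."""
    db.execute("INSERT INTO provenance VALUES (?, ?, ?)", (path, kind, source))

record_save("/home/me/budget.xls", "email", "joe@citizen.org")
record_save("/home/me/paper.pdf", "web", "http://example.org/papers/")
record_save("/home/me/notes.txt", "email", "ann@example.org")

def files_from(source_fragment):
    """Answer 'that file from joe' by substring match on the recorded source."""
    rows = db.execute(
        "SELECT path FROM provenance WHERE source LIKE ? ORDER BY path",
        ("%" + source_fragment + "%",),
    )
    return [path for (path,) in rows]

from_joe = files_from("joe")
```

Nothing here is typed in by the user; the metadata is a side effect of saving the file.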
--
Simon
Re:How does the metadata get into the database? by alext · 2003-09-05 10:14 · Score: 1

Yep, this is what the Autonomy stuff does - looks over your shoulder in effect and records the links in what you do including exactly the email scenario you describe.

Actually, the most impressive demo I saw was of a CNN broadcast being fed in via a voice recognition system, and the displayed TV picture being annotated live with links to relevant sources of info.

I'm sure there are competing systems out there but theirs is the only one I know.
Re:How does the metadata get into the database? by squiggleslash · 2003-09-05 10:44 · Score: 1

Google isn't accurate in the kind of terms we're talking about here. I can do a search on a keyword and find websites that have little or nothing to do with that keyword. It's "good enough", not "accurate".
Your last sentence kind of shows you're approaching this from the wrong angle: I've said an inaccurate system is preferable to an accurate but manually maintained one. A hierarchical system is an example of the latter, not the former. Unfortunately, a table-based system is ALSO an example of the latter, not the former.
A system where navigating your files means being presented with a Google-style search box strikes me as something genuinely useful and something with the potential to be an improvement on what we have now. But a system that has a "search box" but requires users to set up keywords for every single object they store just will not cut it. I have difficulty enough categorising files and putting them in nice, neat, little directories, despite having a dozen ways of doing it and despite two decades (1983 Apple Lisa to present) of sustained, focussed, user interface development, and even longer of trial and error. How is a table-based system (SQL is a misnomer, it's merely an interface, it isn't the mechanism) going to help?
My belief is that a relational database, while flexible, would suffer from the same faults as hierarchical systems when managing file systems. Better solutions lie elsewhere.

--
You are not alone. This is not normal. None of this is normal.
Re:How does the metadata get into the database? by leandrod · 2003-09-05 11:33 · Score: 1

> Google isn't accurate in the kind of terms we're talking about here.

Then I fear we are not communicating... what are your terms? For me, Google is a database, and it gives (or should give) accurate answers according to its contents.

> I can do a search on a keyword and find websites that have little or nothing to do with that keyword.

Because that keyword is related to those websites in Google's database...

> It's "good enough", not "accurate".

So what would be accurate? Actually, if what you want is "a perfect answer to my query", that doesn't exist, through no fault of Google or its database, but simply because no such answer exists... probably even your query is only an approximation of what you really need, and there's no guarantee a correct answer exists and is attainable.
You have to define your scope before you can talk meaningfully.

> I've said an inaccurate system is preferable to an accurate but manually maintained one. A hierarchical system is an example of the latter, not the former. Unfortunately, a table based system is ALSO an example of the latter, not the former.

So what's the problem? None I can see... an SQL system can be more automated than a hierarchy, while at the same time enabling richer manual data entry if desired and much richer user interaction.
In fact, while a hierarchy has to be completely manually maintained, with automatic indexes only pointing to specific nodes in it, a database can combine automatic metadata and indexes with manually maintained, but optional, metadata.

> But a system that has a "search box" but requires users set up keywords for every single object they store just will not cut it.

No such thing needed. You already have to place things in a hierarchy and give them names. Now suppose one really wants to keep the user interface for file creation stable. No problem, just present a hierarchy to the user, and allow him to give names as before. Only, capture all this information in a relational (or SQL) database -- the fs interface is maintained for the user, but with the addition of richer querying capabilities. No harm done, much gained.
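
As a sketch of what I mean (Python with sqlite3; the `files` table, the `kind` column and the sample paths are invented for illustration and are not how Storage actually lays things out): the user still saves to a path, the legacy directory listing still works, and a cross-hierarchy query comes for free.

```python
import posixpath
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (directory TEXT, name TEXT, kind TEXT)")

def save(path, kind):
    """Same user interface as before: a path and a name."""
    directory, name = posixpath.split(path)
    db.execute("INSERT INTO files VALUES (?, ?, ?)", (directory, name, kind))

save("/home/me/docs/report.txt", "text")
save("/home/me/docs/budget.xls", "spreadsheet")
save("/home/me/music/song.ogg", "audio")
save("/srv/shared/plan.xls", "spreadsheet")

def listdir(directory):
    """Legacy view: what a directory listing would show."""
    rows = db.execute(
        "SELECT name FROM files WHERE directory = ? ORDER BY name", (directory,)
    )
    return [name for (name,) in rows]

def find_kind(kind):
    """Richer view: every file of one kind, wherever it was saved."""
    rows = db.execute(
        "SELECT directory || '/' || name FROM files WHERE kind = ? ORDER BY 1",
        (kind,),
    )
    return [path for (path,) in rows]

docs_listing = listdir("/home/me/docs")
all_spreadsheets = find_kind("spreadsheet")
```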

> I have difficulty enough categorising files and putting them in nice, neat, little directories, despite having a dozen ways of doing it and despite two decades (1983 Apple Lisa) to present of sustained, focussed, user interface development, and even longer of trial and error. How is a table based system (SQL is a misnomer, it's merely an interface, it isn't the mechanism) going to help?

Simple, by not needing a hierarchy to categorise. For example, I maintain some categories in DMoz, including -- guess it -- Computers/Software/Databases/Relational.
Now in a relational system, I could say simply Relational -- all the rest is implied, and unnecessary. I could still put it in a specific place in the hierarchy, but this would be merely a presentation gimmick, without imposing storage constraints.
Now suppose Relational also has a meaning in psychology, as it probably has. No problem, Informatics and Relational would be specific enough for me, and still less trouble than going over the hierarchy.
In practice, the URL string itself, the title of the referred page, the URL name and the explanation associated with it would all be automatically part of the database, so that a query like Relational and Manifesto would give me even more specific resources, without either requiring or ruling out subdirectory navigation.
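
A small sketch of such a keyword query, in Python with sqlite3. The `keyword` table and the resource names (`page-a` and so on) are placeholders made up for this example, not DMoz data.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE keyword (resource TEXT, word TEXT);
INSERT INTO keyword VALUES
    ('page-a', 'relational'), ('page-a', 'manifesto'), ('page-a', 'informatics'),
    ('page-b', 'relational'), ('page-b', 'psychology'),
    ('page-c', 'relational'), ('page-c', 'informatics');
""")

def having_all(*words):
    """Return resources carrying every one of the given keywords."""
    marks = ",".join("?" * len(words))
    rows = db.execute(
        f"""SELECT resource FROM keyword
            WHERE word IN ({marks})
            GROUP BY resource
            HAVING COUNT(DISTINCT word) = ?
            ORDER BY resource""",
        (*words, len(words)),
    )
    return [resource for (resource,) in rows]

specific = having_all("relational", "manifesto")
informatics_only = having_all("relational", "informatics")
```

No position in a tree is needed to tell the informatics sense from the psychology one; adding a keyword narrows the result.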
You have to understand relations and hierarchies aren't really comparable... relations are richer; they can store hierarchies, and so enable everything hierarchies do, without their ass

--
Leandro Guimarães Faria Corcete DUTRA
DA, DBA, SysAdmin, Data Modeller
GNU Project, Debian GNU/Lin
Re:How does the metadata get into the database? by squiggleslash · 2003-09-05 12:39 · Score: 1

Then I fear we are not communicating... what are your terms? For me, Google is a database, and it gives (or should give) accurate answers according to its contents.
Which would make the word "accuracy" redundant if we were using it in these terms. Every program responds with the answer it has been programmed to give, even programs with bugs in them. Unless you decide that words like "bugs", "mistakes", "faults", etc, are meaningless, you can't also suggest that "accurate" is defined in terms of the logic used to find the answer.
In this case, we're looking at trying to use Google as a method to find web pages. Can you find a web page via Google, even restricting it to the web pages Google is aware of? Answer: Usually.
That's what makes Google useful. Would you suggest for one moment though that the results it gives are accurate within the context of what users are searching for? Of course not: Most searches will produce hundreds of irrelevant hits. Very often, the web page you're looking for will not be hit, because you've had to use English - where a variety of words and phrases are used to describe the same things - to get your answer.
That said, we're going down a somewhat unnecessary path. My point was that a system like Google, despite being inaccurate, is more useful than a table/SQL based system. Now, let's look at the terminology you're using:

So what's the problem? None I can see... an SQL system can be more automated than a hierarchy, while at the same time enabling richer manual data entry if desired and much richer user interaction.
No, it can't. Not if we're talking in meaningful terms. I mean, sure, one can create a system that automatically looks for words and indexes them, that happens to use SQL, but it's a stretch to say that that's a feature of the mechanism. You might just as well say that a BSD DBM system can replace a hierarchy.
Which it can. Because there's nothing stopping anyone writing some autoindexer that inserts words in nice indexed DBM files. But that's not what springs to mind when someone talks about a DBM based file system. And likewise, a Google search isn't what people are referring to when they're talking about a SQL based file system.
There are several philosophies here. There's the ingrained system, which is based on hierarchies. There's the structured (keyword/value at a minimum, with more complex structures possible) system, which is based on relational database tables, and there's the unstructured system, which is based, for want of a better word, on Google. How any of these three are stored is irrelevant. The relational database tables that make up a structured system could be plain ASCII files. They could be DBM files. They could be in PostgreSQL, MySQL, or even Oracle. They could be in Excel spreadsheets. The format isn't the issue. Likewise your file system hierarchy could also be in some "Excel spreadsheet", sitting as inode 0 on your root. And don't get me started on the Google system.
What "Storage" appears to be talking about, and what form it could only be meaningful in, is the structured form. I believe that the structured form is doomed. I believe it's doomed because it requires that file objects be categorised, and be categorised largely manually. A computer simply isn't going to know what .MOV files contain films by Spielberg.
We can bypass this a little by trying to make use of shared meta data sources where possible, but so far the best example of such a thing, the CDDB, isn't exactly an advert for cooperative data sharing. (I rip CDs all the time, and find I have to clean up about 70-80% of the entries - artists put in composer slots and vice versa, song titles in artists slots, CD labels in the song names.)
We can also try to bypass the user aspect and just come up with certain useful flags, but there's a limit to what a computer can do, and inaccuracies, again, are inevitable.
That's wha

--
You are not alone. This is not normal. None of this is normal.
Re:How does the metadata get into the database? by Shadowlore · 2003-09-05 19:28 · Score: 1

"It's like CD collections, or books; libraries have nicely catalogued and ordered collections. Private individuals don't; they have roughly ordered collections on the shelf, and don't bother keeping them in any better order"

Call yourself a geek? How can you possibly put something on a shelf without first checking to see that it's in a proper place observing the subtle cross reference system that backs up the obvious system. Man, I hate it when people move my stuff.

Actually, this points out the problem with a hierarchical FS. I have songs that go under multiple categories, just as I do for CDs and DVDs.

For example, I can not right now have a "Comedy" section, an "Action" section, and then a "Bill Engvall" or "Arnold Schwarzenegger" section and have them all consistent. Where does "True Lies" go? Well, it goes in two categories. But I'm not about to buy multiple copies just for organization. It works for books too.

Sure, you can try symlinks, but I say this as someone who has gone down that route for my songs (Sort by Artist, Genre? What about cross-genre artists or multi-artist songs?): it gets *real* cumbersome and you wind up writing a set of scripts or programs to try to organize the whole mess.

Filesystems are the same. "Documents", eh? Well I've got documents that fall under multiple categories, such as Politics, Tech, Programming, Household, Corvette Stuff, etc. Sorting by folders leads to the same problem.

If done well, this type of system can be very useful for storing *data*. I can then do things like "Views" in SQL-Speak to have a "table" that shows all "Corvette" related documents, regardless of physical location on my drives, or "Political Essays" that may fall under various topics.
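
Something like this, as a rough sketch in Python with sqlite3. The `document` and `topic` tables and the file paths are my own invention for the example; I'm not claiming this is Storage's schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE document (path TEXT PRIMARY KEY);
CREATE TABLE topic (path TEXT, name TEXT);
INSERT INTO document VALUES
    ('/disk1/essays/tax.txt'),
    ('/disk2/car/brakes.txt'),
    ('/disk1/misc/corvette-politics.txt');
INSERT INTO topic VALUES
    ('/disk1/essays/tax.txt', 'Politics'),
    ('/disk2/car/brakes.txt', 'Corvette'),
    ('/disk1/misc/corvette-politics.txt', 'Corvette'),
    ('/disk1/misc/corvette-politics.txt', 'Politics');

-- A view behaves like a table but is recomputed from the tags each time.
CREATE VIEW corvette_docs AS
    SELECT d.path FROM document AS d
    JOIN topic AS t ON t.path = d.path
    WHERE t.name = 'Corvette';
""")

corvette = [p for (p,) in db.execute("SELECT path FROM corvette_docs ORDER BY path")]
```

One file sits on one drive in one place, yet shows up in both the "Corvette" and the "Politics" view, so no symlinks are needed.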

That said, I wind up changing my book/cd/dvd collection organizing frequently since I add new items and the whole scheme changes frequently.

Best of luck to this project, I'll be watching it!

--
My Suburban burns less gasoline than your Prius.
Re:How does the metadata get into the database? by leandrod · 2003-09-06 01:38 · Score: 1

> a system like Google, despite being inaccurate, is more useful than a table/SQL based system.

Now, Google *is* a database! What is the difference?! It may not use tables and SQL, but that's pretty much immaterial... SQL has its limitations, but something like Google can and should be implemented over a relational system.

> one can create a system that automatically looks for words and indexes them, that happens to use SQL, but it's a stretch to say that that's a feature of the mechanism.

But I never said that. What I am trying to say is that while SQL has its limitations, a relational system can be used to index content in a much more powerful and flexible way than simply indexes-over-hierarchy we have today, while preserving the legacy interfaces. This power is a relational feature.

> You might just as well say that a BSD DBM system can replace a hierarchy.

> Which it can. Because there's nothing stopping anyone writing some autoindexer that inserts words in nice indexed DBM files. But that's not what springs to mind when someone talks about a DBM based file system.

You are mixing up the physical and logical levels. Relational systems, and SQL to a measure, present a logical interface. The back-end storage can be DBM, a graph, ISAM files, even SQL itself; but the real juice is the logical interface. It is on the logical interface that human interfaces are built; in this case, Storage can present a Gnome Google-like search interface to a SQL database, in parallel with a legacy filesystem interface, at the same time as it can allow richer queries.

> a Google search isn't what people are referring to when they're talking about a SQL based file system.

A Google-like search isn't excluded by SQL; why should it be?

> There's the structured (keyword/value at a minimum, with more complex structures possible) system, which is based on relational database tables, and there's the unstructured system, which is based, for want of a better word, on Google.

You are definitely mixing levels. To repeat, Google is but an interface. It does have structured data, as in file types, domains, language etc. associated with each page. There is no reason it can't be stored either relationally or in SQL, preferably relationally.

> The relational database tables that make up a structured system could be plain ASCII files. They could be DBM files. They could be in PostgreSQL, MySQL, or even Oracle. They could be in Excel spreadsheets.

SQL is not relational. Tables are not relations. MySQL is not SQL, even Oracle is not proper SQL.

> I believe it's doomed because it requires that file objects be categorised, and be categorised largely manually. A computer simply isn't going to know what .MOV files contain films by Spielberg.

This is stupid. Today a user has to save a file with a name in a place in a hierarchy. It is even easier for a system to suggest the input of a few fields according to the type of the file... not to mention several file types carry their own info.

> Yes, on a technical level, I *can*, despite your comments, find ways to cleanly hierarchically categorise my files. The fact that I don't isn't because it's a hierarchy, it's because it's a huge amount of effort even given the best tools to do the job.

Tools have nothing to do with it; the problem is that different users, and even the same user at different times and in different moods, want to put the same info at different nodes, or at several at the same time. It is simply a pain.

--
Leandro Guimarães Faria Corcete DUTRA
DA, DBA, SysAdmin, Data Modeller
GNU Project, Debian GNU/Lin
Re:How does the metadata get into the database? by jbolden · 2003-09-06 03:53 · Score: 1

GNOME, KDE, and Linux groups have been resisting with a passion.

Could you give some more information (like links) where I can read these debates?
Re:How does the metadata get into the database? by Haeleth · 2003-09-06 04:30 · Score: 1

I'm not really sure I want all my incoming emails to be categorised under "shut the fsck up, you stupid machine"...
Re:How does the metadata get into the database? by chigaze · 2003-09-06 11:28 · Score: 1

I know about applying changes through the dock and I find I prefer Synergy. I do use it for more than just ratings, Synergy has hotkeys for almost every aspect of iTunes. I can adjust volume, skip tracks, popup track information, etc. in a fraction of the time it would take to go to the dock or bring up iTunes manually, and the action doesn't really interrupt my workflow.

Re:Ahead of the game. by Anonymous Coward · 2003-09-05 01:03 · Score: 0

Well, and BeFS DID it ... seven years ago ?

The Trend by TwistedSquare · 2003-09-05 01:03 · Score: 1

This certainly seems to be the trend in filesystems these days, this must be at least the third slashdot posting about a database filing system I've seen recently. Does anyone have any information on how reliable they are (which I'd imagine would be the major concern about such file systems)? I'm guessing they will not replace ext3 etc., merely be used where applicable.

ext3 + sql by Dreadlord · 2003-09-05 01:04 · Score: 2, Interesting

I don't know how a database system can improve a file system's performance, especially with the unnecessary overhead that comes with it. The ext3 file system in its current state is doing quite well, and updatedb/locate works fine for me.
What would really interest me is something like updatedb/locate but with SQL syntax support; that could be awesome.
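
A toy version of that idea, sketched in Python with sqlite3. This is not how updatedb works internally; `build_index` and the `files` table are hypothetical names, and the demo runs against a throwaway temp directory rather than the real disk.

```python
import os
import sqlite3
import tempfile

def build_index(root):
    """Walk a directory tree once (the 'updatedb' step) and store it in SQL."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE files (path TEXT, name TEXT, size INTEGER)")
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            db.execute(
                "INSERT INTO files VALUES (?, ?, ?)",
                (full, name, os.path.getsize(full)),
            )
    return db

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "music"))
    samples = [
        ("notes.txt", b"hi"),
        (os.path.join("music", "a.ogg"), b"x" * 10),
        (os.path.join("music", "b.mp3"), b"y" * 20),
    ]
    for rel, data in samples:
        with open(os.path.join(root, rel), "wb") as handle:
            handle.write(data)

    index = build_index(root)
    # The 'locate' step, but with SQL: audio files larger than 5 bytes.
    big_audio = [
        name
        for (name,) in index.execute(
            "SELECT name FROM files "
            "WHERE (name LIKE '%.ogg' OR name LIKE '%.mp3') AND size > 5 "
            "ORDER BY name"
        )
    ]
```

The filesystem underneath stays plain ext3 or whatever; only the index is queried with SQL.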

--
The IT section color scheme sucks.

Re:ext3 + sql by rtaylor · 2003-09-05 01:45 · Score: 4, Informative

It won't improve performance if you know exactly what you are looking for. The goal is to improve performance when you only have a vague idea of what you want.

This isn't a place to store config files or cronned shell scripts which have definitive locations and content.

This is a replacement for that 5TB corporate filestore with a 50 directory hierarchy that nobody can figure out, and a content based find takes days to complete.

--
Rod Taylor
Re:ext3 + sql by Anonymous Coward · 2003-09-05 02:11 · Score: 0

It won't improve performance if you know exactly what you are looking for. The goal is to improve performance when you only have a vague idea of what you want.

Nice. Something like this?
~$ I wanna hear an mp3 or ogg with some metal rock. Mike Patton's stuff or something like it, but noisier than Tomahawk. open 01_the_dillinger_escape_plan-when_good_dogs_do_bad.mp3 in xmms? (yes/no)
Re:ext3 + sql by stephenbooth · 2003-09-05 02:47 · Score: 2, Insightful

It also solves this problem: if I have a letter to Smith and Sons (Builders) about a bridge construction project, do I store it under letters/Smith&Sons/Bridge or Smith&Sons/letters/bridge or bridge/Smith&Sons/letters or projects/builders/bridge/smith&sons/letters or what? With a database-based file system you store it once and tag it as a letter, to Smith and Sons (Builders), about the bridge construction project (plus any other tags you would like to apply), and then when you want to find it again you can go through whichever access path you like. You can also find all letters, all documents relating to bridge construction, all documents relating to Smith&Sons, etc.

Last time I posted about something like this a couple of people posted responses stipulating an 'ideal' directory path. To preclude this I must point out that whatever method of classification makes most sense to you and the tasks you do won't make sense to someone else and the tasks they do. For example, to a project manager it makes sense to classify documents by the project they are part of, then probably by document type, as they are only interested in the projects they manage and don't want to have to navigate through multiple branches of a directory tree to locate all of the files for their projects (if they are working on the bridge construction project they want all the documents to be under a directory called 'bridge_construction'). For a financial director it makes sense to classify the documents by type, as they are only interested in financial documents and don't want to have to navigate through multiple branches of a directory tree to find all of the financial documents (if they are looking for all invoices from Smith and Sons they want to find them all under invoices/smith&sons). A database-based storage system allows everyone to view the data in the way that makes most sense to them and the way that they work.
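
To illustrate, here is a minimal sketch in Python with sqlite3. The `doc_attr` table, the attribute names and the document names are all made up for the example; the point is only that each document is stored once and each person gets the tree they want.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE doc_attr (doc TEXT, attr TEXT, value TEXT);
INSERT INTO doc_attr VALUES
    ('letter-017', 'type', 'letters'),
    ('letter-017', 'party', 'Smith&Sons'),
    ('letter-017', 'project', 'bridge'),
    ('invoice-203', 'type', 'invoices'),
    ('invoice-203', 'party', 'Smith&Sons'),
    ('invoice-203', 'project', 'bridge');
""")

def virtual_paths(order):
    """Build one browsable path per document from its tags, in `order`."""
    docs = {}
    for doc, attr, value in db.execute("SELECT doc, attr, value FROM doc_attr"):
        docs.setdefault(doc, {})[attr] = value
    return sorted(
        "/".join(attrs[key] for key in order) + "/" + doc
        for doc, attrs in docs.items()
    )

# Project manager: by project, then document type.
manager_view = virtual_paths(["project", "type"])
# Financial director: by document type, then counterparty.
finance_view = virtual_paths(["type", "party"])
```

Both views are computed from the same six rows, so there is no 'ideal' directory path to argue about.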

Stephen

--
"Don't write down to your readers, the only people less intelligent than you can't read" - Sign on Newspaper Office Wall
Re:ext3 + sql by cens0r · 2003-09-05 05:20 · Score: 1

Mod this up please. I never have mod points when I need them.

--
Jack Valenti and Orrin Hatch will be first up against the wall when the revolution comes.
Re:ext3 + sql by curious.corn · 2003-09-05 06:38 · Score: 1

But it would rid us of library hell, missing symlinks, library path? Imagine having access to all libs by name, version num and dependencies without fiddling with multiple paths and environment variables. Of course an expert coder can ls grep sed ldd /usr/lib or the rpm db to have something similar but hey SELECT * FROM volume WHERE type='lib' AND dependancy='common_lib' looks cute to me. It's actually conceptually more sound rather than piping some command's string output to fish out the info. Writing an app to build lib hierarchy trees would become simple as hello world. What would become of rpm? A tar -xzvf package.tar.gz as root would pour the datafiles into the disk and update the relation tables, all that's left to do is delete superseded libs or bailout if deps query returns !=0.
Look, there's nothing special about these concepts, all that's needed is a sound extended attribute filesystem, a kernel daemon to refresh indexes transparently and user/system tools to be aware of the API and use it! If ld.so still checks ld.so.conf the whole exercise would be useless. Would it be difficult to rip the SQL parser out of Postgres and plug it into linux?
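
For the curious, a sketch of the kind of query I mean, in Python with sqlite3 standing in for the filesystem. The `volume` and `dependency` tables and the library names are invented for the example (I split dependencies into their own table instead of the single column in my SELECT above), so treat it as an assumption-laden toy, not a design.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE volume (name TEXT, version TEXT, type TEXT);
CREATE TABLE dependency (name TEXT, requires TEXT);
INSERT INTO volume VALUES
    ('common_lib', '1.2', 'lib'),
    ('libfoo', '2.0', 'lib'),
    ('libbar', '0.9', 'lib'),
    ('editor', '3.1', 'app');
INSERT INTO dependency VALUES
    ('libfoo', 'common_lib'),
    ('editor', 'libfoo'),
    ('editor', 'libmissing');
""")

# Which libraries depend on common_lib?
dependents = [
    name
    for (name,) in db.execute(
        """SELECT v.name FROM volume AS v
           JOIN dependency AS d ON d.name = v.name
           WHERE v.type = 'lib' AND d.requires = 'common_lib'
           ORDER BY v.name"""
    )
]

# The "bail out if the deps query returns != 0" check: requirements
# that no installed volume satisfies.
unmet = list(
    db.execute(
        """SELECT d.name, d.requires FROM dependency AS d
           LEFT JOIN volume AS v ON v.name = d.requires
           WHERE v.name IS NULL"""
    )
)
```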

--
Mi domando chi è il mandante di tutte le cazzate che faccio - Altan
Re:ext3 + sql by rtaylor · 2003-09-05 12:25 · Score: 1

Would it be difficult to rip the SQL parser out of Postgres and plug it into linux?

Well.. Probably not too bad actually. The storage manager in PostgreSQL was designed in such a way that it could be replaced (a WORM was in use at one point). It does have some bitrot, database structure is created directly, and some other items bypass the interface but those could be cleaned up.

However the Optimizer in PostgreSQL does make a number of assumptions about ease of access to data at specific times. A little work might be needed to teach it the new cost of accessing various components: cost of an index read, sequential scan, likelihood of data being in memory, sorts, etc.

It's not appropriate for this code to be in the kernel, as it can be quite memory intensive for complex queries -- but a thin storage manager in the kernel may be doable.

--
Rod Taylor

Re:Screenshots ? by tsetem · 2003-09-05 01:04 · Score: 2, Funny

> I know, I'm the first to look for screenshots, but antialiased filesystems are a bit too much, maybe.

Reminds me of an internal joke we have here. Our ClearCase file server was an SGI.

Why?

Because the filenames were rendered so much prettier than on a Linux or Sun box...

There needs to be a hierarchy !! by nicolas.bouthors · 2003-09-05 01:04 · Score: 1

How should I know what's in the storage in the first place if I cannot browse it ?

My .2 euros

Re:There needs to be a hierarchy !! by 110010001000 · 2003-09-05 01:17 · Score: 1

Why would browsing necessitate a hierarchical storage system? You can browse a db if you would like.
Re:There needs to be a hierarchy !! by Anonymous Coward · 2003-09-05 01:24 · Score: 0

> My .2 euros

That's 20 cents :-)
Re:There needs to be a hierarchy !! by galt2112 · 2003-09-05 01:37 · Score: 1

I think you meant "my .02 euros"

Limitations in the home edition by yerricde · 2003-09-05 01:04 · Score: 5, Informative

What then happens to SQL as a MS product? If its built in to every OS, why then would anyone buy it.

Remember how Windows XP Home and Pro editions can serve files to fewer than a dozen simultaneous clients? This is to boost sales of the IIS bundled with Windows 2000 Server and now Windows Server 2003. Microsoft SQL Server Home Edition will probably be limited.

--
Will I retire or break 10K?

Re:Limitations in the home edition by lp_bugman · 2003-09-05 04:55 · Score: 1

As a funny note, I remember the Pentium 75MHz running Novell 4.5 that my college used some time ago to serve HUNDREDS of users, more than a hundred at a time. They had about 5 or 6 servers for a community of about 4,000 students, and they used to hold up fairly well. NT is just not good enough as a server platform.

I don't think the solution is to replace the filesystem but to write BETTER networking code.

As a matter of fact I think adding an extra layer to the filesystem only means more overhead, and performance and CPU cycles lost!

The only thing you need is a good and fast btree system in your partition table.

I think the problem with NT is not really its ability to locate certain data but the way it transfers it to the requesting client.

--
BSD licensed software can't be stolen....
Re:Limitations in the home edition by TheCrazyFinn · 2003-09-05 05:36 · Score: 1

Smaller files too. I remember that too, and when the average user directory was under 5MB, even NT could handle 5000 users (Been There, done that).

My personal directory on our corporate fileserver is currently just under 700MB. I'm nowhere near the worst offender.

The issue is that usage growth has outstripped filesystem capabilities.

--
"You've got an invalid haircut" -Warren Zevon - Life'll Kill Ya
Re:Limitations in the home edition by leandrod · 2003-09-05 07:12 · Score: 1

> adding an extra layer to the filesystem

This adds nothing; it replaces the hierarchical store while presenting a compatible interface. It has been proven that SQL, if correctly modeled, usually gives performance comparable to application-specific databases such as a hierarchical filesystem; it also opens the way to a relational system that could significantly better SQL's performance while being more powerful and simpler.

That said, the question is how sane the database model will be... for something this fundamental you'd want 6NF.

> The only thing you need is a good and fast btree system

As for implementation, a bTree is OK. But here the goal is semantics: the hierarchical semantics is fundamentally broken, because the world can't be nicely organised that way. Instead, even if SQL is a corrupted form of the relational model, it allows a much more flexible, powerful, logical interface.

--
Leandro Guimarães Faria Corcete DUTRA
DA, DBA, SysAdmin, Data Modeller
GNU Project, Debian GNU/Lin

Database-based File system is awesome! by Iron+Monkey543 · 2003-09-05 01:05 · Score: 1

You mean I can just do a SQL script and like magic organize my files?! OMG! My desktop won't look like this anymore

Re:Database-based File system is awesome! by Anonymous Coward · 2003-09-05 01:14 · Score: 0

well right now it looks like this

The page cannot be displayed
There are too many people accessing the Web site at this time.

Please try the following:

* Click the Refresh button, or try again later.
* Open the 24.167.92.183:8585 home page, and then look for links to the information you want.

HTTP 403.9 - Access Forbidden: Too many users are connected
Internet Information Services

Technical Information (for support personnel)

* Background:
This error can occur if the Web server is busy and cannot process your request due to heavy traffic.

* More information:
Microsoft Support
Re:Database-based File system is awesome! by Iron+Monkey543 · 2003-09-05 01:16 · Score: 1

blah IIS says that i have a limit of 10 connections and I can't change it it's hardcoded for some weird reason

Actually I think this was stolen from Star Trek... by zinkem · 2003-09-05 01:05 · Score: 0

You know, standard sci-fi talk to your computer in normal english and get results kinda crap :)

--
I can't think of a good sig...

AS400 did this 20 years ago: by +mikepb78 · 2003-09-05 01:05 · Score: 5, Informative

The filesystem on AS400 is actually a DB2 database and it works quite well

Re:AS400 did this 20 years ago: by phfpht · 2003-09-05 01:18 · Score: 1

But even on the 400 most file operations are strictly hierarchical. /Root/QIBM/UserData/some/additional/path/to/file.ext or /Root/home/username/whatever or some library list or "files"
Re:AS400 did this 20 years ago: by EarthTone · 2003-09-05 05:02 · Score: 2, Insightful

Well sure, but it hasn't successfully been done on a desktop OS yet.
Re:AS400 did this 20 years ago: by ameoba · 2003-09-05 06:30 · Score: 1

Too bad it hasn't done anything else right.

--
my sig's at the bottom of the page.
Re:AS400 did this 20 years ago: by mschoolbus · 2003-09-05 06:59 · Score: 1

Show the full respect of the AS400... Just too bad for COBOL and RPG....

Re:Ahead of the game. by Serapth · 2003-09-05 01:06 · Score: 2, Interesting

To my understanding, the delay in Longhorn's release is a result of the Trustworthy Computing initiative...

This, IMHO, is a good thing. The big difference between MS and Open Source on something like this... in Open Source land, you can often see progress from day one... no matter how unstable it is. With MS, you won't see anything until the whole product is done... Not saying one is better than the other, but...

Natural language interface? Hmm... by Viol8 · 2003-09-05 01:07 · Score: 1

Not looking at screenshot 15 it isn't!

"select DISTINCT(recordid) from AttrSoup...."

Well call me old fashioned, but in my day we called that SQL. Why do I get the feeling this is just yet another database dressed up to look like it's providing some always-wanted-but-until-now-folks-it-just-wasnt-possible-but-with-new-WizzoFS- etc etc

Call me a cynic but I've seen it all before. Besides, databases are inefficient for manipulating filesystems at a low level, so expect your PC to crawl if you use this on a regular basis.

Re:Natural language interface? Hmm... by Androgynous+Coward · 2003-09-05 01:28 · Score: 0

I've worked with one VFS filesystem through a CMS application which was powered by Sybase SQLAnywhere. There is a definite performance hit when the directory hierarchy contains thousands of entries. I concur that it has its applications but I would not want nor need my entire system to be handled in this fashion. Mind you, performance is probably improved from the implementation I am referring to so maybe I am just being a cynic.
Re:Natural language interface? Hmm... by tolan-b · 2003-09-05 01:37 · Score: 1, Informative

silly billy.

that's the sql generated by the natural language processor.
Re:Natural language interface? Hmm... by rtaylor · 2003-09-05 13:10 · Score: 1

You get it back when things get bigger though.

Large corporate filestores spend more time doing hierarchical searches looking for related information than actually retrieving files.

Many are database oriented simply because finding the information is orders of magnitude faster, even though there is a small penalty for the actual save.

Aside from that, who hasn't wanted to say they have a 5TB database installation?

--
Rod Taylor

BeFS hello?? by OmniVector · 2003-09-05 01:07 · Score: 1

It's about time we started playing catchup with BeOS's filesystem. Though this seems more user-land when a function like this (file systems) should be more kernel-land.

This is essentially what Longhorn's tacked-on SQL extensions are going to provide, and I had no idea there was ongoing progress to have this functional in *nix so soon! By the time 2005 rolls around, I have a feeling this will be a lot further along than Microsoft's implementation.

*cracks whip* On zerocool, on uberh4ck3r, on coding monkey! We're catching up :)

--
- tristan

Re:BeFS hello?? by Anonymous Coward · 2003-09-05 01:16 · Score: 0

Hello BeFS how are you today? I like what you did with your hair, pity about the poor business decisions

filesystem is a database by 1s44c · 2003-09-05 01:07 · Score: 0, Flamebait

So it's about that time again. The old database is a filesystem is a database thing. To people that have not heard of this before this is not a new idea. OS/390 has been doing this for a very long time.

What this world needs is a really big injection of original thought, not a rehash of every idea that gets old enough for most people to forget it the first time around.

Re:filesystem is a database by Viol8 · 2003-09-05 01:12 · Score: 3, Insightful

"What this world needs is a really big injection of orginal thought"

They are original ideas, they just don't make it into the PC world where MS dominates. MS come up with as many original ideas as McDonalds, and since all KDE & Gnome (and frankly most open source projects) are doing is playing catchup with MS, originality is never going to be a prime concern.
Re:filesystem is a database by salesgeek · 2003-09-05 01:22 · Score: 1

They are original ideas, they just don't make it into the PC world where MS dominates. MS come up with as many original ideas as McDonalds, and since all KDE & Gnome (and frankly most open source projects) are doing is playing catchup with MS, originality is never going to be a prime concern.

Well said. Invent and innovate, don't just duplicate. I would love to see MS complain, "Those OSS guys have extended the ____ (fill in MS Standard) standard to do _____, ____ and _____. We just can't keep up with them."

--
-- $G
Re:filesystem is a database by 1s44c · 2003-09-05 01:24 · Score: 1

"...are doing is playing catchup with MS then originality is never going to be a prime concern."

I don't think most open source is playing catch up with that third rate vendor you mention. Open source software is of higher quality and far more compatible with everything.

A tiny amount of originality can go a huge way though. It can make the difference between a drone army of tens of thousands of average to great programmers producing crap ( microsoft ) and a hundred or so below average to great coders producing quality stuff ( open source. )
Re:filesystem is a database by Anonymous Coward · 2003-09-05 01:44 · Score: 0

"...not a rehash of every idea that gets old enough for most people to forget it the first time around."
Therein lies the rub, most people who are really interested in pushing OSS are youngsters. That's not to say there aren't some grey-beards who contribute, but the young bucks are the ones with the time and energy to pick up a project and run with it. So they end up not knowing history and not having time to learn history. Here's a question: Is it wrong that most CS curricula have absolutely no history component? For example, new guy in the office just finished MS in CS, wondered if those books on my shelf by that Knuth guy he'd never heard of were any good. The current state of CS education is focused entirely on what is new and hip. It's akin to what a MS/PhD level physics program would be if all classical mechanics and E&M were eliminated and only quantum field theory were taught. Or a history degree covering 1800-2000 but without any requirement for even a basic overview course of the Reformation.
Re:filesystem is a database by Viol8 · 2003-09-05 01:59 · Score: 1

"I don't think most open source is playing catch up with that third rate vendor you mention"

I'm sorry, but it is. Look at KDE, what does it remind you of? My god, couldn't they even have come up with an original way of laying out their logo in the start menu?? What about Ximian? Looks like Outlook to me. OpenOffice? Well, the clue's in the name. I'm sure there are less well known open source projects that are doing something original but let's face it, the majority are either just copying MS stuff or some other commercial product by another company. Considering the amount of flak MS gets from the OSS community, that seems like a large dose of hypocrisy to me. And I'm not a Windows user at all, I use Linux & FreeBSD at home, but I'm really sick of just seeing copycat stuff, I want something original. Even my favourite window manager AfterStep is just a clone of something Steve Jobs designed in 1985!
Re:filesystem is a database by Xrikcus · 2003-09-05 02:22 · Score: 1

Quality, incomplete stuff, to MS's slightly lower quality but much more polished stuff that you don't find missing half the features you want.

Openoffice, for example, is still missing adequate drawing tools.

on the other hand as soon as I can work out why my wireless network card locks up linux on my notebook, I'll be not bothering with windows at all.
Re:filesystem is a database by 1s44c · 2003-09-05 02:39 · Score: 1

Openoffice, for example, is still missing adequate drawing tools.

GIMP?

Just because MS puts drawing tools into an office suite doesn't make it a good idea. The two things should be separate.
Re:filesystem is a database by Xrikcus · 2003-09-05 03:03 · Score: 1

Debatable. I don't count GIMP because it's not for simple manipulable vector graphics.

if you can suggest a good one I'll take a look though, as it is I may try running flash under wine and see if that works.

Let's say I want an arrow between two paragraphs... GIMP'll be useful for that.
Re:filesystem is a database by Xrikcus · 2003-09-05 03:08 · Score: 1

(and wouldn't OO's "Draw" program have been a better suggestion than GIMP anyway?)

It was just a random example anyway, and a fairly poor one as it does have drawing tools built in of a sort.

The point remains that there are many, many open source projects that get started, but never really complete, think about the number that are still in pre 1 releases.

Here I am arguing and it's hardware support that's restricting my use of Linux at the moment, which isn't really the kernel maintainers' fault, it'd be nice if hardware manufacturers sorted the drivers.
Re:filesystem is a database by Anonymous Coward · 2003-09-05 10:33 · Score: 0

No one gives a shit if an idea is original. What matters is if it is good.

Dubious contents in the filesystem? by Anonymous Coward · 2003-09-05 01:07 · Score: 0

Check the screenshots: How exactly do you get all those movies on disk without doing something "illegal"?

I18n? by MSBob · 2003-09-05 01:07 · Score: 1

This is all great but how is this project going to work with languages other than English?

--
Your pizza just the way you ought to have it.

Re:I18n? by tolan-b · 2003-09-05 01:40 · Score: 1

with a new dictionary and language ruleset?
Re:I18n? by zanderredux · 2003-09-05 06:05 · Score: 1

en: SELECT users WHERE clue > 0 ORDER BY clue ASC;
pt-br: SELECIONE usuarios QUE TENHAM pista > 0 ORDENE POR pista > 0 ASC;
awww... looks awful!

Storage by SuperBanana · 2003-09-05 01:08 · Score: 1

OSNews is reporting on Storage, an innovative project which aims to replace the traditional hierarchical filesystems with a new document store which is database-based (PostgreSQL).

I have a new way to get between point A and B. I call this product "Car". To fuel it, I've started a fuel company called "Gas". Of course, people will abuse "Car", so I've also created something to keep them in line, called "Fuzz". Fuzz will be powered by what I call "Donut".

(hey, it's Friday, gimme a break :-)

--
Please help metamoderate.

Re:Storage by Anonymous Coward · 2003-09-05 02:20 · Score: 0

(hey, it's Friday, gimme a break :-)

OK, here's a break. Now give me that donut.

Re:Screenshots ? by Xenoproctologist · 2003-09-05 01:10 · Score: 0, Funny

Antialiased filesystem? Is this eight bits to describe 128 shades of "0" and 128 shades of "1", or is it one binary bit plus an eight-bit alpha channel?

Damn, I knew this hard drive space "age of plenty" was too good to last. Curse you, Moore's Law, for taunting me so!

Patents? by SynKKnyS · 2003-09-05 01:10 · Score: 1

Is this method of finding and storing files patented?

Why link directly againsat libpq? by Wdomburg · 2003-09-05 01:10 · Score: 2, Insightful

It seems silly to tie the implementation to a single database, when gnome-db is fast approaching 1.0.

Re:Why link directly againsat libpq? by Anonymous Coward · 2003-09-05 01:34 · Score: 0

The abstraction to other databases (Reiser4) can occur at a later stage.
Re:Why link directly againsat libpq? by rtaylor · 2003-09-05 01:41 · Score: 3, Informative

Their feature list says it will work with Oracle and other SQL99 compliant databases, so I would assume it isn't linked against libpq directly.

--
Rod Taylor
Re:Why link directly againsat libpq? by Wdomburg · 2003-09-05 02:31 · Score: 1

The feature list says that the query set is restricted to SQL99 and Oracle *could* be used. However, if you download the code they point to in the CVS repository, it links to libpq, and not a generic library like libgda. :)

I suppose it's possible they're working on a generic backend, but a grep of the repository makes no mention of anything but Postgres that I could see.
Re:Why link directly againsat libpq? by rtaylor · 2003-09-05 12:17 · Score: 1

I see, thanks for the correction. I'm waiting for it to appear in the ports tree before I install it ;)

--
Rod Taylor

Re:Backups by +mikepb78 · 2003-09-05 01:11 · Score: 1

Then use db2/oracle/mysql etc. Porting should not be that hard.

Metadata by eiggen · 2003-09-05 01:12 · Score: 1

This idea really is interesting!

However, I wonder how it will take metadata from files and put it into the DB directly... like for ID3 tags for example...

Will there be intelligent plugins a la Reiser4 that will transparently do this?
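
For what it's worth, here is a rough sketch of what such a metadata plugin might do. It assumes the simple ID3v1 layout (a 128-byte trailer starting with "TAG") and a made-up `attrs` table; Storage's real schema and any Reiser4 plugin interface are not shown here, and `make_fake_mp3` exists only so the example has something to read.

```python
import sqlite3


def parse_id3v1(data: bytes):
    """Return the ID3v1 text fields found in the last 128 bytes, or None."""
    tag = data[-128:]
    if len(tag) < 128 or not tag.startswith(b"TAG"):
        return None

    def text(raw: bytes) -> str:
        # Fields are fixed width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
    }


def index_file(db, path: str, data: bytes) -> None:
    """Copy whatever tag fields exist into the (hypothetical) attrs table."""
    meta = parse_id3v1(data)
    if meta is None:
        return
    db.executemany(
        "INSERT INTO attrs (path, key, value) VALUES (?, ?, ?)",
        [(path, key, value) for key, value in meta.items()],
    )


def make_fake_mp3(title: str, artist: str, album: str, year: str) -> bytes:
    """Build dummy bytes ending in a valid ID3v1 trailer (demo only)."""
    def pad(s: str, n: int) -> bytes:
        return s.encode("latin-1")[:n].ljust(n, b"\x00")

    tag = (b"TAG" + pad(title, 30) + pad(artist, 30) + pad(album, 30)
           + pad(year, 4) + pad("", 30) + b"\x00")
    return b"\xff\xfb" + b"\x00" * 64 + tag


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE attrs (path TEXT, key TEXT, value TEXT)")
index_file(db, "/music/a.mp3", make_fake_mp3("Song", "Artist", "Album", "2003"))
index_file(db, "/docs/readme.txt", b"not an mp3")  # silently skipped

by_artist = db.execute(
    "SELECT path FROM attrs WHERE key = 'artist' AND value = ?", ("Artist",)
).fetchall()
```

The open question from the comment stays open: something still has to call `index_file` whenever a file changes, and that is where the "transparent" part gets expensive.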

Re:Metadata by Jesus_666 · 2003-09-05 02:05 · Score: 1

You will probably get to choose between...

...a plugin that scans all files you open for (updated) metadata, but comes at the small cost of making all FS accesses 500% slower...
...a plugin that only makes the FS 20% slower but doesn't care about metadata at all...
...a plugin that instantly produces a segfault and screws up the database and...
...several thousand promising eternal alpha versions.

--
USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)

Re:Backups by joostje · 2003-09-05 01:12 · Score: 1

You still cannot reliably backup PostgreSQL databases

What's wrong with pg_dump(1)?

Re:Backups by realnowhereman · 2003-09-05 01:13 · Score: 1

I'm hoping that you are a bit out of date, pg_dumpall works fine for me. Since about 7.1 it's done large objects as well. I'm a bit worried that it's not working fine for me and I'm living an illusion. What exactly does it not back up reliably?

--
Carpe Daemon

i think-Oracle. by Anonymous Coward · 2003-09-05 01:14 · Score: 0

"But I think that the result will probably be less resilient to damage and result in an increased possibilty of losing your data or finding them corrupted."

Well someone better phone up Oracle and let them know. Oh the horror.

Re:i think-Oracle. by Tirel · 2003-09-05 01:31 · Score: 2, Funny

listen here bud, we're just trying to whore some honest karma, don't go at it with your "facts" and shit.

thanks for understanding.
Re:i think-Oracle. by rwise2112 · 2003-09-05 07:21 · Score: 1

It's funny, cause it's true!!!!

--

"For every expert, there is an equal and opposite expert"

ok, by noselasd · 2003-09-05 01:14 · Score: 1

wtf is this going to help? People are as much boneheads in putting the right attributes/keywords on a "file" as they are at categorizing them in a hierarchy (traditional filesystems). Why is this so great?

Re:ok, by Jesus_666 · 2003-09-05 02:08 · Score: 1

Because the marketing guys say so.
Everyone knows that marketing people always say the truth and know everything.

But of course, no one is going to use this stuff, since everyone will migrate to Windows once Longhorn comes out.
I know it's true, a Microsoft marketing guy told me...

--
USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)

I spot a pirate! by meshko · 2003-09-05 01:14 · Score: 1

U2 albums, Spielberg movies, Nicole (Kidman?) movies... I think someone is in trouble!

As for the idea of database file systems -- I don't think we need this yet. Both file systems and database research should concentrate on distributed/mobile aspects (even Coda, AFS and friends are not yet widely accepted/ready for prime time).

--
I passed the Turing test.

Everyone is getting into the act? by VernonNemitz · 2003-09-05 01:15 · Score: 1

Old Slashdot articles:
one guy
How-To
OpenBeOS

woot woot by VAXGeek · 2003-09-05 01:16 · Score: 1

I hope they add a DBI soon so I can store all my files in a CSV file.

--
this sig limit is too small to put anything good h

Re:woot woot by forsetti · 2003-09-05 01:19 · Score: 1

But where would you store your CVS file? ;)

--
10b||~10b -- aah, what a question!

ReiserFS future by realnowhereman · 2003-09-05 01:16 · Score: 1

I personally like the way reiserfs is roadmapped. If I understand it correctly it will be a superset of existing filesystems. That is /home/myname/documents/report/2003/ will still work, but then so will /documents/reports/2003/myname; and so on.

Multiple paths to the same object seems perfect to me.
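
A toy sketch of the idea, not of how Reiser4 actually does it: treat a "path" as an unordered set of labels, so any ordering of the same components resolves to the same object. The `TagStore` class and the file names are made up for illustration.

```python
class TagStore:
    """Toy object store where a path is just an unordered set of labels."""

    def __init__(self):
        self._objects = []  # list of (frozenset of labels, payload)

    def add(self, labels, payload):
        self._objects.append((frozenset(labels), payload))

    def lookup(self, path):
        # Split the path into components and ignore their order entirely.
        wanted = frozenset(part for part in path.split("/") if part)
        return [payload for labels, payload in self._objects if wanted <= labels]


store = TagStore()
store.add({"home", "myname", "documents", "report", "2003"}, "q3-report.txt")
store.add({"home", "myname", "documents", "report", "2002"}, "old-report.txt")

classic = store.lookup("/home/myname/documents/report/2003/")
reordered = store.lookup("/documents/report/2003/myname")
broader = store.lookup("/documents/report")
```

Both `classic` and `reordered` name the same single object, while the shorter path in `broader` acts like a query that returns everything carrying those labels.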

--
Carpe Daemon

Re:ReiserFS future by Sphere1952 · 2003-09-05 01:41 · Score: 1

You mean sort of like using ln?

--
Big Brother Bush is doubleplus ungood.
Re:ReiserFS future by realnowhereman · 2003-09-05 01:48 · Score: 1

No, not like using ln. Making a symbolic link requires that I've already anticipated that someone needs a file in a different path. This, in effect, removes the tree structure and makes every filesystem object just a series of interconnected nodes. Have a read of Hans Reiser's whitepaper on the subject. I was most impressed.

--
Carpe Daemon

leisure suit larry reborn? by steffl · 2003-09-05 01:18 · Score: 1, Funny

great, so now navigating the data will be like playing leisure suit larry! at least that's what I was reminded of when I read the queries in the screenshots...

erik

--
...all excited, don't know why...

UPDATE overlords by Alien+Being · 2003-09-05 01:18 · Score: 0

SET welcome = TRUE
WHERE name = 'PostgreSQL';

Re:UPDATE overlords by Anonymous Coward · 2003-09-05 01:43 · Score: 0

don't forget the great improvement in decreasing customer support needs

select user
from user_tbl
where username = "__entered value__"
and not user = "dumbass"

Re:Backups by Florian+Weimer · 2003-09-05 01:18 · Score: 1

What exactly does it not back up reliably?

It sometimes dumps database objects in the wrong order, and restore fails as a consequence.
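
To illustrate why ordering matters (this is only a sketch of the failure mode, not of how pg_dump works internally): if a dump lists a view before the table it selects from, replaying the dump fails. Sorting objects by their dependencies avoids that. The object names below are hypothetical, and `graphlib` needs Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Each object maps to the set of objects that must already exist
# before it can be restored (hypothetical schema).
depends_on = {
    "table users": set(),
    "table orders": {"table users"},
    "view active_users": {"table users"},
    "function order_total": {"table orders"},
}

# static_order() yields every object after all of its prerequisites.
restore_order = list(TopologicalSorter(depends_on).static_order())
```

Any order this produces can be replayed top to bottom; a dump that emits `view active_users` first cannot.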

Re:Ahead of the game. by smittyoneeach · 2003-09-05 01:20 · Score: 1

No, Longhorn is about putting SQueaL Server in between the user and the file system to cock-block any attempts at mounting the filesystem without paying a vig to Redmond.
Longhorn is aptly named. Brings to mind the ithyphallic eidolon from the Schrödinger's Cat Trilogy by Wilson.
Frank Zappa's advice comes to mind: "Keep it greasy, so it'll go down easy".

--
Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear

"Damn, I left that on my roommate's desk" by kfg · 2003-09-05 01:21 · Score: 5, Insightful

"Well, where do you go?"

"Stanford."

"No problemo, I'm heading that way later and I can grab it for you. What's your room?"

"Dorm 5, Room 109. It's the desk on the left."

( We didn't bother to state earth.us because we were already inside those directories)

Yes, yes we do think hierarchically. Most of the history of human thought has been fitting everything we can lay our filthy little brain cells on into hierarchies, whether they wish to fit into them or not. It's intuitive.

As for natural language didn't we learn about that with COBOL? Natural language only speeds the learning process slightly ( the majority of the learning still lying in the realm of understanding the basic concepts involved), but then becomes a pain in the ass forever afterward.

Looking at the screenshots it's also ugly as all sin. The physicist in me can't help but feel that a model that ugly can't possibly be correct.

I think this makes just about as much sense as using a document preparation language (XML) as the basis of a database.

Which is to say, none.

KFG

Re:"Damn, I left that on my roommate's desk" by Kingpin · 2003-09-05 01:46 · Score: 1

I think this makes just about as much sense as using a document preparation language (XML) as the basis of a database.

Your picture is turning upside down. It's the forecast that there will be a large amount of semi-structured data (XML) that's resulted in lots of the DB initiatives for efficient storage mechanisms for this kind of data. It's inefficient to store this type of data in a relational DB. Consult http://www.rpbourret.com/xml/XMLAndDatabases.htm for an introduction to the topic.

--
Unable to read configuration file '/bigassraid/htdig//conf/14229.conf'
Geocrawler error message.
Re:"Damn, I left that on my roommate's desk" by lawpoop · 2003-09-05 01:48 · Score: 3, Interesting

Human beings can and do think hierarchically, but that doesn't mean it's the be-all and end-all of organization.
I think the examples he shows are pretty good. In my mp3 collection, I would like to see "All bluegrass songs" or "all remixes of Parliament Funkadelic stuff". How do you propose to do this in a hierarchical filesystem? Most of my bluegrass artists are under 'bluegrass', but then there are some bluegrass songs that were in non-bluegrass artists and albums folders.
In my workplace we are having the same problems. On our shared folders, we have shipping documents in each client's folder. But then, what if we want to see all shipping documents from a particular vendor? Currently, we would have to go into each customer's folder (which are also broken down by year archives) and grab all documents which *might* be from said supplier, and then open each one, and look to see, because the supplier name isn't in the filename. It's horribly broken, which is why we are moving to a database storage system for such documents.
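
Roughly what the database version of that lookup could look like, as a minimal sketch using SQLite from Python. The `shipping_docs` table, its columns, and the sample rows are all invented for illustration; the real system would have its own schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    """CREATE TABLE shipping_docs (
           id INTEGER PRIMARY KEY,
           client TEXT NOT NULL,
           vendor TEXT NOT NULL,
           year INTEGER NOT NULL,
           filename TEXT NOT NULL
       )"""
)
db.executemany(
    "INSERT INTO shipping_docs (client, vendor, year, filename) VALUES (?, ?, ?, ?)",
    [
        ("Acme", "FastFreight", 2002, "bol-0117.pdf"),
        ("Acme", "SlowBoat", 2003, "bol-0241.pdf"),
        ("Globex", "FastFreight", 2003, "bol-0302.pdf"),
    ],
)


def docs_from_vendor(db, vendor):
    """All shipping documents from one vendor, across every client and year."""
    return db.execute(
        "SELECT client, year, filename FROM shipping_docs "
        "WHERE vendor = ? ORDER BY client, year",
        (vendor,),
    ).fetchall()


fastfreight_docs = docs_from_vendor(db, "FastFreight")
```

The per-client view is the same query with `client` in the WHERE clause instead, so neither way of slicing the documents has to be the "real" folder layout.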

--
Computers are useless. They can only give you answers.
-- Pablo Picasso
Re:"Damn, I left that on my roommate's desk" by kfg · 2003-09-05 01:58 · Score: 1

See http://www.dbdebunk.com/page/page/606457.htm for the refutation, but I warn you, you'll need a good grasp of set theory and mathematical logic to truly understand it. I don't mean to be snide or demeaning ( I'm not above it, but in this case I really don't), but if you don't have the mathematical background you aren't in a position to understand the field.

Storage at least has the advantage of being semi-relational.

KFG
Re:"Damn, I left that on my roommate's desk" by sw155kn1f3 · 2003-09-05 02:00 · Score: 1

you could just use symbolic links
and make a simple UI for that - that will work just fine (look at how sysV startup scripts in redhat get linked)
moving to a database will be a pain, and for a while you won't get anything useable, no doubt

--
- Arwen, I'm your father, Agent Smith.
- Well, you're just Smith, but my father is Aerosmith!
Re:"Damn, I left that on my roommate's desk" by Chep · 2003-09-05 02:01 · Score: 1

find, xargs, egrep and if need be one or two text extraction tools adapted to the file formats at hand (if all else fails, strings).
Re:"Damn, I left that on my roommate's desk" by tesmako · 2003-09-05 02:05 · Score: 1

The only thing COBOL showed about natural language is that naming things in a computer language after natural language words is ultimately just confusing since the language is nowhere closer to actually accepting natural language because of that. There is a lot of difference between accepting natural languages and fooling people into thinking you do and then failing to interpret what they say.
Re:"Damn, I left that on my roommate's desk" by kfg · 2003-09-05 02:14 · Score: 1

Yes, I understand that issue fully. I'm reminded of the time NASA was looking for an island in the middle of the South Pacific ( you know, that part that's nothing but blue when you look at it on a globe) that would be visible to the eye from the space shuttle. They pored over satellite photographs and were coming up empty.

Eventually a librarian heard about the problem and referred them to a book on . . . birds.

She remembered reading a description of a completely unique island in a bird book.

NASA went back to the photos and sure enough, there it was.

The thing is that relationship existed in someone's mind, not in a database. It seems unlikely that if a database file system had existed at the time that they would have found it there either. A database is still completely dependent upon the relationships being built by a person and the computer can't make the leap of insight to build new and unsuspected relationships.

I myself use simple databases to keep track of which songs can be found on which albums by which artists ( I'm a folk singer so I have dozens of recordings of various versions of the same song by different artists, so I'm painfully aware of the issue), but I don't use this to replace my file system, I use it to supplement it.

And I only use my computer's system of organization to supplement the one I carry in my brain.

When I just want to listen to some Steeleye Span I can still just go to home/music/steeleye_span and have at it.

KFG
Re:"Damn, I left that on my roommate's desk" by Alien+Being · 2003-09-05 02:55 · Score: 1

I'm not ready to let go of a hierarchical organization to the fs, but I am anxious to see it extended.

Let it be relational, just make it so the metadata for every file is required to have a "pathname" attribute. If you don't want to assign a pathname explicitly, then it should default to something analogous to $PWD/inode_number.
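
A minimal sketch of that rule, assuming a made-up two-table layout in SQLite: every file row must carry a pathname, and when the caller doesn't supply one it falls back to the working directory plus the inode number. The table names and the `add_file` helper are hypothetical.

```python
import posixpath
import sqlite3

SCHEMA = """
CREATE TABLE files (
    inode    INTEGER PRIMARY KEY,
    pathname TEXT NOT NULL UNIQUE      -- the one mandatory attribute
);
CREATE TABLE attrs (
    inode INTEGER NOT NULL REFERENCES files(inode),
    key   TEXT NOT NULL,
    value TEXT NOT NULL
);
"""


def open_store():
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def add_file(db, inode, cwd, pathname=None, **attrs):
    """Register a file; default its pathname to <cwd>/<inode> if none is given."""
    if pathname is None:
        pathname = posixpath.join(cwd, str(inode))
    db.execute("INSERT INTO files (inode, pathname) VALUES (?, ?)", (inode, pathname))
    db.executemany(
        "INSERT INTO attrs (inode, key, value) VALUES (?, ?, ?)",
        [(inode, key, str(value)) for key, value in attrs.items()],
    )
    return pathname


store = open_store()
default_path = add_file(store, 4711, "/home/alien", artist="Steeleye Span")
explicit_path = add_file(store, 4712, "/home/alien", pathname="/home/alien/notes.txt")
```

With the pathname guaranteed, old tools that only understand paths keep working, and everything else in `attrs` is free to be queried relationally.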
Re:"Damn, I left that on my roommate's desk" by cbovasso · 2003-09-05 03:16 · Score: 1

If I were to think of how I sort/categorize objects/data in my head I would use the example of putting clothes away in my drawers. I tend to put clothes together with similarities. I don't have a chest of t-shirts and then each drawer has a color or style of t-shirts. I put all my white, short sleeved tshirts together. They could be next to my colored polo shirts, or my pants. So the store isn't how I find things, its the "database" in my head.

I dont know where I am going with this but its my two cents on how *we* innately relate to things.

--
I ask for a car and I get a computer. How's about that for being born under a bad .sig?
Re:"Damn, I left that on my roommate's desk" by Grizzlysmit · 2003-09-05 03:23 · Score: 1

I'd have to go along with you, I think this has some utility, and could lead to some nice add-ons to the desktop, but basically it's a dead end. A saner approach would be to incorporate alternate indexing/navigating schemes into the filesystem, it sounds dangerously like they're going at this rather too fast too, they should prove it works first, else this could kill gnome :-( .

--
in my life God comes first.... but Linux is pretty high after that :-D
Francis Smit
Re:"Damn, I left that on my roommate's desk" by nosferatu-man · 2003-09-05 05:18 · Score: 1

No reason you couldn't have your "find" as a command-line tool that can translate between the current (miserable) semantics and the underlying representation. Not hard at all.

Relying on tools crafted to the lowest common denominator (the Unix file "system") is so thirty years ago. Nothing about the organization of the computer imposes this stilted model with all the attendant brain damage it implies; we can and should move past it.

'jfb

--
To spur "enterprise Linux," Big Bang, the distributed two-phase commit.
Re:"Damn, I left that on my roommate's desk" by ediron2 · 2003-09-05 05:21 · Score: 1
Yes, yes we do think hierarchically. Most of the history of human thought has been fitting everything we can lay our filthy little brain cells on into hierarchies, whether they wish to fit into them or not. It's intuitive.
...
I think this makes just about as much sense as using a document preparation language (XML) as the basis of a database.
Which is to say, none.

Therefore, we shouldn't try to stretch?!
1 - I remember reading of certain south-pacific islanders that had incredible navigational skills for island hopping, but that couldn't recognize 2-d representations of 3-d objects. A picture of a cube is baffling to them. Similarly, Object-orientedness is something I have to teach people via analogy. Once taught, people see object classes everywhere.
2 - Everything I learned in physics seems to have focussed around unlearning common misconceptions and overly vague and inaccurate common-man understandings. Once I treated physics like semantics, forcing myself to learn the one-true-definition for things rather than assuming I knew it, classes were a breeze.
3 - Mimsy Were the Borogoves. Henry Kuttner (1943 as L. Padgett). Argument via sci-fi is a lame thing, but as world knowledge advances, even the dumbest among us learns stuff that *nobody* knew just decades before. As a teacher, I use analogies (like trees) to help people learn. But it is easy to create an analogy for an object oriented dataspace (Imagine a jumble of little chunks of stuff and tools that let you pick out JUST THE STUFF YOU WANT). In fact, google pretty much removed my need to have the internet behave as a giant set of tree structures. Libraries have been quite successful for a long time on a cross-index mechanism (and reference librarians!). I absolutely LOVE wiki-webs, because they let data morph and cross-link itself as users see fit.
Face it: data doesn't like to be in trees. We like to try to shoe-horn it into trees. Then we end up with overlapping trees because of multiple interpretations of data organization. Hell, my email doesn't even fit nicely into trees. In an object-oriented world, I'd enjoy mailspaces that I share with others in my project.
Recast your conversation: Damn, I left that on my PC.
- Well, give me your IP and I'll download it.
- Well, just IM your roommate and we'll work out how to get it.
- Check the signup sheet and see if anyone else is coming from there.
- Next time, use Freenet. Then we could just grab it off there.
- Anyone in your dorm got a scanner?
- What's the GPS coordinates?
- Man, I can't even find Stanford, and then I'll need a map to find your dorm, and without a student ID and your key I'll never get to your desk.
- ...etc
Re:"Damn, I left that on my roommate's desk" by matman · 2003-09-05 06:15 · Score: 1

We do use hierarchies to organize some concepts, but there's a lot more to it. There are aspects of context, relation, inheritance, etc that are not expressible via simple hierarchy.

Check out the "protege" project - it's an ontology editor. Ontology is a related problem. http://protege.stanford.edu/

There are a lot of problems in file system design, many of which are unsolved. The field is still really very young.
Re:"Damn, I left that on my roommate's desk" by Kingpin · 2003-09-05 10:33 · Score: 1

Your URL points to nothing informative on the subject. Don't worry, I have the appropriate background.

--
Unable to read configuration file '/bigassraid/htdig//conf/14229.conf'
Geocrawler error message.
Re:"Damn, I left that on my roommate's desk" by CTho9305 · 2003-09-05 16:10 · Score: 1

I think that some things, though, would be better in a database - for example, music could be categorized better in a db. I think that in general, things that fit in more than one place are best kept in databases, but the rest fit better in hierarchies. Papers I write, for example, usually only belong in one place.

--
My server
Re:"Damn, I left that on my roommate's desk" by Tablizer · 2003-09-05 17:57 · Score: 1

Yes, yes we do think hierarchically. Most of the history of human thought has been fitting everything we can lay our filthy little brain cells on into hierarchies

"Naturally" is a bit of a stretch IMO. The alternative to trees is generally "sets", and there is some "naturalness" to sets also. However, people are not trained in sets the way that they are trained in hierarchies. People do catch on to hierarchies relatively quick, I agree. But trees have limits in ability. I don't know how the average office computer user would take to sets. It has not been tried on a large scale that I know of. Personally, I would love to see the END of tree-based file systems, but I can't speak for every computer user out there. Further, a home system can possibly get by with trees better than an office network. Trees run into problematic categorical limits after about 4-levels deep or more than a few thousand nodes.

--
Table-ized A.I.
Re:"Damn, I left that on my roommate's desk" by Shadowlore · 2003-09-05 19:46 · Score: 1

Yes, I understand that issue fully. I'm reminded of the time NASA was looking for an island in the middle of the South Pacific ( you know, that part that's nothing but blue when you look at it on a globe) that would be visible to the eye from the space shuttle. They pored over satellite photographs and were coming up empty.

Eventually a librarian heard about the problem and referred them to a book on . . . birds.

She remembered reading a description of a completely unique island in a bird book.

NASA went back to the photos and sure enough, there it was.

The thing is that relationship existed in someone's mind, not in a database. It seems unlikely that if a database file system had existed at the time that they would have found it there either. A database is still completely dependent upon the relationships being built by a person and the computer can't make the leap of insight to build new and unsuspected relationships.

"Show me all islands in the central part of the south pacific large enough to be visible from low earth orbit".

Can that be done based on data about the islands? Yes, all of that can be determined. We can mathematically determine relative size as viewed at that distance. We can use coordinates to determine islands that are at that size or larger in a given box or region.
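
As a sketch of that query (every number here is an assumption: the island rows are invented, 300 km stands in for "low earth orbit", and 10 arcminutes is an arbitrary threshold for "big enough to notice"), the size cutoff comes from the angle an object of width w subtends at distance h, theta = 2 * atan(w / 2h), and the region is a plain bounding box that ignores longitude wraparound.

```python
import math
import sqlite3


def min_visible_width_km(altitude_km, min_arcmin):
    """Smallest width that subtends at least min_arcmin when seen from altitude_km."""
    theta = math.radians(min_arcmin / 60.0)
    return 2.0 * altitude_km * math.tan(theta / 2.0)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE islands (name TEXT, lat REAL, lon REAL, width_km REAL)")
db.executemany(
    "INSERT INTO islands VALUES (?, ?, ?, ?)",
    [
        ("Islet A", -20.0, -150.0, 0.3),    # in the box, too small
        ("Island B", -15.0, -140.0, 5.0),   # in the box, big enough
        ("Island C", 40.0, -150.0, 12.0),   # big enough, wrong hemisphere
    ],
)


def visible_islands(db, lat_min, lat_max, lon_min, lon_max,
                    altitude_km=300.0, min_arcmin=10.0):
    """Names of islands inside the box that are at least the cutoff width."""
    cutoff = min_visible_width_km(altitude_km, min_arcmin)
    rows = db.execute(
        "SELECT name FROM islands "
        "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ? AND width_km >= ? "
        "ORDER BY name",
        (lat_min, lat_max, lon_min, lon_max, cutoff),
    )
    return [name for (name,) in rows]


candidates = visible_islands(db, -30.0, -5.0, -160.0, -120.0)
```

The catch kfg raised still applies in one respect: this only works if somebody recorded a width and coordinates for every island in the first place.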

So I must disagree and say that such a database/interface *could* have found that without needing to know about birds, given simply the sizes and locations of the islands and a query like the one above. Sometimes the answer lies not out of the box, but *in* the box. The trick is not to learn to think outside of the box, but in *both* places.
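
For what it's worth, here is a minimal sketch of that kind of query. Everything in it is an assumption made for illustration: the `islands` table, the three made-up rows, the bounding box, the ~300 km altitude and the 30 arcminute "obvious to the eye" threshold. It only shows that size plus coordinates are enough to answer the question.

```python
import math
import sqlite3

# Hypothetical schema and rows, for illustration only (not real islands).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE islands (name TEXT, lat REAL, lon REAL, width_km REAL)")
conn.executemany(
    "INSERT INTO islands VALUES (?, ?, ?, ?)",
    [
        ("Island A", -10.0, -150.0, 12.0),  # inside the box, large
        ("Island B", -12.5, -148.0, 0.05),  # inside the box, tiny
        ("Island C", 40.0, -30.0, 50.0),    # large, but outside the box
    ],
)


def min_visible_width_km(altitude_km, resolution_arcmin):
    """Smallest width that subtends the given angle when seen from directly above."""
    angle_rad = math.radians(resolution_arcmin / 60.0)
    return 2.0 * altitude_km * math.tan(angle_rad / 2.0)


# Assumed numbers: roughly 300 km up, and a generous 30 arcminutes so the
# island is easy to spot rather than barely resolvable.
threshold = min_visible_width_km(300.0, 30.0)

visible = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM islands "
        "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ? AND width_km >= ? "
        "ORDER BY name",
        (-20.0, 0.0, -160.0, -140.0, threshold),
    )
]
print(visible)
```

The catch the parent post points at still applies: someone has to have recorded the sizes and coordinates in the first place.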

--
My Suburban burns less gasoline than your Prius.

This is awesome! :) by Trolling4Dollars · 2003-09-05 01:21 · Score: 1

Quite a while back I made suggestions like this indicating that this is the direction that most of the computer industry is moving in. I am certain that I am not the only one that was thinking this way and this project proves it. The natural language feature is something that surpasses my original concept of predefined end-user datatypes (movies, music, documents, mail, etc...). The only drawback I can see to this kind of system is the amount of horsepower needed to run it. But as long as the requirements stay below those of M$, I think all will be well. :)

--
Un-news

Obligatory BeOS comment by The_Hun · 2003-09-05 01:22 · Score: 1, Redundant

And do not forget about BeOS, a pioneer of a database-like FS. There is also a BeFS for Linux.

--
Sig. under reconstruction.

About time by varjag · 2003-09-05 01:24 · Score: 1

Current filesystems are nothing more than hierarchical databases. While relatively straightforward to implement, hierarchical DBs have major drawbacks, e.g.

- Complex searches are slow;
- Integrity control is hard;
- There's no decent way to refer to an item in several distinct branches (hence the kludges like symlinks in filesystems).

The database world has been moving from hierarchical to relational DBMSes since the late 70's. It's about time for filesystems to catch up.
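
A minimal sketch of the third point, assuming a hypothetical three-table layout (`files`, `categories`, `file_categories`) that is not taken from any real filesystem: one file can sit in several "branches" at once through a many-to-many link, with no symlink involved.

```python
import sqlite3

# Hypothetical schema: names and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE categories (id INTEGER PRIMARY KEY, label TEXT UNIQUE NOT NULL);
    CREATE TABLE file_categories (
        file_id INTEGER REFERENCES files(id),
        category_id INTEGER REFERENCES categories(id),
        PRIMARY KEY (file_id, category_id)
    );
    INSERT INTO files VALUES (1, 'budget.xls'), (2, 'holiday.jpg');
    INSERT INTO categories VALUES (1, 'work'), (2, 'finance'), (3, 'photos');
    -- budget.xls belongs to both 'work' and 'finance'
    INSERT INTO file_categories VALUES (1, 1), (1, 2), (2, 3);
""")


def files_in(label):
    """Return the names of all files linked to the given category."""
    rows = conn.execute(
        "SELECT f.name FROM files f "
        "JOIN file_categories fc ON fc.file_id = f.id "
        "JOIN categories c ON c.id = fc.category_id "
        "WHERE c.label = ? ORDER BY f.name",
        (label,),
    )
    return [name for (name,) in rows]


print(files_in("work"), files_in("finance"))
```

The primary key on the link table is also a small taste of integrity control: the same file cannot be filed under the same category twice.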

--
Lisp is the Tengwar of programming languages.

Re:Ahead of the game. by mirko · 2003-09-05 01:25 · Score: 1

I guess you meant BFS, the BeOS filing system ?
I worked with it and it was really good.
The guy that wrote it also wrote a very good file system design and implementation manual which I can't find anymore on O'Reilly's web site...

--
Trolling using another account since 2005.

Random thought for the day... by fluxrad · 2003-09-05 01:26 · Score: 5, Funny

Am I the only one that isn't totally into the idea of "googling" data on my hard drive?

Granted, it's mostly pr0n on there, so it's almost the same thing, but still...

--
"It is seldom that liberty of any kind is lost all at once." -David Hume

Someone tell me why? by Anonymous Coward · 2003-09-05 01:27 · Score: 0

I imagine there is some beneficial reason for doing this; could someone explain? It seems like attributing all the information to each file in order to get an advantage in finding your files would be more work than just building a logical directory structure with many sub-directories and symlinks. (We are talking about a databased file system, and not just a database of file information, right?) Surely that can't be all too fast? Is it faster for me even if I know where I put my files?

Re:Backups by realnowhereman · 2003-09-05 01:27 · Score: 1

Got a link? What version did you experience that in? I have had a problem with restores before, but I don't think it was that. The problem I had was that the extra stuff that gets added to template1 gets backed up as well. That means that when you try to do a restore you get loads of errors saying that things already exist. The solution is to use template0 as the basis for the target of the restore, since that really is the empty database.

--
Carpe Daemon

Ooops I'm back to front! - db/fs v. fs/db by SomeBloke · 2003-09-05 01:27 · Score: 1

I'm developing an embedded platform using the flash filesystem and I'm implementing the database on the filesystem. Hopefully, no one ports a layer such as this on top of the database to provide a filesystem ...

I feel a "GNU is Not Unix" coming on!

Not SQL Server Directly by Watts · 2003-09-05 01:28 · Score: 5, Informative

Having SQL Server as the underlying filesystem technology doesn't mean that you're going to be running SQL Server directly. I mean, if you currently use NTFS, there isn't an NTFS daemon that the kernel connects to when it does filesystem transactions. Just like every other filesystem, the support will be built into the kernel. Instead of writing data as NTFS does, the structure will look a lot more like how SQL Server stores data -- with built-in indexes, etc.

Many database servers already have some fairly optimized code when it comes to file access. This just implements it at the kernel level, rather than having it sit on top of a traditional fs.

Re:Not SQL Server Directly by Serapth · 2003-09-05 01:33 · Score: 1

Odd, I was under the impression that in Longhorn the file system wasn't actually going to be SQL-based... but would remain NTFS. I had thought that SQL or a SQL-like layer was going to run over the top, as an interface/organiser. In which case, it would run basically as a service/daemon...
Re:Not SQL Server Directly by b!arg · 2003-09-05 03:15 · Score: 1

Hmmm...I'm just imagining a world with this SQL Server filesystem on every desktop and the Slammer worm.

--

Everybody dies frustrated and sad and that is beautiful
Re:Not SQL Server Directly by PainKilleR-CE · 2003-09-05 03:27 · Score: 1

Even more odd, the kernel in WinNT/2k/XP interfaces the file system through a driver, whether it's NTFS, FAT, FAT32, or whatever odd file system might have an NT driver available for it (ie the file systems originally supported on Alpha and PPC platforms, perhaps). So the idea of building a file system into the kernel seems pretty backwards from the current (and past) NT design standpoint.

--
-PainKilleR-[CE]

Re:Backups by realnowhereman · 2003-09-05 01:29 · Score: 1

Is this what you're referring to:

pgsql-bugs ( at ) postgresql ( dot ) org writes:
> The problem occurs for new data types:
> When pg_dump is called, sometimes the CREATE TYPE is dumped before
> input/output functions are dumped. This makes a restore impossible.

I believe this was fixed about two weeks ago. Are you sure you are
using 7.1 final release, not some beta version?

If it is then your attack on postgresql seems a little unfair.

--
Carpe Daemon

BeFS by laird · 2003-09-05 01:32 · Score: 3, Informative

Actually, Be had two flavors of "filesystem as database" in widespread deployment. OK, not as widespread as Windows, but certainly thousands of users. The first version of Be's filesystem, by Benoit Schillings, was very database-like, but performance was so-so. The second version of BeFS, by Dominic Giampaolo, was less general in implementation, but had the same metadata-driven capabilities. There's an interesting article on this at http://www.theregus.com/content/4/24485.html. Basically, Be did everything that this project is talking about, years ago. That's not to take anything away from the project -- it's cool if more mainstream operating systems catch up to the innovations of niche players, because more people benefit. Dominic is working at Apple, so there's hope that MacOS X's filesystem will start incorporating the rich-metadata, dynamic view model of the world. And while MS has (I think) pushed the "filesystem as database" out of the next version of Windows NT/XP/whatever, it's still planned for the next version after that, so perhaps in a decade or so we'll all be able to do what Be did back in '91. And of course, Palm owns the Be code, so perhaps PalmOS will lead the way?

--
Enable 3D printed prosthetics!

Re:BeFS by Nutcase · 2003-09-05 01:44 · Score: 1

"perhaps in a deade or so we'll all be able to do what Be did back in '91"

perhaps much sooner.
Re:BeFS by blibbleblobble · 2003-09-05 01:52 · Score: 1

"so perhaps in a deade or so, Windows will be able to do what Be did back in '91"

Is that how long it takes for a patent to expire?
Re:BeFS by Anonymous Coward · 2003-09-05 01:57 · Score: 0

No, the original poster was right. Don't hold your breath for anything from OBOS any time soon.
Re:BeFS by cpeterso · 2003-09-05 09:06 · Score: 2, Insightful

there's hope that MacOS X's filesystem will start incorporating the rich-metadata, dynamic view model of the world.

you mean like Mac OS 9 and earlier?

--
cpeterso
Re:BeFS by the_greywolf · 2003-09-05 09:50 · Score: 1

yes, but hopefully a little more scalable than HFS turned out to be.

--
grey wolf
LET FORTRAN DIE!
Re:BeFS by laird · 2003-09-08 05:56 · Score: 1

Well, HFS has nice metadata (and it's still in Extended HFS in MacOS X!) but it's not dynamic -- you find files by looking in directories, rather than by asking the OS for things that match criteria.

To illustrate, in BFS you can see new email by asking the OS for things that are of type email that have the attribute unread=true, and get back files containing email messages, with each email message having metadata for sender, receiver, subject, date, etc. The dynamic part is that there's no 'email folder' -- it's all just a database query, and if any other application created new email files, they'd immediately appear in the window with your other new email.

In HFS, you could have a file for each email message, and store the metadata in the resource fork. But you couldn't use the metadata to find the file. NTFS has a vaguely similar metadata capability, but nobody uses it.
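
A toy sketch of the "live query" idea, to make the difference concrete. This is not the BFS API; the `AttrStore` class, its methods and the sample messages are all made up here. The point is only that the "new mail" view is a query over attributes, so a message created by any other program shows up the next time the query runs.

```python
from dataclasses import dataclass, field


@dataclass
class File:
    name: str
    attrs: dict = field(default_factory=dict)


class AttrStore:
    """Toy stand-in for a filesystem with queryable attributes (not the BFS API)."""

    def __init__(self):
        self.files = []

    def create(self, name, **attrs):
        self.files.append(File(name, dict(attrs)))

    def query(self, **wanted):
        """Names of files whose attributes match every key/value in `wanted`."""
        return [
            f.name
            for f in self.files
            if all(f.attrs.get(key) == value for key, value in wanted.items())
        ]


store = AttrStore()
store.create("msg1", type="email", unread=True, sender="alice@example.com")
store.create("msg2", type="email", unread=False, sender="bob@example.com")
store.create("notes.txt", type="text")

inbox = store.query(type="email", unread=True)

# Some other application drops a new message into the store...
store.create("msg3", type="email", unread=True, sender="carol@example.com")
# ...and the same query picks it up; there is no 'email folder' to maintain.
inbox_later = store.query(type="email", unread=True)
print(inbox, inbox_later)
```

A real implementation would index the attributes instead of scanning every file, which is where the filesystem support matters.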

--
Enable 3D printed prosthetics!

Re:oral sex with my girlfriend by Anonymous Coward · 2003-09-05 01:32 · Score: 0

Sadly, contrary to urban myth, anal sex is not an effective form of birth control as sperm can easily make their way through...

Is no one else sad? by 330Pilot · 2003-09-05 01:32 · Score: 1

I actually like "earth.us.stanford.dorm5.room109.desk2" rather than "My roommate's desk" !

Who has so many files anyway? by Chemisor · 2003-09-05 01:33 · Score: 1

Would someone who has 100,000 files please provide us with examples of what kind of files they are? I simply cannot imagine why anyone would have such a great number. Without examples it is nearly impossible to imagine any use for such a filesystem.

Re:Who has so many files anyway? by MagicBox · 2003-09-05 02:07 · Score: 1

I for one have 60 GB worth of files on one of my hard drives. Collected within the last 5 years, most of them are programming files, html articles, pdf files, text files, database files, Office files etc etc etc. The hierarchy of directories is almost unmanageable. Last count I had over 50,000 files stored there, and a lot of them are entire compressed directories with even more files inside. I assume there are a lot more people that have probably 10 times more files than me. I think this would be an amazing thing to have.

--

The phaomnneil pweor of the hmuan mnid. Fcuknig amzanig eh!
Re:Who has so many files anyway? by Anonymous Coward · 2003-09-05 02:17 · Score: 1, Funny

Would someone who has 100,000 files please provide us with examples of what kind of files they are? I simply cannot imagine why anyone would have such a great number. Without examples it is nearly impossible to imagine any use for such a filesystem.

Pr0n, of course!
Re:Who has so many files anyway? by Jesus_666 · 2003-09-05 02:54 · Score: 1

Your post made me realize that it would indeed be interesting to see what the incredible amount of digital detritus on my hard drive consists of, so I scanned the E partition of my Win machine (which I use for gaming/working/almost everything).

According to JDiskReport, my E drive looks like this:

Number of files: 93.301
Number of files in the RECYCLER dir because I didn't think of emptying it before scanning: 1.194
File types taking up the most space by size: AVI Video (.avi, 13.5 GB), CD Image (.iso, 9.5 GB), CD Image (.bin, 7.5 GB), RAR Archive (.rar, 5.6 GB), MP3 Audio (.mp3, 5.1 GB), CD Image (.img, 4.7 GB), MPEG Video (.mpg, 2.4 GB), Unreal Textures (.utx, 2.3 GB)
Top 6 folders taking up the most space: Programs folder (30.8 GB), Games folder (24.2 GB), Music folder (6.2 GB), Download folder (4.6 GB), pseudo-temporary folder (4.4 GB), RECYCLED (2.3 GB)
The biggest file is a VMWare virtual disk with a size of 1.1 GB.
The oldest files (except for 1/1/1970-bogus stuff) belong to a copy of STUNTS that I didn't know I still had.
The newest file (w/o bogus) is my eMule's preferences file.

The scan took about 15 minutes.

Wow, with the Info I gathered just because I wanted to reply to your post I've been able to locate about 8 GB of useless junk I've forgotten about.

So, as the example proves, a database-driven FS is indeed best suited for people who put buttloads of useless junk on their HDD and then forget about it.
Yay for Storage! ^_^

--
USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
Re:Who has so many files anyway? by cgh4be · 2003-09-05 04:01 · Score: 1

Just did a "find / -print | wc -l" on my Linux server. 133308 files.

Old hat by semanticgap · 2003-09-05 01:34 · Score: 1

I remember reading about this idea a few years ago, it may have been on /. If it didn't catch on then, it probably won't now.

BTW - there is an operating system that uses a database instead of a file system, it's called IBM OS/400.

--
grisha.org

Great for Law firms by DrSoCold · 2003-09-05 01:35 · Score: 1

Legal firms now have no reason to ignore Linux as an alternative to Microsoft bloatware. A decent Linux based, Indexable, Document Management System implementation is the final piece of the puzzle. Maybe...

Re:About time-Set Theory. by Anonymous Coward · 2003-09-05 01:35 · Score: 0

"Database world has been moving from hierarchal to relational DBMS since the late 70's. It's about time for filesystems to catch up."

How about a database filesystem based on set theory?

Enterprise connection by daBass · 2003-09-05 01:35 · Score: 1

Now we can finaly feel like we are aboard the Enterprise by going through our "personal database"!

Yes, but why an RDBMS by nut · 2003-09-05 01:35 · Score: 1

I think the current filesystem paradigm is due for replacement (I don't know who invented it, but it's probably *at least* 35 years old), but with a relational database? This is already an aging data storage model.

Storage is storage and there is no reason to differentiate between filesystem and database. But we could use an object store. Or an object-oriented database if that's any different. Build in document management / version control.
CVS as your file system? There are so many possibilities if you are prepared to rewrite this sub-system anyway...

--
Never trust a man in a blue trench coat, Never drive a car when you're dead

You Geeks by robhall · 2003-09-05 01:36 · Score: 2, Interesting

Everyone is looking at this from the Geek perspective. I teach computer technology to absolute beginners and the file system is the MOST confusing aspect. Most people have trouble with it and some people will never get it. If a database-driven file system had a very simple interface (i.e. text searching that had fuzzy logic so that misspellings were okay) it would be GREAT for 90% of the population.

Should examine SHORE by teambpsi · 2003-09-05 01:38 · Score: 1

http://www.cs.wisc.edu/shore/

its Value Added Server architecture would lend itself quite nicely to this effort

--

Old age and treachery almost always overcome youth and skill.

Miror of screenshots.... by Daniel+Wood · 2003-09-05 01:38 · Score: 0

Site is starting to slow, so I figured I'd whore some karma and host some screen shots.

http://tentei.org/gargamel/

Oracel IFS by rhinoX · 2003-09-05 01:39 · Score: 4, Informative

It was called IFS and Oracle did it like, almost four years ago.

Versioning and various other metadata existed. It could be exported via SMB, NFS, FTP, and as a regular "local" windows filesystem.

And, why is this such a great big deal? I don't see the same stink raised about the possibility of Longhorn having a DB for a filesystem.

--
The copper bosses killed you, Joe. 'I never died', said he.

Re:Oracel IFS by Genady · 2003-09-05 02:25 · Score: 1

You left out WebDAV, though of course I can't make Nautilus hit iFS from Linux (we're on a REALLY old version of iFS)

--

What if it is just turtles all the way down?
Re:Oracel IFS by gotak · 2003-09-05 02:50 · Score: 1

Doesn't matter what Oracle did. Their stuff is expensive and difficult for most people to set up.

With the DB FS built into the OS programmers with new ideas can quickly try something out. This can result in some very neat applications.

So it would be a great thing if LINUX can have an optional easy-to-install DB FS or an efficient built-in one. Either way, if it's done quickly and well it can bring along a whole new species of applications.
Re:Oracel IFS by poofmeisterp · 2003-09-05 04:28 · Score: 1

I would imagine that's because not many people here WANT Longhorn and can't AFFORD Oracle IFS.
Re:Oracel IFS by cpeterso · 2003-09-05 09:02 · Score: 1

Microsoft does something --> evil!

GNOME/Linux does same thing later --> innovative!

--
cpeterso
Re:Oracel IFS by cerberusss · 2003-09-06 05:54 · Score: 1

It was called IFS and Oracle did it
It's now split. There's IFS, which is 'shrink-wrapped', and there's CM SDK (=Content Management Software Development Kit). That last one is basically a bunch of Java classes which form the interface to IFS. So for example, it's quite easy to create a 'listener' which does something when documents are created/changed/deleted. But you can do everything that's possible in IFS itself, like creating users and documents etc.
It would be pretty cool if the GNOME guys created an interface for this, too. That way, everyone could build (webbased?) systems that do stuff (send e-mail?) when documents appear, get changed or what else.

--
8 of 13 people found this answer helpful. Did you?

Screenshots misspelled? by kasparov · 2003-09-05 01:40 · Score: 1

Am I the only one that immediately thought that the screenshots were missing an "L" on the end? Wow, it just occurred to me... I'm getting old.

--
There's no place I can be, since I found Serenity.

Re:About time-Set Theory. by varjag · 2003-09-05 01:43 · Score: 1

> How about a database filesystem based on set theory?

I am not quite sure what you mean here. In a way, everything is based on set theory :)

And what kind of set-theoretic capabilities do you want? Many of them can already be done on relational databases efficiently: intersection, union, etc.
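
To make that concrete, a minimal sketch using SQL's own set operators. The `tagged` table and its rows are made up for illustration; `INTERSECT`, `UNION` and `EXCEPT` are standard SQL and are run here through SQLite via Python's `sqlite3` module.

```python
import sqlite3

# Hypothetical data: which file carries which tag.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tagged (file TEXT, tag TEXT);
    INSERT INTO tagged VALUES
        ('a.mp3', 'music'), ('a.mp3', 'jazz'),
        ('b.mp3', 'music'),
        ('c.txt', 'jazz');
""")


def run(sql):
    """Run a query that returns one column and give back its values, sorted."""
    return sorted(row[0] for row in conn.execute(sql))


MUSIC = "SELECT file FROM tagged WHERE tag = 'music'"
JAZZ = "SELECT file FROM tagged WHERE tag = 'jazz'"

both = run(f"{MUSIC} INTERSECT {JAZZ}")    # in both sets
either = run(f"{MUSIC} UNION {JAZZ}")      # in at least one set
music_only = run(f"{MUSIC} EXCEPT {JAZZ}") # set difference
print(both, either, music_only)
```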

--
Lisp is the Tengwar of programming languages.

Microsoft Attempts for decade,GNOME Does in months by NZheretic · 2003-09-05 01:45 · Score: 4, Interesting

1994 Cairo Takes OLE to New Levels

The next version of Windows NT, code-named Cairo and targeted for release sometime in 1995, will be built around the concepts of objects and component software. It will have a native OFS (Object File System) and distributed system support.

1995 Signs to Cairo

Cairo, Microsoft's object-oriented successor to Windows NT, will begin beta testing in early 1996 for release in 1997. Although Microsoft is not revealing the full details of Cairo yet, there are enough clues within current Microsoft OSes to yield a good idea of how it might work.

1996 Unearthing Cairo

At the first NT developers conference in 1992, Bill Gates announced that Cairo would arrive in three years and would incorporate object-oriented technologies, especially an object file system. Since then, we've seen Windows NT 3.1, NT 3.5, NT 3.51, and most recently NT 4.0. None is object oriented, none has an object file system, none is Cairo. It seems that Cairo is Microsoft's sly way of promising the world. "Will we see Plug and Play in NT?" "Oh yes, of course, in Cairo." "Will NT ever produce world peace and cheap antigravity?" "You bet -- in Cairo."

The so-called Longhorn WinFS directory is just another reincarnation of the Cairo object-oriented file system.

September 1, 2003 Eweek 'Longhorn' Rollout Slips

Microsoft Corp. has once again shifted the schedule for the release of "Longhorn," the company's next major version of Windows, leaving some users up in the air about an upgrade path.

Microsoft executives from Chairman and Chief Software Architect Bill Gates on down have long described Longhorn as the Redmond, Wash., company's most revolutionary operating system to date. The product was originally expected to ship next year. Then in May of this year, officials pushed back the release date to 2005. But now executives are declining to say when they expect the software to ship.
"We do not yet know the time frame for Longhorn, but it will involve a lot of innovative and exciting work," said Gates at a company financial analyst meeting this summer. Since then, other Microsoft officials have neither retracted nor clarified Gates' statement.

Microsoft have been attempting this type of functionality since 1991, over a decade. Meanwhile, one open source GNOME developer, with help from the other core GNOME developers, provides most of the features within months.

GREAT! If it is done well... by evilviper · 2003-09-05 01:45 · Score: 4, Interesting

People don't seem to see how great this is. Maybe it's because most people don't have all that much data.

On my home systems, I have over 250GB online. That doesn't even count my music or videos/movies, which I keep on separate, removable, optical storage.

I can tell you from experience that managing that much data is a huge hassle. Let's say you've got your files organized well. You probably have hundreds of folders for each subject, and you have to browse to each one with each new file you save. I have a folder (several actually, for various subjects) where I save things that I haven't taken a look at yet. Let's say it's a program that I haven't installed. Well, once I do install it, I need to clean up all the temporary files, then browse around to another folder (takes a minute or two when you have hundreds of folders) where I save installed programs, browse to the appropriate sub-folder, and save it. But then I end up doing the same thing with a video clip... Watching it, deciding to delete or save it, then browsing to a sub-sub-sub-sub folder to move it.

Of course, that's enough of a hassle, but things get complicated when I want to move things to another system, which obviously isn't going to have the same folder layout. Merging each individual folder into each different folder is seriously time-consuming and tedious. Without fail, there always ends up being a couple of folders in the wrong place, because they were a sub-folder of something else that I didn't happen to see when I copied the contents of the folder.

Then matters are even further complicated, because I may choose to delete older content months later or so, and locating everything is a huge mess.

Personally, I would like to save everything in one place, not having to change from folder to folder for each file. When saving something, I could just enter a handful of keywords (e.g. "picture penguin snow"), which would be much less work than moving through directories or even typing in a long filename. From there, a simple database system would be able to know what type of file it is, how large it is, and how old it is. That would make it incredibly easy to manage. Whenever I want a file, I type in "images older than 1 year" or "programs marked as archived" and I get EVERYTHING I'm looking for in a fraction of the time. Not only that, but it makes pruning out old data as easy as it could possibly be. Just search for "linux" and delete older versions, with no worries about what folder they're in, whether it's in a temporary folder and you haven't used it yet, or it's archived and has been in use on your system forever. Obviously you'll be able to see that information, but unlike in our current systems, it won't stand in your way when you want to find things.

It's absolutely no work at all to transfer files, since the info should stay with them, and it will automatically integrate perfectly with your local file management/organization scheme. What's more, data like marking something as "archived" is great in that your system could automatically move it over the network to wherever you archive your files. Since your filesystem would be a smart database, when you search for the file, it could still turn up in the search results, and be automatically moved back where you need it, when you need it.

Personally, I think this would not only save time and effort, but money as well, because so many people wouldn't be dealing with their file problems by just throwing more space in their systems, instead of spending time on figuring out where every file is, what they can get rid of, duplicate files, and junk like that.

With this, I should be able to say "tar -xjf 'newest version of mplayer'" However, this will need to be in the actual filesystem to be useful, not just supported for GNOME applications.
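
A minimal sketch of what those searches could look like underneath, assuming a hypothetical `files` table with a free-text keyword column. The filenames, keywords, ages and the fixed "today" are all invented for the example; a real system would fill in type, size and age by itself.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT, kind TEXT, keywords TEXT, saved_at TEXT)")

NOW = datetime(2003, 9, 5)  # fixed 'today' so the example is reproducible


def save(name, kind, keywords, age_days):
    """Record a file with its keywords; age_days fakes when it was saved."""
    saved_at = (NOW - timedelta(days=age_days)).isoformat()
    conn.execute("INSERT INTO files VALUES (?, ?, ?, ?)", (name, kind, keywords, saved_at))


# Made-up files for illustration.
save("p1.jpg", "image", "picture penguin snow", 400)
save("p2.jpg", "image", "picture beach", 30)
save("mplayer-old.tar.bz2", "program", "mplayer linux", 200)
save("mplayer-new.tar.bz2", "program", "mplayer linux", 20)


def older_than(kind, days):
    """'images older than 1 year' becomes older_than('image', 365)."""
    cutoff = (NOW - timedelta(days=days)).isoformat()
    rows = conn.execute(
        "SELECT name FROM files WHERE kind = ? AND saved_at < ? ORDER BY name",
        (kind, cutoff),
    )
    return [name for (name,) in rows]


def newest_with_keyword(word):
    """'newest version of mplayer' becomes newest_with_keyword('mplayer')."""
    row = conn.execute(
        "SELECT name FROM files WHERE ' ' || keywords || ' ' LIKE ? "
        "ORDER BY saved_at DESC LIMIT 1",
        (f"% {word} %",),
    ).fetchone()
    return row[0] if row else None


print(older_than("image", 365), newest_with_keyword("mplayer"))
```

Turning a phrase like "images older than 1 year" into those calls is the natural-language part, which this sketch does not attempt.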

--
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant

Nope by varjag · 2003-09-05 01:49 · Score: 4, Insightful

> SQL is slow compared to things like BerkeleyDB

BerkeleyDB is a hierarchical database. SQL is godzillion times faster on complex searches.

> Your database becomes corrupt, you lose everything.

Your filesystem becomes corrupt, you lose everything.

And yeah, I know about journaling, so don't bother :) But modern RDBMSes have integrity control facilities as well.

--
Lisp is the Tengwar of programming languages.

Re:Nope by azaroth42 · 2003-09-05 02:14 · Score: 2, Interesting

> BerkeleyDB is a hierarchial database. SQL is
> godzillion times faster on complex searches.

Great, but who is going to often do complex enough searches for files that makes any sort of RDBMS worthwhile? The vast majority of searches would be simple keyed terms.
Re:Nope by Saint+Stephen · 2003-09-05 02:37 · Score: 4, Interesting

Just wait till you see the way "Pivots" work in the new Longhorn shell. The canonical example is sorting thousands of mp3s by artist, but it'll be A-FUCKIN-MAZIN.

Face it: databases rock. You never know how many interesting questions you didn't ask because you couldn't think in sets until you do it, and then it's FAST as all get out.
Re:Nope by varjag · 2003-09-05 02:51 · Score: 1

> Great, but who is going to often do complex enough searches for files that makes any sort of RDBMS worthwhile?

Try finding all executable files in your filesystem. Is such a query totally unreasonable?

--
Lisp is the Tengwar of programming languages.
Re:Nope by Anonymous Coward · 2003-09-05 03:00 · Score: 0

How often does the average user do that?

Like never?

I guess virus-killers *might* use it... but how many run those in the OSS world?
Re:Nope by varjag · 2003-09-05 03:07 · Score: 2, Interesting

> How often does the average user do that?
> Like never?

No, like, when he suspects his system is infected with trojan or worm and he wants to get the list of executable files installed in last five days.

--
Lisp is the Tengwar of programming languages.
Re:Nope by Anonymous Coward · 2003-09-05 03:21 · Score: 0

No.

The average user never suspects this.

They just blindly run the trojan for months, and in fact probably wouldn't know a trojan unless it popped up a "Trojan 1.0!" splash-screen at start up... and maybe even then...

The vast majority think the computer's a toaster. They won't do any complex searches.
Re:Nope by Anonymous Coward · 2003-09-05 03:41 · Score: 0

> Try finding all executable files in your filesystem. Is such a query totally unreasonable?

You don't need an RDBMS filesystem to do that. Even if you want to search for files changed within the last five days.
Re:Nope by ednopantz · 2003-09-05 04:02 · Score: 2, Insightful

Real world non-Geek use:

Sales rep needs the revised proposal on the Henderson account!

Computer: Get all Word documents emailed by Bob between last Thursday and today that contain the word 'Henderson' except where they also contain the phrase 'unreasonable, demanding client who is not worth our time'. --A snap for a db.
Re:Nope by nosferatu-man · 2003-09-05 04:52 · Score: 2, Insightful

Who cares about the user? The system itself would have capabilities finally expressible in terms more advanced than those constrained by what some drug-addled graduate student with no maths decided was sufficient in 1971.

In other words, the user might never think to do that, but it'd be so cheap for the operating environment that all kinds of new applications would appear.

'jfb

--
To spur "enterprise Linux," Big Bang, the distributed two-phase commit.
Re:Nope by GlassUser · 2003-09-05 04:59 · Score: 1

Yeah, but I can already do this through explorer with NTFS.

--
funny munging
Re:Nope by lp_bugman · 2003-09-05 05:12 · Score: 1

First of all, any decent trojan will change its metadata in the SQL system so that it shows up as an old file in your system.

Second, the only part where I can see any point in having SQL as the backend for a filesystem is for global searches over the whole system (e.g. locate me all .mp3)

But then again, if you are smart enough you can put all your files in an order..
\Music\English\Madonna - Like Virgin\

The average end user with no clue about computers will just put all his files in
\Documents and Settings\\My Documents

Then again, this fella won't have that many files.

--
BSD licensed software can't be stolen....
Re:Nope by Anonymous Coward · 2003-09-05 06:10 · Score: 0

This makes me wish I had some mod points. To think that anyone would consider a userspace DB to be a security feature...*shudder*
Re:Nope by Brandybuck · 2003-09-05 06:37 · Score: 1

If my filesystem becomes corrupt, I can recover most of it. Heck, I might even be able to recover all of it. I think the core problem here is RDBMS versus The World. When you've been raised to fervently believe in your soul that RDBMS is the answer to everything, then everything else is a heresy. Heck, why not just get rid of the files while you're at it.

--
Don't blame me, I didn't vote for either of them!
Re:Nope by edwdig · 2003-09-05 06:50 · Score: 1

No, like, when he suspects his system is infected with trojan or worm and he wants to get the list of executable files installed in last five days.

The average user will give you a blank stare if you say something like that. They won't understand you at all, let alone be able to think of that themselves.
Re:Nope by Hatta · 2003-09-05 07:02 · Score: 1

find / -type f -perm +111
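
A rough Python equivalent of that one-liner, extended with the "last five days" filter from earlier in the thread. It is a sketch that assumes a POSIX system, where "executable" means any of the execute permission bits is set, and it uses modification time as a stand-in for "installed"; the function name is made up.

```python
import os
import stat
import time


def recent_executables(root, days, now=None):
    """List regular files under root with any execute bit set, modified within `days`."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                info = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # vanished or unreadable; skip it
            is_regular = stat.S_ISREG(info.st_mode)
            is_executable = bool(info.st_mode & 0o111)
            if is_regular and is_executable and info.st_mtime >= cutoff:
                found.append(path)
    return sorted(found)


if __name__ == "__main__":
    for path in recent_executables(os.path.expanduser("~"), 5):
        print(path)
```

Like find, this walks the whole tree on every run. That cost is what an indexed store would avoid, and a trojan that resets its own timestamps defeats both approaches equally.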

--
Give me Classic Slashdot or give me death!
Re:Nope by Hatta · 2003-09-05 07:12 · Score: 1

I would love it. It would let me easily queue female jazz singers from the 1960s. Or any show where Luther Dickenson guest starred. Or episodes of The Simpsons where Lisa stars.

Sure it'll be a pain at first having to enter all that metadata. But if it gets common enough it could be integrated into something like db.etree.org or the masterlist at dapcentral and that would be a thing of true beauty.

Sure, I guess there might be some commercial applications too...

--
Give me Classic Slashdot or give me death!
Re:Nope by Anonymous Coward · 2003-09-05 07:13 · Score: 0

(Looks at watch)

Are we there yet?
Re:Nope by grungeKid · 2003-09-05 07:14 · Score: 1

Wouldn't newer trojans compromise the metadata as well, so as to hide their tracks?
Re:Nope by Dave2+Wickham · 2003-09-05 07:26 · Score: 1

Do these people run Athlons?

Sorry! I won't do it again!
Re:Nope by rgmoore · 2003-09-05 07:50 · Score: 1

One of the articles on the storage web page mentions that it has some very cool built in features for dealing with movies. When you import a movie, it's able to extract enough information about it (like the title and release date) that it can then look it up in the Internet Movie Database and get a huge amount more (like the director, starring actors, etc.). From that point forward, you'll be able to search your movies based on all of the metadata that it dredged up from IMDB without you having to input it all yourself. If there is a similar knowledge base about other files of interest, it should be possible to create a similar system to gather and use information about them, too.

--
There's no point in questioning authority if you aren't going to listen to the answers.
Re:Nope by battjt · 2003-09-05 07:52 · Score: 1

A snap for a db.

Not unless you anticipate the question and build the indexes. Otherwise it is just like a find/grep: a full-text search.
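
To make the point concrete, here is a small sketch using Python's built-in sqlite3 module (my choice for illustration; nothing in the thread says Storage or WinFS behaves this way). The files table and the idx_files_artist index are invented names. It asks SQLite for its query plan before and after an index exists on the searched column.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan for `sql` as one string (detail is the last column)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (path TEXT, artist TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?)",
    [(f"/music/{i}.ogg", f"artist{i % 50}", 1960 + i % 10) for i in range(1000)],
)

sql = "SELECT path FROM files WHERE artist = ?"

# No index on artist yet: the engine has to look at every row.
plan_before = query_plan(conn, sql, ("artist7",))

# Someone anticipated the question and built the index.
conn.execute("CREATE INDEX idx_files_artist ON files (artist)")
plan_after = query_plan(conn, sql, ("artist7",))

print(plan_before)
print(plan_after)
```

The first plan reports a scan of the whole table, the second a search through the index. The exact wording of the plan text differs between SQLite versions, so treat the printed strings as informational only.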

Joe

--
Joe Batt Solid Design
Re:Nope by ednopantz · 2003-09-05 09:36 · Score: 1

You got me. I originally wrote "well-indexed db" in the preview, but I cut it out thinking that anyone who would build a fs as db would have to make certain it indexed in some rational way.

On another note, MS might be one of the few actors with the muscle to insist that anything that tries to save a "file" includes enough metadata to make it useful to the dbfs.

If they were smart, Office and Exchange would do this in the background, creating all the indexes and links with users none the wiser, all wrapped up in some nasty, obfuscated protocol that no one else can decipher, much less use without licensing some widget from MS. Nasty and obfuscated? Didn't I already say Exchange? (rimshot)
Re:Nope by Tony-A · 2003-09-05 11:51 · Score: 1

>Great, but who is going to often do complex enough searches for files that makes any sort of RDBMS worthwhile?
>The vast majority of searches would be simple keyed terms.
The vast majority of searches are simple keyed terms because anything else would take far too long with current retrieval mechanisms.

Take 2 minutes to display a folder and a half second to retrieve the contents of the file you're after. I would much rather spend a half second to locate the file I'm after and two seconds to retrieve it.
Re:Nope by Tablizer · 2003-09-05 17:42 · Score: 1

Great, but who is going to often do complex enough searches for files that makes any sort of RDBMS worthwhile? The vast majority of searches would be simple keyed terms.

Look to the likes of Yahoo and Google if you want to see the future of "regular Joe" file searches. (True, their system is not really relational.)

--
Table-ized A.I.
Re:Nope by varjag · 2003-09-08 20:19 · Score: 1

> The average user will give you a blank stare if you say something like that.

OK, maybe it wasn't a particularly good example. But just because some dumbster doesn't need the ability to do fast complex searches, it doesn't make them useless on a desktop PC. *I* and people in my environment would love this feature, and I don't give a flying fuck if someone's too unsophisticated to find uses for it.

An *average user* isn't making any meaningful use of his 3GHz CPU either. Should we dump them altogether and resort to P133? They're adequate to run Word, after all!

--
Lisp is the Tengwar of programming languages.

Some thoughts.. by ShadeARG · 2003-09-05 01:51 · Score: 1

I can also see there being performance penalties since you are now querying a database, rather than looking at a simple file structure...

Performance penalties will be negligible if not transparent in the near future. Remember, we are shooting for capability and not necessarily speed for present technology. Technology makes up for that further down the road.

This is just a thought, but it intrigues me. What about sockets (pipes and fifos possibly too) using this method? Imagine the firewalling possibilities alone. Wow.

Another stray thought. What about compressing the database records with Gzip on a PCI card?

Re:Some thoughts.. by Anonymous Coward · 2003-09-05 02:57 · Score: 0

Remember, we are shooting for capability and not necessarily speed for present technology.

I hate it when people do that.
Re:Some thoughts.. by Anonymous Coward · 2003-09-05 03:20 · Score: 0

If it was due to laziness, then I would agree. A solution as featureful as SQL isn't exactly going to run well as part of a filesystem on older technology. Do you know of anything else that could offer as many possibilities with better performance?
Re:Some thoughts.. by Anonymous Coward · 2003-09-05 09:11 · Score: 0

No, I mean that I hate it when people try to squeeze a 'featureful' solution into places where it's not needed (such as a filesystem...).

Besides, aiming for acceptable performance on tomorrow's hardware only works as long as Moore's law keeps up. I find myself actually looking forward to the day it stops, since we might get software that is fast on the hardware we own.

Re:AWESOME! by Anonymous Coward · 2003-09-05 01:53 · Score: 0

You are yet another brainwashed goon who actually believes Microsoft is innovative.

Name one thing Microsoft innovated that is revolutionary. No, I don't give a shit what they stole from some poor company and marketed as their own invention. Let's hear the hundreds of examples of innovative ideas from Microsoft.

Ask a librarian how by nuggz · 2003-09-05 01:54 · Score: 2, Insightful

What we need is to get some information storage/retrieval experts to provide some guidance to the developers of these ideas.

Librarians have been working on these problems for centuries, why not start with what they know?

Re:Ask a librarian how by russellh · 2003-09-05 03:17 · Score: 1

Librarians have been working on these problems for centuries, why not start with what they know?
Well for one thing, what they organize and classify are well-defined, unchanging units of information - books, movies, etc. Organizing by author, subject, etc. makes sense. I don't see anything like that situation on my hard drive.

--
must... stay... awake...
Re:Ask a librarian how by Dr.+Smeegee · 2003-09-05 03:42 · Score: 1

That is quite a good idea! Librarians are fast becoming database mavens anyway.
Re:Ask a librarian how by EarthTone · 2003-09-05 04:57 · Score: 1

Funny idea. I'm the IT manager at a library, and I gotta say that most of the librarians have no idea how to organize information on a computer. They rely on existing structure (existing records, MARC classification schemas). So, no, librarians probably aren't the best ones for the job. Newer fields, like Informatics, are probably better suited.

Re:Backups by Florian+Weimer · 2003-09-05 01:54 · Score: 1

There are still problems of this kind. The dependency tracking might eventually fix this, but only for new databases.

I wonder ..... by Anonymous Coward · 2003-09-05 02:02 · Score: 0

what new viruses will look like:

DELETE * FROM MY_FILES ??

Re:I wonder ..... by caluml · 2003-09-05 02:33 · Score: 1

DELETE FROM "MyDocu~1"
DELETE FROM "Progra~1"

--
Get your own free personal location tracker

BeOS is not quite dead by Anonymous Coward · 2003-09-05 02:03 · Score: 0

You can find it here and here.

Re:BeOS is not quite dead by Anonymous Coward · 2003-09-05 02:27 · Score: 0

What has Athene got to do with BeOS? I think you're thinking of AtheOS, which has been stagnant for two years and is now really being developed as Syllable.

Reinventing the wheel..? by Larsing · 2003-09-05 02:04 · Score: 1

Didn't someone do this for BSD ages ago?
Even think I read about it on /...

--
Ethics is what you say you do. Morals is what you actually do.

Printing a file becomes a lot harder, though... by Anonymous Coward · 2003-09-05 02:06 · Score: 0

#include "all headers relating to my current project"
#include <all system headers>

int main()
{
    int InChar;
    FILE *InFile;

    InFile = fopen("that one file-- you know the one, right?", "r");
    InChar = fgetc(InFile);
    while(InChar != EOF)
    {
        printf("%c", (char) InChar);
        InChar = fgetc(InFile);
    }
    fclose(InFile);
}

Bowie by quinine · 2003-09-05 02:07 · Score: 1

Best quote:

P.S. Dear GNOME Hackers, in case you are getting nervous now... consider this: if Bowie said "don't jump off the cliff" would you jump just because he told you not to? ;-)

You know, we all laughed at the guy, but he was right all along about the rewindable desktop! Viva la Propaganda!

Re:Ahead of the game. by Anonymous Coward · 2003-09-05 02:07 · Score: 0

that's more innovative than they've been in the past

Not true, Microsoft copied the concept from IBM's AS400 series, which had a DB2-based filesystem option back in 1997. I'm sure it was around earlier than that, but that was my first experience with it in a production environment. There is no Microsoft innovation here, move along.

Mirror of Images by Dougie · 2003-09-05 02:09 · Score: 0

There is a mirror of the images at

http://www.dark-hill.co.uk/~seth/storage/screent ho ts/

Dougie

--
Doug.

Re:Ahead of the game. by Anonymous Coward · 2003-09-05 02:13 · Score: 0

This is about as "insightfull" as every other "MS sucks" message that gets posted here.

how about storing all files at home (CDs, DVDs,..) by HTD · 2003-09-05 02:13 · Score: 1

This is a database after all. It would be cool to have this filesystem know all files that exist on my CDs, DVDs and remote filesystems (a.k.a. all files at home I can access in some way). This way I can figure out that file 'xyz.mp3' is on a CD named 'many albums', in folder 'xyz'. This can be expanded to "import" the filesets of all my friends. The advantage would be that I can search for stuff that's not actually located on my PC but still know "where it is" and use metadata (id-tags, ...) to search for it. I can access the files later (by telling my friend to send it to me, by grabbing the CD or booting the remote machine...). What do you think about that idea?

Microsoft Article by ChaseTec · 2003-09-05 02:13 · Score: 1

Here

--
My Hello World is 512 bytes. But it's also a valid Fat12 boot sector, Fat12 file reader, and Pmode routine.

Not exactly by gilesjuk · 2003-09-05 02:13 · Score: 4, Informative

http://theregister.com/content/4/30670.html

"The oft-misunderstood Windows Future Storage (WinFS), which will include technology from the "Yukon" release of SQL Server, is not a file system," reports Thurrot. "Instead, WinFS is a service that runs on top of - and requires - NTFS."

Re:Not exactly by Directrix1 · 2003-09-05 02:23 · Score: 1

OK, back to the subject of the article. Does anyone else find this discovery too good to be true? Is this thing real?

--
Occam's razor is the blind faith in the natural selection of least resistance and in universal oversimplification. -- EF
Re:Not exactly by gilesjuk · 2003-09-05 02:31 · Score: 1

Given all the other materials he's working on I would imagine it is for real. It's in alpha and there is a CVS server, but it will need quite a bit of testing I would imagine.

His PDF mentions it is based more around Natural Language than a typical database query language and also mentions it isn't supposed to replace the OS filesystem. So I would imagine you'd still keep your apps and OS on ext3, reiserfs etc., but you would have another partition with Storage on it for all your work.

Ah yes, the infamous relational filesystem... by Millennium · 2003-09-05 02:16 · Score: 4, Interesting

Although this is an interesting idea, an all-relational filesystem would prove to be a usability nightmare.

The relational zealots are quick to point out that a relational system can model any sort of data. Indeed, it can do this. This does not, however, mean that it's always good at doing this. Sometimes it's the right tool for the job, and sometimes it's not. In this case, it is very much not a good tool for sole access to files on the system (though it can make an excellent tool for complementary methods of access).

The reason that hierarchical filesystems have survived for so long is due to one thing: navigability. It's relatively easy for any user to browse what's on the system and get a good idea of how it is organized.

You can't navigate a relational system, which will prove to be the downfall of any all-relational system which comes into being. You can, of course, do a SELECT * FROM volume if you really want to, but that does exactly that: it gives you all the data, with no particular organization. Examining the entire "sea of data" suddenly becomes cumbersome in the extreme. So while User A might be able to set up an all-relational filesystem completely according to his own tastes, User B will be totally lost on that same system. This is, to say the least, a nightmare for anyone working in a shared environment.

This is not to say that the relational model isn't necessarily a useful thing for filesystems. On the contrary, it can be very useful for certain kinds of searches. As time goes on, I believe we'll see more relational-style searching technology incorporated into file managers and search tools. However, there also needs to be a means of hierarchical navigation. Humans tend to think of things in terms of locus, and a means of providing that kind of reference point has to be maintained.

Luckily, this can actually still be emulated using relational-style tables, even though it's somewhat less efficient than classical storage techniques. Some filesystems already do something similar to this, and the results are promising. Look at Be's filesystem for an example of that.

The best way to go, moving forward, is something not unlike what BeOS did, with both hierarchical and relational methods of examining data. This allowed for the best of both worlds. The default method of getting at data is still the hierarchical paradigm, but relational searches can be applied to create what some have called "smart folders" (perhaps "boxes" might be a better term?) Systems like this "Storage" should be focusing on complementing traditional systems in this way, rather than replacing them.

Re:Ah yes, the infamous relational filesystem... by curious.corn · 2003-09-05 05:10 · Score: 1

SELECT * FROM volume; would return files nicely organized according to their filetype. Images, movies, documents, audio files...
Libraries? SELECT * FROM root WHERE type='library';
Executables? System files? Configuration files?
Trouble is, techies meddling outside their home directory are too familiar with the hierarchical vision to get out of it, but the vast majority of the lusers sometimes just live within their desktop "folder".
I played with the idea of an SQL/relational filesystem 3 years ago and was slapped by all the bozos in love with /bin /usr/bin /usr/local/bin /usr/X11/bin /opt/bin ayeee!
You know? I'd love to see this in the kernel, in the VFS stack. I'd like to see my installed libs in one place, versioned and dependency checked at the filesystem level. You know, Apple patented 'piles' but that's just a general-purpose FS table: SELECT name, type FROM volume WHERE type='pile' AND name='pile_name';
$HOME is just SELECT name FROM volume WHERE username='foo'; a compat layer is just a matter of choosing the directory names that would trigger these queries.
My hopes are on ReiserFS.
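
For what it's worth, the compat-layer idea can be sketched in a few lines of Python with the built-in sqlite3 module. This is only a toy under my own assumptions: the volume table and its columns follow the queries above, while VIRTUAL_DIRS, list_dir and the sample rows are invented for illustration.

```python
import sqlite3

# Hypothetical mapping: each "directory" name triggers a stored query.
VIRTUAL_DIRS = {
    "/lib": ("SELECT name FROM volume WHERE type = ? ORDER BY name", ("library",)),
    "/home/foo": ("SELECT name FROM volume WHERE username = ? ORDER BY name", ("foo",)),
}


def list_dir(conn, path):
    """'ls' for the compat layer: run the query bound to this directory name."""
    sql, params = VIRTUAL_DIRS[path]
    return [row[0] for row in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE volume (name TEXT, type TEXT, username TEXT)")
conn.executemany(
    "INSERT INTO volume VALUES (?, ?, ?)",
    [
        ("libpq.so", "library", "root"),
        ("libc.so", "library", "root"),
        ("notes.txt", "document", "foo"),
        ("holiday.jpg", "image", "foo"),
    ],
)

print(list_dir(conn, "/lib"))       # the library rows, by name
print(list_dir(conn, "/home/foo"))  # everything owned by foo
```

A real layer would live in the VFS and would have to answer writes, renames and permissions too, none of which this touches.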

--
Mi domando chi è il mandante di tutte le cazzate che faccio - Altan
Re:Ah yes, the infamous relational filesystem... by Anonymous Coward · 2003-09-05 05:38 · Score: 0

Hierarchical is relational. It's just a one dimensioned relationship. There is no reason why, with a relational model you couldn't DISPLAY a hierarchical system. It's all just metadata.
Re:Ah yes, the infamous relational filesystem... by Anonymous Coward · 2003-09-05 06:27 · Score: 0

BS. Go download "dbvisualizer" and give it a try against your favourite database. Then come back here and tell us you cannot "navigate" a relational database easily.
Re:Ah yes, the infamous relational filesystem... by Anonymous Coward · 2003-09-05 07:23 · Score: 1, Informative

As others have pointed out, you can easily model a hierarchical system relationally, and browse it just like you do now.
But with the relational system, you could also browse it other ways. Want to implement Gelernter's Lifestreams? Just ignore the hierarchies and sort your entire filesystem by date.
You wouldn't want to force users to run sql queries, but you could easily implement more advanced views in your file manager...and make adhoc queries available for users who are up to it.
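
A minimal sketch of that, assuming a plain adjacency-list table (each row points at its parent) in SQLite via Python's sqlite3 module. The node table, its columns and the helper names are all made up here; no real dbfs is claimed to be laid out this way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node ("
    " id INTEGER PRIMARY KEY,"
    " parent_id INTEGER REFERENCES node(id),"
    " name TEXT NOT NULL,"
    " is_dir INTEGER NOT NULL,"
    " mtime TEXT)"
)
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?, ?, ?)",
    [
        (1, None, "home", 1, None),
        (2, 1, "foo", 1, None),
        (3, 2, "thesis.tex", 0, "2003-09-01"),
        (4, 2, "music", 1, None),
        (5, 4, "song.ogg", 0, "2003-09-04"),
        (6, 2, "todo.txt", 0, "2003-08-15"),
    ],
)


def children(conn, parent_id):
    """Hierarchical view: list the contents of one 'directory'."""
    sql = "SELECT name FROM node WHERE parent_id IS ? ORDER BY name"
    return [r[0] for r in conn.execute(sql, (parent_id,))]


def full_path(conn, node_id):
    """Rebuild a classic path by walking parent links with a recursive query."""
    row = conn.execute(
        """
        WITH RECURSIVE chain(id, parent_id, path) AS (
            SELECT id, parent_id, name FROM node WHERE id = ?
            UNION ALL
            SELECT n.id, n.parent_id, n.name || '/' || chain.path
            FROM node n JOIN chain ON n.id = chain.parent_id
        )
        SELECT path FROM chain WHERE parent_id IS NULL
        """,
        (node_id,),
    ).fetchone()
    return "/" + row[0]


def by_date(conn):
    """Lifestreams-style view: ignore the hierarchy, newest file first."""
    sql = "SELECT name FROM node WHERE is_dir = 0 ORDER BY mtime DESC"
    return [r[0] for r in conn.execute(sql)]
```

Same rows, three views: children(conn, 2) browses like a folder, full_path(conn, 5) gives the familiar path, and by_date(conn) is the date-sorted stream.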
Re:Ah yes, the infamous relational filesystem... by PurplePhase · 2003-09-05 07:50 · Score: 1

The hierarchical portion would be a view on the relational portion (if I were designing it). If they want that view to be a View, I'm not yet sure how they would do that as a reliable, unique transformation.

8-PP
Re:Ah yes, the infamous relational filesystem... by Anonymous Coward · 2003-09-05 18:23 · Score: 0

Did anyone actually read the post they are responding to? There's now half a dozen posts pointing out that you can model a hierarchical system in the relational model. Guess what, that's bloody obvious and the original poster noted this in his second-to-last paragraph.

Reiser v4 by cryonic*angel · 2003-09-05 02:16 · Score: 1

Isn't ReiserFS going to address the same feature set, but from the other direction, with balanced B-Trees?

--
I knew then, knew utterly,
the deal done in my heart forever,
though how I knew not,
nor ever have.

What ever happened to KISS by mary_will_grow · 2003-09-05 02:18 · Score: 1

Keep it simple, stupid.

I wonder what the computer scientist to computer user ratio is. Too high.

--
Why stick up for big business?

Re:What ever happened to KISS by Anonymous Coward · 2003-09-05 02:51 · Score: 0

They got old and put their makeup back on.
Sorry. Had to be said.

Amazing idea...I am working by MagicBox · 2003-09-05 02:22 · Score: 1

on something that is based on that same idea. It's based on Windows blah (do not flame me pls) using SQL server. Our company has scanned all their paperwork into .jpg files and I've stored them on a large HD. Right now we are at about 35GB worth of files, with a directory structure that's getting deeper and more difficult to deal with. When employees want to access some information, they have to go browse through directories and directories to get to that file. People are complaining that not only can they not locate files (some do not know how to), but it takes them up to 5 minutes to locate one file, while the customer is on hold. Now on a slow day it's ok, but when you have a lot of calls in the queue it's not good. So what I did is I wrote a program that searches through all the directories and adds the path of the file to a database, adds the file name to another field and any other file info to a third field. A simple interface will let you search the DB and list any matching files, where you can click on one and open it up. Now is it easier to find 'John Doe 123456.jpg' buried amongst 30,000 directories or just type John Doe and get your file?? Windows search crashes or takes forever to iterate through all the directories. What Gnome is doing would make the life of so many people so much easier. Probably convert quite a few Closed Source customers to Open Source ones as well, I am sure.
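
The program described above boils down to roughly the following. This is a sketch in Python with SQLite rather than the SQL Server setup actually used; the scans table, build_index and search are names invented for the example, and the "other file info" field is filled with the file size just to have something there.

```python
import os
import sqlite3


def build_index(conn, root):
    """Walk `root` and record path, file name and some file info per file."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scans "
        "(path TEXT PRIMARY KEY, name TEXT, info TEXT)"
    )
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = f"{os.path.getsize(path)} bytes"
            conn.execute(
                "INSERT OR REPLACE INTO scans VALUES (?, ?, ?)",
                (path, name, info),
            )
    conn.commit()


def search(conn, text):
    """Return paths whose file name contains `text`.
    SQLite's LIKE ignores case for ASCII; % and _ in `text` are not escaped."""
    pattern = f"%{text}%"
    sql = "SELECT path FROM scans WHERE name LIKE ? ORDER BY path"
    return [r[0] for r in conn.execute(sql, (pattern,))]
```

Typing "John Doe" into a front end would then come down to search(conn, "John Doe"). Keeping the index in step with files that get added or moved later is the part this leaves out.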

--

The phaomnneil pweor of the hmuan mnid. Fcuknig amzanig eh!

Re:Amazing idea...I am working by codepunk · 2003-09-05 02:33 · Score: 1

Damn that is just genius, I wish I could have thought of such a thing. Course I would opt to not even piss around with building something to do this when I could have saved the company a ton of money and development time and just downloaded htdig.

--

Got Code?
Re:Amazing idea...I am working by Anonymous Coward · 2003-09-05 02:37 · Score: 0

Why are you wasting your time and your company's money? Just buy an AS/400 for which this feature has already been implemented.

Enjoy,
Re:Amazing idea...I am working by MagicBox · 2003-09-05 05:10 · Score: 1

Well, first of all, management will not just spit out $50,000.00 for an AS/400 just like that. Secondly, your statement "Why are you wasting your time and your companies money" is too blunt, when you do not even know the type of business, the infrastructure or anything else about the business. What you are telling me to do then is this:

My Solution:
Time it takes to build the search program: 5 working days
Time it takes to build the database to store the info: 1 day
Time it takes for the interface, connections to the DB and testing: 5 working days
Two weeks for everything to work. We already have the SQL server and the tools to build the programs.
Company money spent: None.
Flexibility to use the program by as many people as needed: Unlimited.
Change it and modify it according to company's business rules: Extremely easy.
Other benefits: Yes.

Your suggestion:
Buy an AS/400 with all the storage capacity and power needed: $50,000.00
Hire an AS/400 Admin: $50,000.00
+ Unlimited Licences: I have no clue, but I'll say $5000.00 (to leave it on the cheap side).
Flexibility of the custom made program: Not even close.
These are only a few of the downsides.

Summary: around $100,000.00 + unlimited headaches, and nowhere near the flexibility, power, ease of use or ease of maintenance of the custom written app. I think I'll stick with my solution.

--

The phaomnneil pweor of the hmuan mnid. Fcuknig amzanig eh!
Re:Amazing idea...I am working by MagicBox · 2003-09-05 05:12 · Score: 1

htdig?? Do you know one that works with Windows? Why is your solution going to "save" my company money? I mean I am open to suggestions, but I see yours as a useless rant rather than an actual solution.

--

The phaomnneil pweor of the hmuan mnid. Fcuknig amzanig eh!

Fix the GNOME file dialog first!! by Lispy · 2003-09-05 02:24 · Score: 1

and you will be set. Look, my Mom and most coworkers here never use a filemanager. They work in their home directory on the LAN and open and save their files from whatever application they are using. Personally I found that most of them use Outlook as their "Filesystem" of choice since they can search and associate comments with files (aka attachments). If the Gnome FileDialog were innovative and functional (I love GNOME, but this special part is a pita), it could replace all Database/FS overlays. It's neat to search for files in plaintext, but after all it's just another Nautilus feature and not a whole new Filesystem approach.
Just my opinion...

cu,
Lispy

Re:Microsoft Attempts for decade,GNOME Does in mon by fault0 · 2003-09-05 02:25 · Score: 2, Insightful

Microsoft didn't really put in much investment in Cairo after it was pretty apparent that nobody really cared for it at the time. Most people really don't like novel ways of doing things. There is too much investment in the old ways atm. I guess if the world were different, we would all be using Microsoft Bob right now.

So, I think this GNOME thing will also fizzle out after a while.

someone tries this every five years by peter303 · 2003-09-05 02:25 · Score: 1

I've lost track of the names and attempts, but this has been tried a large number of times over the past decades. It's a good idea to integrate a DB into the OS. However, computer users seem rather conservative and stick to the old tried-and-true ideas. Look at how attached we are to the 36-year-old Engelbart windows GUI, when many alternatives have been tried.

Re:Microsoft Attempts for decade,GNOME Does in mon by chicogeek · 2003-09-05 02:28 · Score: 1

Hmm, not quite super star. SQL Server is not an object oriented database, so I highly doubt that WinFS will be.

Re:AWESOME! by tomhudson · 2003-09-05 02:36 · Score: 1

poster wrote:

Name one thing Microsoft innovated that is revolutionary.

Security through obscurity?

What if MS has a patent on this? by Anonymous Coward · 2003-09-05 02:38 · Score: 0

shouldn't the Gnome people protect themselves now before we get into another "X corp. vs OSS" battle?

BTW: Notice I didn't use Caldera's "new name"

or simply because it's a chicken and egg problem? by Petronius · 2003-09-05 02:39 · Score: 1

a DB-backed filesystem is a genius idea until someone asks: where should the database write its data files? ah, yes, the filesystem!

I think it's become the perpetual motion problem of the software industry.

--
there's no place like ~

Built for gnome? by adrianbaugh · 2003-09-05 02:39 · Score: 2, Insightful

Is it just me that sees this as a really bad idea? Nothing against the gnome project, you understand, but I see no earthly reason why a filesystem should require X Windows, let alone a full-blown desktop environment. Surely this kind of thing should be a kernel-level project which userspace tools can hook into as needed, whether from gnome or KDE or the CLI?

Anyway, I thought Reiser4 was doing exactly what this promises, but with the advantage of a proven track record on high-performance filesystems. Perhaps, if gnome wants this kind of functionality, they should base it on Reiser4 which will at least be widely-used and not locked into the gnome project.

--
"'I pass the test,' she said. 'I will diminish, and go into the West, and remain Galadriel.'"
- JRR Tolkien.

Re:or simply because it's a chicken and egg proble by Azghoul · 2003-09-05 02:46 · Score: 1

If the user never sees that part of the filesystem directly, does it still count as a "filesystem", per se?

Also, it should be possible to build this thing directly on to drive partitions, in the same way Oracle can. Do that, and it's really not a "filesystem" in the same way we normally think of one, is it?

Re:Won't compile :( by ajs318 · 2003-09-05 02:55 · Score: 1

It just means you've got a missing library. That is not as fatal as the error messages may lead you to believe. See, the error messages don't just say "it's broken" and the more messages you see, the worse broken it is. In fact, the messages actually tell you how to fix the problem, if you have achieved a sufficient degree of oneness with the hardware.

It was all going fine up to the point where we saw storage-item.c:7:44: libpq-fe.h: No such file or directory. Beyond there, the errors are going to mount up and up unavoidably.

In my experience, any package that wasn't put together by the distribution maintainers {and not a few that were} is likely to be a little problematic. Simply because it's not easy to remember what programme referenced what header files when you were compiling it. When the project reaches maturity and is released as a standard package, then the dependencies can be checked in earnest, so any library you need will be installed for you by the package manager; and probably everything that can be precompiled before building up the package will be. Only stuff that absolutely has to be compiled in-situ will be. In fact, many packages contain only binaries.

You haven't stated what distribution you're using, so I can't say for certain, but try to find a package called libpq-devel or something like that. In general, if there are two packages with the same name but one ends in -devel or -dev, that's the "extra stuff" likely to be needed by developers {and it will have a dependency to automatically install the other one anyway}. These packages contain the header files {*.h} for the pre-compiled binaries in the main package. {The .h forms part of the source code, and is needed when you are trying to link one programme to use functions declared in another programme.}

It took me a few evenings of hair-tearing to work this one out, but you just have to remember the computer hasn't got an initiative to use, so it relies on yours. Now I install the -devel version of anything just on general principle.

--
Je fume. Tu fumes. Nous fûmes!

Too much overhead? by mnmn · 2003-09-05 02:55 · Score: 1

The very first applications will be metadata of various mime file formats (not that everyone will follow it. Someone capturing a video to file will not have time to fill in all those blanks author title etc). The second application will be network access but that will be tricky. New permission fields including of course ACLs that include hosts as well as users and possibly md5 hashes etc will also be included.

Now with all the additional information (~10% of the file for smaller files), that adds to the space. The database engine adds to the overhead too. Does this mean we can no longer install RedHat on a Pentium 90MHz and use it as a firewall, mail server, file server, web server and other things in between? I know I won't be in a rush to use such a filesystem in a production server. The conceptual overhead is in itself too much. I'm used to the hier(1) structure and sharing files using UNIX permissions through NFS and SMB. Took me 8 years to get really used to it all.

--
"Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky

going OT: library organizational schemes by mforbes · 2003-09-05 02:57 · Score: 2, Informative

An OT side-note:
Just as each of us has our own organizational scheme for our own bookshelves, libraries tend to vary more than we think too.

Just about every school and community library you'll find uses Dewey Decimal, of course, but others have other schemes.
For instance: the Library of Congress, in order to conserve space on their shelves, orders their books by size. (No, I'm not kidding. Look it up.) The directory is computerized, of course, so aside from the inconvenience of having same-topic volumes wildly separated in space, it's not a big deal for them.

--

Allegedly real newspaper headline from 1998:
Man Struck by Lightning Faces Battery Charge

Re:going OT: library organizational schemes by rgmoore · 2003-09-05 08:08 · Score: 1

For instance: the Library of Congress, in order to conserve space on their shelves, orders their books by size. (No, I'm not kidding. Look it up.) The directory is computerized, of course, so aside from the inconvenience of having same-topic volumes wildly separated in space, it's not a big deal for them.

And this works just fine because they have closed stacks, so the only people who have to worry about that particular detail are professionals who know how to find things there. All that you need to know when you're trying to get a book is enough information for them to find it in their computer, like the author and title, ISBN, or the like, and then they do the hard work of finding the physical volume. In many ways this is very similar to the idea being presented in Storage. You, the user, don't need to understand exactly how the file is categorized on the disk. Instead you only need to provide enough information to be able to specify the file- or a list of similar files that you can then look through to find the exact one you want- and the computer then does the heavy work of actually figuring out where the thing is.

--
There's no point in questioning authority if you aren't going to listen to the answers.

Abstraction layer already exists... by Lodragandraoidh · 2003-09-05 02:57 · Score: 1

Modern RDBMSs already have the ability to store blobs. They can also store pointers to external files. Why not keep your meta data in the database, and point to the files in the file system? This could be fully automated in your browser, for example, so you would 'check-in' information and files you want to keep track of. On a very simple level, you could register random scribblings, categorizing and commenting on them in ways that are impossible with current file systems (objects having multiple categories associated with them, categories and sub category relations that are not strictly trees, etc..)

I think we already have the tools, we just need some smart people to build the applications to leverage it (ok smart people! get to work!)
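
As a rough idea of how small the core of it is, here is a sketch with Python's sqlite3 module: metadata and comments live in the database, the files stay where they are in the file system, and one item can sit in several categories at once. Table names, check_in and items_in are invented for illustration, and the non-tree category/subcategory relations are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE item (
        id INTEGER PRIMARY KEY,
        path TEXT UNIQUE NOT NULL,   -- pointer to the file, not the file itself
        comment TEXT
    );
    CREATE TABLE category (
        id INTEGER PRIMARY KEY,
        name TEXT UNIQUE NOT NULL
    );
    CREATE TABLE item_category (     -- many-to-many: no single "parent folder"
        item_id INTEGER NOT NULL REFERENCES item(id),
        category_id INTEGER NOT NULL REFERENCES category(id),
        PRIMARY KEY (item_id, category_id)
    );
    """
)


def check_in(conn, path, comment, categories):
    """Register a file that lives in the normal file system under several categories."""
    conn.execute(
        "INSERT OR IGNORE INTO item (path, comment) VALUES (?, ?)", (path, comment)
    )
    item_id = conn.execute("SELECT id FROM item WHERE path = ?", (path,)).fetchone()[0]
    for name in categories:
        conn.execute("INSERT OR IGNORE INTO category (name) VALUES (?)", (name,))
        cat_id = conn.execute(
            "SELECT id FROM category WHERE name = ?", (name,)
        ).fetchone()[0]
        conn.execute(
            "INSERT OR IGNORE INTO item_category VALUES (?, ?)", (item_id, cat_id)
        )
    return item_id


def items_in(conn, category):
    """List the paths of every item filed under `category`."""
    rows = conn.execute(
        """
        SELECT item.path
        FROM item
        JOIN item_category ON item.id = item_category.item_id
        JOIN category ON category.id = item_category.category_id
        WHERE category.name = ?
        ORDER BY item.path
        """,
        (category,),
    )
    return [r[0] for r in rows]
```

A scribbled note checked in under both "work" and "ideas" then shows up in either listing without being copied or symlinked anywhere.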

--

Lodragan Draoidh
The more you explain it, the more I don't understand it. - Mark Twain

SQL Server Desktop Engine by illsorted · 2003-09-05 03:01 · Score: 2, Informative

My guess is that they'll use MSDE, which is already freely available and "royalty free". I think it's basically just the core of SQL Server without any of the extra tools.

Re:SQL Server Desktop Engine by Anonymous Coward · 2003-09-05 07:48 · Score: 0

MSDE is NOT freely distributable unless you've already bought various Msoft developer tools.

Follow your own damn link and click on "How to Obtain"

I think we already have this... by TClevenger · 2003-09-05 03:02 · Score: 1

Windows 2000/XP users already have this. It's called "Indexing Service." Every file has a plethora of metadata fields you can fill in, and Indexing Service allows you to do complex searches on this data.

Okay, show of hands: who uses the Metadata fields? Anyone? Anybody even change their Excel username, or is it still the same "User" or "I.P. Freely" they used when they installed the software? Anyone? Thought not.

Re:Microsoft Attempts for decade,GNOME Does in mon by spektr · 2003-09-05 03:10 · Score: 2, Funny

Alarm bells ringing at Redmond: now they are even copying our vaporware! Push back the release dates, hire some programmers, we have to actually implement this stuff!

globals by leehwtsohg · 2003-09-05 03:14 · Score: 1

I don't agree.

It seems to me that this has the same benefit as having globals in programs.
I mean - globals are great: instead of passing a zillion parameters to every subroutine, you just have a global called 'soup', and then if you change soup from 'chicken' to 'noodle' all subroutines that deal with the soup will immediately know that the type changed.

The problem is that it all works nicely when everything works well. But if you try to debug a program, to see how soup got to be 'nou%^%$&*', things are much less nice.

Same with a global data structure instead of files. Imagine you'd have to figure out which program keeps moving your appointments to Monday every time the 29th of April arrives.

Innovative? by Anonymous Coward · 2003-09-05 03:16 · Score: 0

Apparently the author is not familiar with IBM mainframes/MVS/JCL and the various technologies.

Good to know the HIG guy is doing filesystems now by Anonymous Coward · 2003-09-05 03:16 · Score: 0

I've got this great study that shows users find it easier when file paths show the filename first, then the path.

Example: file.txt\username\home\

Users also prefer backslashes to forward slashes. This study will make Storage become a huge success like Gnome 2!

Finally by vjzuylen · 2003-09-05 03:17 · Score: 1

This might just be the most important development in computers since the dawn of the masturbation superhighway. Think about it; this will allow us to archive our pr0n collections like never before. Gone are the headaches of hierarchical filing conflicts. (Should lesbian threesome photoshoots be filed under /threesomes/ or /lesbians/? What if there's toys involved? Which fetish takes precedence?) Gone too, are the ever-deepening directory structures required to keep up with the diversification of smut. (At what point should separate directories be created for American and Japanese bukakke? After how many files/megabytes? Where to put the German variety of bukakke?)

My friends, we are entering a new golden sho^H^H^H age.

--

Hee-hee. Dying tickles!

-1: Slashdotism by oPless · 2003-09-05 03:17 · Score: 1

I for one welcome our new storage masters.

i'm working on something like this... by gimpboy · 2003-09-05 03:18 · Score: 1

it's webbased written in perl with everything stored in a postgres database. i found that with thousands of multimedia files (mp3's, avi's, images, etc.) it was impossible to find the stuff when i want it. now i can find an image from my 2001 trip to nebraska by issuing the following queries:

file_keywords: 2001 and file_keywords: nebraska and mime_type:image

some of this information can be extracted in an automated fashion (like mime types can be found through file::magic), the rest of this information needs to be entered at some point, but once the files have been described it can find things pretty quickly. also i can group frequent queries together using lists. so the list:

images::vacation::nebraska can be associated with the above query making it easy to generate relevant hierarchies. so that some of the same images stored in images::vacation::nebraska can also be found in images::produce::corn.
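The kind of query described above ("all of these keywords, and this mime type") can be sketched roughly as follows. This is not the Perl/Postgres code from the post; it is an illustration in Python with SQLite, and the table layout and the names make_db, add_file and search are assumptions.

```python
import sqlite3

def make_db():
    """In-memory stand-in for the real database."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE file (id INTEGER PRIMARY KEY, name TEXT, mime_type TEXT);
        CREATE TABLE file_keyword (file_id INTEGER, keyword TEXT);
    """)
    return db

def add_file(db, name, mime_type, keywords):
    """Describe one file: its mime type plus free-form keywords."""
    file_id = db.execute(
        "INSERT INTO file (name, mime_type) VALUES (?, ?)", (name, mime_type)
    ).lastrowid
    db.executemany(
        "INSERT INTO file_keyword VALUES (?, ?)",
        [(file_id, keyword) for keyword in set(keywords)],
    )

def search(db, keywords, mime_prefix):
    """Names of files that carry ALL given keywords (at least one required)
    and whose mime type starts with mime_prefix."""
    keywords = sorted(set(keywords))
    marks = ",".join("?" * len(keywords))
    sql = f"""
        SELECT file.name FROM file
        JOIN file_keyword ON file_keyword.file_id = file.id
        WHERE file_keyword.keyword IN ({marks}) AND file.mime_type LIKE ?
        GROUP BY file.id
        HAVING COUNT(DISTINCT file_keyword.keyword) = ?
        ORDER BY file.name
    """
    params = (*keywords, mime_prefix + "%", len(keywords))
    return [row[0] for row in db.execute(sql, params)]
```

A saved "list" is then nothing more than a name mapped to a stored (keywords, mime_prefix) pair, so the same image can turn up under several lists.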

if anyone is interested in using this send me an email. i'm about to put it back into the cvs. i've spent the last few months cleaning stuff up.

--
-- john

database based filing systems by ajs318 · 2003-09-05 03:20 · Score: 1

I was thinking of something similar myself. Only I would have done it with KDE as a frontend and MySQL as a backend. Or even right at the filing system level, with a bunch of extended attribute tags. Beyond that, it's the same thing. No confusing directory/file hierarchy; files arranged thematically. So every mp3 file would have its "audio" attribute set, for instance. Now when you want to look for mp3 files you don't have to remember "did I put it in /home/ajs318/files/music/mp3/levellers_the/hello_pig/ or in /home/ajs318/songs/rock_pop_indie/levellers/albums/hello_pig/" kind of stuff. Because all files would be in ONE directory, and the mechanism for isolating you from files you aren't bothered about and would get in the way doesn't depend on putting them somewhere else.

I guess the real-life analogy would be a collection of clutter-resistant spectacles that blank out objects not of the type you want to find. So when looking for clean underwear, I need only put on some glasses through which I can see only items of clothing in order to narrow my search, rather than looking in a specific place for specific items. Hmm. Actually those would rock bells .....

--
Je fume. Tu fumes. Nous fûmes!

No, no, no!!! by master_p · 2003-09-05 03:25 · Score: 2, Interesting

No, this isn't what is needed. Hierarchical object-oriented persistent object trees is what is needed.

Let me explain.

Information in real life is organized in trees. It is obvious anywhere one can look. From the organizational chart of a company to the chair that you are sitting on, everything is a tree; a tree of information, where each little piece of information consists of other pieces of information.

If you check computer applications, almost every application contains some sort of tree. Take a Word document, for example: the master document, the contents, the heading 1 and subheading paragraphs, the pictures, the drawings. Everything is a tree, and the document can be browsed as a tree.

Take your favorite mail client. Information is organized in a tree: inbox, outbox, sent, trash; each of these contain an e-mail. An e-mail itself is composed of subject, body, attachments. The body consists of paragraphs; the attachment consists of files.

Take your favorite drawing program: the picture consists of layers; each layer consists of shapes; each shape may consist of other shapes.

Take your favorite 3d/cad program: a 3d object consists of other 3d objects.

Take the gui: a window consists of other windows.

Relational databases don't provide tree organization. I don't want a freaking flat table to store my documents. I want to organize them in trees. That's why a filesystem has subdirectories.

The biggest problem of the current filesystem technology is that a 'file' is as dumb as it gets: it is just a collection of bytes, waiting to be manipulated by some other program. It is even untyped, for God's sake!!! One program may view it as a series of bytes, another program may view it as text, another program may view it as code!!! The file itself can't tell you anything about its properties, about its contents, about the way it is supposed to function....

If a file could tell the outside world how to be operated, then the world would be a much better place. If a file could tell me its properties, if it provided me with the tools to manipulate it, then I would make any type of app that processes the file as needed.

The above is essentially object orientation on the filesystem level. RDBMSs don't offer this kind of functionality!!! At best, an RDBMS offers an index on a key for quick searching, and that's it!!! There is no notion of a tree, nor does each file expose its properties/methods/functionality to its users!!!

So I say a big 'NO' to relational filesystems. We can immediately move to the upper level:

1) each node of information is AN OBJECT.

2) the object specification is defined at the filesystem level. Much like COM or .NET or CORBA.

3) each object can contain other objects, if it inherits and implements a specific interface.

4) each object is PERSISTENT. The filesystem takes care of persistence, according to attributes of the object's fields. Complex objects that are composed of other objects are also managed in the same way.

5) the parent object provides the storage implementation. The storage implementation would be object-oriented!!! An object could implement an RDBMS-like storage capability with indexes, keys, etc.

At each given time, the information model inside the computer could be:

1) split across multiple computers.

2) shared by multiple users.

3) checked for security in ONE place, inside the operating system.

4) provided as a framework to programming languages.

5) replicated across sites with minimum of code

6) a unified GUI could handle it

7) searching through it will become a breeze!!! (for example show me all MP3 with artist = Elvis and title = rock)

After all, 90% of the programming goes to load/store and display information. It is silly not to provide a unified mechanism for that. And a simple SQL-based RDBMS does not cut it.

Re:No, no, no!!! by Ranger+Rick · 2003-09-05 03:28 · Score: 1

How does this stop you from having metadata describing its place in a hierarchy?

This gives the ability to treat the data as a superset of a hierarchical layout, not in lieu of.

--
WWJD? JWRTFM!!!
Re:No, no, no!!! by TheShadow · 2003-09-05 03:50 · Score: 1

I agree and have been thinking about such a system for a really really long time. I just haven't worked all the kinks out of my design yet... but when I do, I'll start coding that bad-boy.

--
"What do you want me to do? Whack a guy? Off a guy? Whack off a guy? Cause I'm married."
Re:No, no, no!!! by Cnik70 · 2003-09-05 04:16 · Score: 1

nice answer... someone please mod it up!

--
-Cnik
Re:No, no, no!!! by poofmeisterp · 2003-09-05 04:18 · Score: 1

I'm sure glad SOMEONE else gets it. *bangs head on desk*
Re:No, no, no!!! by Minna+Kirai · 2003-09-05 06:38 · Score: 1

Information in real life is organized in trees. It is obvious anywhere one can look.

Trollish. Sometimes by appealing to "obviousness", you can trick readers into not considering what you're saying.

Now the idea that Object-Orientation is useful is correct, but don't say that everything should be stored in trees. Look at the major OO languages- do they have strictly treelike class derivation, or is multiple-inheritance allowed?

From the organizational chart of a company to the chair that you are sitting on, everything is a tree;

No it's not. You could view everything as being in MULTIPLE trees, but not in one tree. The corporate organization is one tree for chain of command, another for geographical location, others for education/specialty/contract/gender...

Any attribute of objects which can answer a yes/no question is a way to assign it to a tree branch.

And a simple SQL-based RDBMS does not cut it.

That's why there are actually new projects going on. Nobody stated "Here's MySQL, go throw out your filesystem and try it instead"
Re:No, no, no!!! by master_p · 2003-09-05 10:57 · Score: 1

Trollish. Sometimes by appealing to obviousness, you can trick
readers into not considering what you're saying

Nope. You did not think well enough. Please read on.

Now the idea that Object-Orientation is useful is correct, but don't say
that everything should be stored in trees. Look at the major OO languages- do
they have strictly treelike class derivation, or is multiple-inheritance
allowed?

Multiple inheritance does not have to do anything with storage. Major OO
languages lack tree structures, but Java is full of trees and supports
model-view-controller nicely(the model in this case is the tree). Not to mention
that Java itself has the notion of trees embedded in it (the code is organized
in packages, in tree fashion!!!).

No it's not. You could view everything as being in MULTIPLE trees, but not
in one tree. The corporate organization is one tree for chain of command,
another for geographical location, others for
education/specialty/contract/gender...

Your perception ability is limited. Everything belongs to a tree. The
universe is the root node. Then we have galaxy clusters. Then galaxies. Then
solar systems. Then planets. Then the Earth. Then ground and sea. Then cities.
Then offices. Then your office. Then the papers on your office. Then little
paragraphs on each paper...each paragraph contains words...That's a nice example
of a tree. If you think about it, everything belongs to a tree.

And it is easy to countersay your comment, since a tree can exist into
another tree. Multiple trees can exist into other trees.

That's why there are actually new projects going on. Nobody stated
Here's MySQL, go throw out your filesystem and try it instead

But why should we go through the database step since it is so f*******
obvious that what we need is an object oriented information system?
Re:No, no, no!!! by Minna+Kirai · 2003-09-06 09:57 · Score: 1

Your perception ability is limited. Everything belongs to a tree.

Your perception ability is limited if you think everything belongs to ONE TREE.

And it is easy to countersay your comment, since a tree can exist into
another tree. Multiple trees can exist into other trees.

If you think that, it explains why you are so wrong. From a strict Computer Science standpoint, the very definition of "tree" means that they cannot "exist into other trees".

I think you are confused by Anonymous Coward · 2003-09-05 03:26 · Score: 0

The relational model is used to express the relationships between different elements and types of data. There is no a priori reason for a purely relational filesystem to use a schema that is unintuitive to navigate over a schema that is intuitive to navigate.

The only thing to watch out for is the implementation. I'll gladly admit that a purely relational filesystem could be used to implement a schema that is utterly befuddling. Of course, I've also seen more than a few hierarchical file systems that were implemented in such a way that it is counterintuitive and painful to navigate through.

What the relational model really offers over the hierarchical model is flexibility. (A purely relational model would also offer proof of correctness, but I can't think of very many applications for a filesystem where queries and such need to be mathematically proven.) With a relational filesystem one can more easily simultaneously implement models in addition to the hierarchical model.

Re:or simply because it's a chicken and egg proble by codefool · 2003-09-05 03:31 · Score: 1

The filesystem would maintain the database on raw disk sectors and not in 'files' per se. Otherwise it's just an abstraction layer employing a database engine to provide indexing services.

While this idea is interesting, I find the whole concept of indexing problematic, since the user is required to make all kinds of decisions about how to classify an object, which is not unlike having to decide upon a path name. In the paper, the author gives the example of music by U2. By magic numbers the filesystem should be able to identify a 'music' file, and by embedded headers a file with a group 'U2'. But this must necessarily be specified at object creation.
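To make the "embedded headers" part concrete, here is a rough sketch that pulls the artist out of an ID3v1 tag, the fixed 128-byte block that many MP3 files carry at the very end (so strictly a trailer rather than a header). The function name parse_id3v1 is an assumption, and the sketch ignores ID3v2 entirely.

```python
def parse_id3v1(data: bytes):
    """Return title/artist/album/year from an ID3v1 tag, or None if absent.

    ID3v1 layout: the last 128 bytes of the file, starting with b"TAG",
    followed by fixed-width fields padded with NUL bytes or spaces.
    """
    if len(data) < 128 or data[-128:-125] != b"TAG":
        return None
    tag = data[-128:]

    def text(raw: bytes) -> str:
        # Cut at the first NUL, then drop space padding.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
    }
```

The catch raised above still applies: this only works if whoever created the file filled in the tag in the first place.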

It seems to be trading one form of obfuscation for another...

--
"Stop whining!" - Arnold, as Mr. Kimble

Hmmmmmm. by SatanicPuppy · 2003-09-05 03:32 · Score: 1

If there was any way to make Windows less reliable, it would have to be coupling it with the beast that is MS SQL Server.

--
ad logicam Claiming a proposition is false because it was presented as the conclusion of a fallacious argument.

A Relational zealot responds by MagicMerlin · 2003-09-05 03:35 · Score: 1

I think you do not understand relational databases very well. You are confusing data organization with presentation.

create table folder
(
    id int,          -- this folder
    parent_id int    -- the folder that contains it (NULL at the top)
);

Hierarchical databases at best are very slightly better than relational databases when asking for data in a preset implementation and much worse at everything else (ad hoc queries for example).

The problems with 'relational' databases are mostly due to limitations of SQL (especially recursive querying), not to relational organization itself. The 2 dimensional presentation of result data is an artifact of SQL, not relational technology.
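As a sketch of how the folder table above can still answer tree questions: later SQL dialects added recursive common table expressions (WITH RECURSIVE), which SQLite supports. The example below uses Python's sqlite3 module; the name column and the helpers make_folders, path_of and descendants are additions for illustration, not part of the table as posted.

```python
import sqlite3

def make_folders(rows):
    """rows: (id, parent_id, name) tuples; parent_id is None for a root."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE folder (id INT PRIMARY KEY, parent_id INT, name TEXT)")
    db.executemany("INSERT INTO folder VALUES (?, ?, ?)", rows)
    return db

def path_of(db, folder_id):
    """Walk parent_id links upward and return the full path of a folder."""
    rows = db.execute("""
        WITH RECURSIVE chain(id, parent_id, name, depth) AS (
            SELECT id, parent_id, name, 0 FROM folder WHERE id = ?
            UNION ALL
            SELECT f.id, f.parent_id, f.name, chain.depth + 1
            FROM folder AS f JOIN chain ON f.id = chain.parent_id
        )
        SELECT name FROM chain ORDER BY depth DESC
    """, (folder_id,)).fetchall()
    return "/" + "/".join(row[0] for row in rows)

def descendants(db, folder_id):
    """Ids of every folder below folder_id, at any depth."""
    rows = db.execute("""
        WITH RECURSIVE sub(id) AS (
            SELECT id FROM folder WHERE parent_id = ?
            UNION ALL
            SELECT f.id FROM folder AS f JOIN sub ON f.parent_id = sub.id
        )
        SELECT id FROM sub ORDER BY id
    """, (folder_id,)).fetchall()
    return [row[0] for row in rows]
```

The same two-column table serves both directions of the walk; nothing about the storage is tree-shaped, only the query is.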

Merlin

but what about my shell scripts? by mikeee · 2003-09-05 03:36 · Score: 1

So, can I have a shell script manipulating these files when I'm not logged in? No? No thanks.

What is needed, IMAO, is a tool to mount Gnome/KDE vfs 'filesystems' as, well, filesystems.

Words words words... by hugerobot · 2003-09-05 03:40 · Score: 1

Isn't it asking a lot to expect today's point and click users, who are used to clicking on folders in MS Explorer (or to a lesser degree Konqueror/Nautilus) to suddenly start typing strings of search criteria to retrieve files? For that matter, what in the world would this type of filesystem even look like in a file manager type window?

How about ReiserFS? by Pflipp · 2003-09-05 03:44 · Score: 1

They claim that the database be replaced by the filesystem.

Should we just put them in a room together and let 'em fight out, or anticipate for a middle-way solution (e.g. ReiserFS being the storage system for Storage) before there is no middle way anymore?

--
"We can confirm that Debian does *not* ship the version with the trojan horse. Our version predates it." [CA-2002-28]

community. by gimpboy · 2003-09-05 03:45 · Score: 1

find a bunch of people who are interested in sorting things like music, make it web based and give out accounts. it's what i did and it's worked pretty well.

--
-- john

True... But by overunderunderdone · 2003-09-05 03:48 · Score: 1

True, users won't generally fill out a bunch of meta-data. BUT, a lot of useful meta-data won't require the user's input.
First off, a lot of meta-data can be generated from the raw data itself, or is already generated and present in the files but not available to the file system. For instance, it might be nice to sort images by orientation, or by width, or by total pixel size rather than by file size. With a little work most graphic apps could probably automatically add dominant color.
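As an illustration of meta-data that is already sitting in the file, here is a small sketch that reads width and height out of a PNG header and derives orientation and pixel count from them. It relies only on the published PNG layout (8-byte signature, then an IHDR chunk whose first two fields are big-endian width and height); the function name png_metadata is an assumption, and other image formats would each need their own reader.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_metadata(data: bytes):
    """Return width, height, pixel count and orientation of a PNG, or None."""
    # Bytes 8-11 hold the chunk length, bytes 12-15 the chunk type (IHDR).
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        return None
    width, height = struct.unpack(">II", data[16:24])
    if width > height:
        orientation = "landscape"
    elif width < height:
        orientation = "portrait"
    else:
        orientation = "square"
    return {
        "width": width,
        "height": height,
        "pixels": width * height,
        "orientation": orientation,
    }
```

None of this needs the user to type anything, which is the whole point: the indexer only has to read the first 24 bytes of the file.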

Second, data that is professionally produced for you would come with rich meta-data. If you bought your digital copy of E.T. legitimately it would know it's by Steven Spielberg. Tivo has the meta-data on the TV show you are recording. The PDF you downloaded from the IRS or from a corporate web site would also have good meta-data. Or just like CD's and the cddb you probably *can* get that info "from the file itself" (with a little help from an outside data source). I have yet to put a CD into my machine that iTunes didn't automagically know what it was and have fairly decent meta-data for.

Finally, some very useful meta-data could be added semi-automatically or forced on the user. For instance with a db based file system I may no longer be forced to put it into a "directory" (since there would be no such thing) but I may be forced to give it a minimal set of more useful information like what project(s) or client(s) it is associated with.

ST reference by zapp · 2003-09-05 03:50 · Score: 1

Combine this with some decent voice recognition...

"Computer, play something by Mozart"

"Computer, find emails sent to me by Tom, regarding our semester final project" :) it's doable!

--
no comment

Re:or simply because it's a chicken and egg proble by kaisyain · 2003-09-05 03:56 · Score: 1

Why would it put them on the filesystem instead of on a partition of its own? I thought you could already configure Oracle to do this?

The important Q: How well does it work on porn? by zapp · 2003-09-05 03:59 · Score: 1

"Show any girl on girl action using dildos, but not strapons"

It has forever been a problem of mine to organize my collection when you don't have a name for half the girls. If only there were some sort of tool to make some sort of unique id based on non-changable qualities like position of moles, tatoos, or freckles.

--
no comment

Re:Mexican prostitutes by Anonymous Coward · 2003-09-05 04:01 · Score: 0

"Overrated." Lol. Sounds like some cracker with a pencil-dick got mod points this mornin and decided to use 'em in defense of his race and their scrawny little peters.

Good one, Chester. Just remember, while you're down in the basement stroking your keyboards, we're back behind the cotton house giving your wives and daughters the thick, veiny, powerful black cock they so desperately crave.

it will get there, though it might take time. by gimpboy · 2003-09-05 04:02 · Score: 1

If you'd had to type in all the metadata for your own CD collection, you never would have bothered. (Unless you're obsessive or something; most people aren't.)

i cannot argue with this, but when you get back from fiji you can easily annotate the files from your camera with that information. then someday in the future, when you are browsing through them and you see one with a nice waterfall shot, you can update the metadata of that file. with time you would have the ability to find the needle in the haystack so to speak.

the community is a good way to do this also. a group of friends interested in photography, sharing the same database of files, would populate your metadata rather quickly.

--
-- john

Re:it will get there, though it might take time. by Anonymous Coward · 2003-09-05 04:34 · Score: 0

when you get back from fiji you can easily annotate the files from your camera with that information

You're saying that wrong. The correct way to say it is this: if you want to store metadata, you HAVE TO annotate your files. Which is why metadata just isn't very useful. Annotation is hardly ever a justifiable use of time and effort.

a group of friends interested in photography, sharing the same database of files, would populate your metadata rather quickly

Huh? That's about the stupidest thing I've ever heard. First of all, people don't have "photography clubs," and if they did they certainly wouldn't share their pictures. But aside from that, you're still going to have to annotate every picture you take. Unreasonable.

Metadata, therefore, is all but useless.

why the relational model is not right by hansreiser · 2003-09-05 04:11 · Score: 4, Informative

www.namesys.com/whitepaper.html describes why the relational model is not the right one for large heterogeneous stores (filesystems), and describes the approach ReiserFS (a Linux filesystem used mostly in Europe) is taking instead.

Hans

Re:why the relational model is not right by MattRog · 2003-09-08 01:49 · Score: 1

That paper fails to explain exactly how the relational model fails. Remember, the relational model *is* set theoretic and it would be trivial to solve the 'Santa' problem as posed in the article in a truly relational DBMS.

The author makes the fallacy of equating SQL with Relational, so I can see how he might think the relational model can't cut it.

--

Thanks,
--
Matt

This filesystem is far too complicated for most by joe_cisco_was_here · 2003-09-05 04:13 · Score: 1

The potential for problems is also too high. You cannot try to build intelligence into a filesystem yet. We simply are not there. Searching through a database is hard enough because results are not always accurate or timely. Until more progress is made with AI, tree-structured DBs are what the doctor ordered. AI must reach a point where it can interface with the loads of data we face every day. This is why more development time is needed with AI.
Peace out.

--
"I wish everyone would stop quoting stupid nerd crap at the bottom of their signatures" --Curious George

Sounds conspicuously like by ttfkam · 2003-09-05 04:14 · Score: 1

the work done for ReiserFS.

It's nice that someone has made a GUI for it, but I wonder if this functionality truly belongs in a relational database rather than the filesystem level.

Nevertheless, the work done here for Gnome will make the functionality available for everyone with a decent relational database rather than just those with the next generation of filesystem handy (BeFS, WinFS, Reiser4, etc.).

--

- I don't need to go outside, my CRT tan'll do me just fine.

Re:GREAT! If it is done well... by poofmeisterp · 2003-09-05 04:15 · Score: 2

I can see it now... 'penis enlargement guaranteed' popping up at random places in the database.
You'll have to type "I do not want a bigger penis" to remove them all.
Heh.

Not just books by nuggz · 2003-09-05 04:16 · Score: 1

What about law libraries? Professional journals and published papers and articles?

Librarians do a bit more than stock shelves with books sorted by author.

Gnome-Storage vs. WinFS by IGnatius+T+Foobar · 2003-09-05 04:18 · Score: 1

I think this is a lousy idea regardless of whether it's the good guys or the Evil Empire implementing it... but, just like with Mono, I think it's good to have it around. Here's why:

In a couple of years, there are going to be Windows applications that want to make use of WinFS. They're all going to hook into some wacky new API that talks to an arbitrary new type of data store.

What happens when the good Linux folks (say, the WINE developers) want to implement that API? They've got to put the data somewhere. Perhaps they'll decide "ok, we'll write WINFSAPI.DLL as a shim between the Windows API and Gnome-Storage." It would sure beat doing a bunch of ugly hacks to the local filesystem.

--
Tired of FB/Google censorship? Visit UNCENSORED!

Re:Microsoft Attempts for decade,GNOME Does in mon by Anonymous Coward · 2003-09-05 04:18 · Score: 1, Interesting

You mean gnome steals an idea that MS has been working on for years (And never brought out because NOBODY wanted it)

Sounds pretty typically O.S. to me. Someone ELSE innovates, decides the time isn't right. O.S. steals the idea, comes out with it anyway, claims innovation. :(

One main problem with databases by lcsjk · 2003-09-05 04:21 · Score: 1

The user has to enter keywords to be searched on later. The present system automatically adds directory, name, type (extension), date and time categories (keywords). Getting the user to always save with useful keywords (think 12-year-old boy) is a problem that cannot be solved by the computer.

Taking the hard way... by windex82 · 2003-09-05 04:40 · Score: 1

any reason why using clear, concise names and the operating system's built-in file search tool wouldn't work?

Seems like a bunch of work to make things more complicated to me... I keep hearing about ppl with a couple thousand files who have a hard time finding things... i've got 7,000 (60g) songs ripped from my own cds and never have a problem finding any.. why? i used a clear, consistent naming scheme (artist - track number - track name). when i want to find a song, i click on the tools menu, then find file, and search for *keyword* from what i'm looking for, and a few seconds later a list of everything with that keyword shows up.
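For comparison with the database approaches, the naming-scheme route can be sketched in a few lines using Python's standard fnmatch module. The helper names find and parse_name and the sample file names in the note below are made up; the scheme itself (artist - track number - track name) is the one described above.

```python
from fnmatch import fnmatchcase

def find(names, keyword):
    """Case-insensitive *keyword* search over plain file names.
    Note: characters such as * ? [ in keyword act as wildcards."""
    pattern = f"*{keyword.lower()}*"
    return [name for name in names if fnmatchcase(name.lower(), pattern)]

def parse_name(name):
    """Split 'artist - track number - track name.ext' into its fields."""
    stem = name.rsplit(".", 1)[0]
    artist, track, title = (part.strip() for part in stem.split(" - ", 2))
    return {"artist": artist, "track": int(track), "title": title}
```

With names like "Artist A - 03 - Rainy Day.mp3", find(names, "rainy") returns that file and parse_name recovers the three fields, so the file name is effectively a tiny fixed-schema record. It stops working as soon as a name does not follow the scheme, which is where the DSC389290.jpg case comes in.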

It seems like everyone is taking the hard route, when all that's needed is to teach the users how to use the find 'feature' of their os.

I do realise there are some corporations with gigs upon gigs upon gigs of data that need to be searched through, but from what i've gathered in this discussion the people wanting this are the people who pull all 200 pictures off their digital camera into a single directory and leave the original filename (DSC389290.jpg). if you can't be bothered to change the filename to something readable, what makes you think you're gonna take the time to fill out the several info fields per picture? Do you think this solution is going to be able to read your picture and know that's ted drunk at the company new year's eve party? No, you're going to have to fill in some info about it to be able to search for it, in which case you might as well give it a concise filename and be done with it.

I also believe something new and revolutionary is needed in terms of filesystems to make things easier, but i don't feel that slapping a database on its back is the solution. Heck, at some point with really large data sets the db server is going to need to be handled on a separate machine, talk about a problem waiting to happen.

man file by MenTaLguY · 2003-09-05 04:51 · Score: 1

see file(1). magic works.
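A toy version of what file(1) does, for anyone who has not looked: compare the first bytes of a file against a table of known signatures. The real tool uses a much larger magic database with offsets and nested tests; the short table and the function name sniff below are only an illustration.

```python
# A few well-known leading byte signatures and the type they indicate.
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"ID3", "audio/mpeg"),          # MP3 that starts with an ID3v2 tag
    (b"OggS", "application/ogg"),
]

def sniff(data: bytes, default="application/octet-stream"):
    """Guess a mime type from the first bytes of a file."""
    for prefix, mime in MAGIC:
        if data.startswith(prefix):
            return mime
    return default
```

Reading the first 16 bytes of a file (open(path, "rb").read(16)) is enough for every entry in this table.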

--

DNA just wants to be free...

Voice recognition? by gr8_phk · 2003-09-05 04:58 · Score: 2, Interesting

If I'm to have a natural language interface to find my files, I'd really like to make spoken requests instead of typing a long sentence. Do they have plans for that in GNOME?

I'd just like to have a mountable DB based volume by adrenaline_junky · 2003-09-05 05:00 · Score: 1

What would work for me would be a database that has all of the normal file attributes stored in it, and the directories created dynamically by SQL. Mounted and used transparently, this would be quite handy. Imagine a directory that magically contains all of your MP3 files, no matter what their home directory is. Hmmm... maybe I can set some people to work on this.
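The "directories created dynamically by SQL" part can be sketched without any mounting at all: a virtual directory is just a named query over a table of file attributes. The example assumes SQLite via Python's sqlite3 module, and the names make_index, VIRTUAL_DIRS and list_dir are hypothetical. Exposing the result as a real mount point would need a separate layer (a userspace filesystem, for instance), which is not shown here.

```python
import sqlite3

def make_index(files):
    """files: (path, mime_type, size) tuples describing real files on disk."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE file (path TEXT PRIMARY KEY, mime_type TEXT, size INTEGER)"
    )
    db.executemany("INSERT INTO file VALUES (?, ?, ?)", files)
    return db

# Each virtual directory is nothing more than a stored query.
VIRTUAL_DIRS = {
    "all-mp3": "SELECT path FROM file WHERE mime_type = 'audio/mpeg' ORDER BY path",
    "big-files": "SELECT path FROM file WHERE size > 1000000 ORDER BY path",
}

def list_dir(db, name):
    """Contents of a virtual directory, computed at the moment of listing."""
    return [row[0] for row in db.execute(VIRTUAL_DIRS[name])]
```

Because the listing is computed on demand, an MP3 added anywhere under any home directory appears in "all-mp3" as soon as it is indexed, with no copying or symlinking.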

Middle-Aged Men: by Anonymous Coward · 2003-09-05 05:03 · Score: 0

'Storage' to Replace Traditional Six-pack Abs.

as long as it stays in user space by penguin7of9 · 2003-09-05 05:04 · Score: 3, Insightful

Of course, databases are very useful for organizing user data. People already keep PIM info, images, and lots of other stuff in databases. Lotus Notes is built entirely on databases.

But "replacing the traditional file system" carries with it the notion of ripping ext3 out of the kernel and putting a relational database there. That's a very bad idea. Databases don't belong in the kernel. They are far too inefficient to handle most storage needs, they are far too complex to go into the kernel, and they just don't need to be in the kernel. Operating system kernels need simple, fast storage systems. Something like ext3. ReiserFS is pushing the limits. PostgreSQL would be going too far.

As an aside, this is an idea that just about every nerd has when they learn about databases and retrieval. It's been tried various times since the 1960's. There are probably good reasons why interfaces don't use them. Perhaps most importantly, keep in mind that the vast majority of files on your system are not user files, they are bits and pieces of the operating system. And for the files that actually are used by users (mail, PIM info, images, text, etc.), they usually already have special-purpose database interfaces available to them as part of the applications that users use to access them.

Re:as long as it stays in user space by taradfong · 2003-09-05 06:34 · Score: 1

As an aside, this is an idea that just about every nerd has when they learn about databases and retrieval.

You have shattered my programming dreams forever. I once thought I was special. Here I find out everyone else has had the idea too.

Seriously, though, I have more confidence in this idea.

It's been tried various times since the 1960's. There are probably good reasons why interfaces don't use them.

In the 60's, you didn't have the resource headroom you do today. And you didn't have users who owned personal computers at all, much less those with thousands of music, video and text files all suffering under a 1-dimensional hierarchy plus the find command.

Perhaps most importantly, keep in mind that the vast majority of files on your system are not user files, they are bits and pieces of the operating system.

Speak for yourself! But I do entirely agree with you that the kernel is not the right place for a relational DB.

And for the files that actually are used by users (mail, PIM info, images, text, etc.), they usually already have special-purpose database interfaces available to them as part of the applications that users use to access them.

Ok, but there's a breakthrough waiting to happen here. Today, there's no way to automate interactions between these apps with their own db's. Microsoft's OLE, COM, DCOM, et al. have never quite gotten it right. Think of the power of plain text and unix tools. That gets shut out when you use a special purpose database. This DB filesystem opens that back up, letting me see relationships between my emails, documents, songs, etc.

--
Does it hurt to hear them lying? Was this the only world you had?

Re:Ahead of the game. by Trolling4Dollars · 2003-09-05 05:16 · Score: 1

With MS, you won't see anything until the whole product is done...

Funny that. I've been waiting for them to complete Outlook Express for quite some time now... ;P

--
Un-news

no SQL Server 2003 by sirshannon · 2003-09-05 05:28 · Score: 1

It will be SQL Server 2004 if named by year.

--
The truth doesn't care what I think.

im confused by bmajik · 2003-09-05 05:37 · Score: 1

everyone has read the reports that microsoft will be shipping a database based filesystem with extensive metadata tracking in a future operating system. this has been essentially public knowledge for > 1 year.

some open source project comes along saying they're going to make a database based filesystem with extensive meta data. slashdot bozos call it "innovative"

the same slashdot bozos that say microsoft has _never_ innovated and has only "stolen" ideas from other sources.

so which is it ? is making the filesystem an rdbms with pervasive metadata innovative, or just a stolen idea ?

--
My opinions are my own, and do not necessarily represent those of my employer.

Re:im confused by Anonymous Coward · 2003-09-05 10:52 · Score: 0

The fallacy of the single slashdot poster strikes again. Wake up and smell the coffee, bmajik: there is more than one person posting to slashdot, and different people have different opinions! HTH. HAND.

Re:GREAT! If it is done well... by TrekCycling · 2003-09-05 05:38 · Score: 1

You need to simplify your life. Then you won't need a database file system.

Re:or simply because it's a chicken and egg proble by cens0r · 2003-09-05 05:39 · Score: 1

On 95% of the computers out there, 95% of the files were probably not created by the user. And the ones that were could very easily have metadata. Imagine, if you will, ripping a CD to MP3. Almost any good software ripper can automatically identify the disc from CDDB and fill in the ID3 tags. The same thing will work for movies. Most other files come from somewhere other than the user. PDFs, documents, emails, games, and applications are all stuff they've either downloaded (any corporation is going to start including metadata) or bought (don't you think the CD is going to include the data?). The files most users create are documents and emails. Documents can easily have metadata forced into them: instead of providing a file name and path, you give a subject. Emails have all their metadata filled in automatically.

--
Jack Valenti and Orrin Hatch will be first up against the wall when the revolution comes.

Big deal by Anonymous Coward · 2003-09-05 05:44 · Score: 0

I've been using this concept in intranets I build for YEARS. It's called a SEARCH ENGINE. I find it stupid you people keep calling it a file or "storage" system. They're still FILES people... This is just another front end to GET to the files. GAWD.

Finally... by sahala · 2003-09-05 05:45 · Score: 1

...If this becomes prevalent among Linux distributions, it'll convince me to switch to Linux permanently.

Re:AWESOME! by Anonymous Coward · 2003-09-05 05:53 · Score: 0

Natural keyboard?

Only as good as your data by kstumpf · 2003-09-05 06:00 · Score: 4, Interesting

One thing I haven't seen mentioned yet is that a filesystem of this type is only useful if there is quality metadata accompanying every file you expect to find. Searching for "all jazz music" would return nothing unless the filesystem was told about each file that qualifies as "jazz music". What if I wanted to be more specific and say "jazz horn music"? Even more specific, "jazz trumpet solo"? The filesystem would have to know all of this data to be effective.

Where does this metadata come from? I assume I have to enter it myself. This means the more files I have, the more detailed and specific my data entry becomes. And that much more tedious.

Even worse is the uncertainty that would arise. Is my search for "horn solos" not returning results because there are no such files, or because the filesystem does not have meta data describing the files I want as such?

At this point, hierarchical organization once again becomes much more appealing.
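
The uncertainty described above (no such files versus no such metadata) can at least be surfaced by the index itself. A minimal sketch, assuming a toy in-memory tag index; `TagIndex` and its methods are made-up names for illustration and not part of any real filesystem:

```python
class TagIndex:
    """Toy metadata index: maps lower-cased tags to the set of files carrying them."""

    def __init__(self):
        self._files_by_tag = {}   # tag -> set of paths
        self._all_files = set()   # every file the index has been told about

    def add(self, path, tags=()):
        """Register a file, optionally with descriptive tags."""
        self._all_files.add(path)
        for tag in tags:
            self._files_by_tag.setdefault(tag.lower(), set()).add(path)

    def search(self, *tags):
        """Return (matches, unknown_tags).

        unknown_tags lists query tags that no file has ever been given, so an
        empty result caused by missing metadata is distinguishable from an
        empty result caused by tags that exist but never occur together.
        """
        wanted = [t.lower() for t in tags]
        if not wanted:
            return set(self._all_files), []
        unknown = [t for t in wanted if t not in self._files_by_tag]
        if unknown:
            return set(), unknown
        matches = set.intersection(*(self._files_by_tag[t] for t in wanted))
        return matches, []

    def untagged(self):
        """Files with no metadata at all: invisible to every tag search."""
        tagged = set().union(*self._files_by_tag.values())
        return self._all_files - tagged
```

This only tells you that nobody ever used the tag "horn"; it cannot tell you that one particular horn solo was saved without it, which is the harder half of the problem.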

Re:Only as good as your data by Anonymous Coward · 2003-09-05 18:28 · Score: 0

The database system could probably tell you that it has no metadata describing horn solos. What it cannot tell you is that the horn solo you vaguely recall saving does not have complete metadata because you were in too much of a hurry to think up all possible descriptive words for it. But it can show you all the 1279 classical pieces with horn solos that you bought five years ago. Good luck looking for the file you want, since it's not even included in the 1279 search results...

Oracle has been doing this for years - with APIs by Anonymous Coward · 2003-09-05 06:00 · Score: 0

Content Management SDK (new IFS) - with Java API. Supported FS: AFP, FTP, SMB, NFS, WebDAV, etc. Plus Agent Support and Server-side overrides - hence when content is uploaded actions can automagically take place.

Oracle Files - bundled with their new Collaboration Suite - built on the CM SDK technology

And XML DB - free with the 9.2 database, allows storage of any type of content - WebDAV access, XPath queries etc.

Human Hierarchy Cause or Effect? by notcreative · 2003-09-05 06:10 · Score: 1

Some posts seem to be defending the hierarchical filesystem because "humans think in hierarchy." This seems a little specious.

Humans also think in terms of relationships. Does this mean that relational filesystems are "natural?"

I wonder if the hierarchy of filesystems is cause or effect. Are we hardwired for hierarchy, so we build machines that emulate it? Or did our cultural background happen to lead us to think hierarchically? I'm not just waving my hands; I guess I'd like to see some evidence that humans are "hierarchical" to a greater degree than they are "relational" in a state of nature. Eh?

So in summary... by sahala · 2003-09-05 06:20 · Score: 1

DB Filesystem naysayers:
No way we are going to change how we do things. We've been thinking in terms of files and directories all our lives. We think in hierarchies -- it's intuitive! -- and everything we've learned and built revolves around them. Don't make me think...don't make me change!

DB Filesystem supporters :
We understand that it's a little complex to think of things in sets and relationships, but it really is better once you think in this way. Yes yes, it takes a little while to get used to, but with a little reading and practice it'll make your life easier.

Synchronization by Anonymous Coward · 2003-09-05 06:21 · Score: 0

You provide only an example of how metadata linked to files is helpful and possible even without database filesystems, but you fail to give any good reason not to use database filesystems instead. The problems of doing it your way are clear.

Logical Independence, Database vs. DBMS, etc. by MattRog · 2003-09-05 06:25 · Score: 1

Before starting we should define (courtesy of Dictionary.com, paraphrased) some terms since I've seen them misused all over the place on this topic.

Database: One or more large structured sets of data. A simple database might be a single file containing many records, each of which contains the same set of fields where each field is a certain fixed width.

Database Management System (DBMS): A program or set of programs that manage databases, which includes data integrity and security. Examples are Oracle, PostgreSQL, Sybase ASE, etc.

To those who are saying "It doesn't make sense to treat the filesystem as a DBMS!" I ask you: "What is a multi-user (Windows NT, Unix, etc.) operating system?" It's a set of programs (or one, if you only include the kernel) that manage your computer and truly a large part of the OS' job is to manage your data. It has facilities to edit data, maintain user permissions and concurrency, handle searches, etc. - in fact, it has many, if not all, of the aspects of a DBMS already! You might as well call an OS a Computer Management System.

However - it lacks one key feature of the Relational DBMS - namely physical data independence. Before I explain that, let me ask you a question:
What has the current hierarchical structure of file systems taught us?

Answer: It taught us that files and programs can only live in one directory and by extension of the directory idea (grouping like items) can only have one meaning. An MP3 described as /mp3/Adam_Kontras/Bread_in_the_Freezer.mp3 only tells me the name and the author's name but only if I know about the structure of the path. Can I take any MP3 and derive the author name by going up a directory? Not if I have /mp3/Adam_Kontras/4TVs/01-Intro.mp3!

Bread in the Freezer is a 'funny' song so if I wanted to I could put my funny authors in /mp3/humor/ (e.g. /mp3/humor/Adam_Kontras/...) but not all of Adam's songs are funny. So I could do /mp3/Adam_Kontras/humor/... but now I lose the ability to see the authors that have funny songs. I suppose I could create symlinks and sprinkle them around, but then I can easily lose track of where the original file is located.

So, not only can a particular file have only a single meaning, it is essentially an arbitrary one. I can't derive value from the path unless I have mapping schemes somewhere that tell me that MP3s are stored by /Author/Album/Track or /Author/Track or $OneTrueWay.

Most importantly: Why should I?

That's where the Relational Model comes into play. Why should I care where files are stored on my hard drive? I only care that the file exists. The relational model separates the logical from the physical and allows me to get to the heart of the issue: my data.

That's why the relational model makes perfect sense as a storage mechanism. I no longer need to know WHERE it lives. It could live in /usr/foo, or heck it could even live on a different server. Given a suitable front-end which can generate the queries (or views) I would never need to memorize a file path ever again - because it would be meaningless.
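
To make the /mp3/humor/ versus /mp3/Adam_Kontras/ dilemma concrete, here is a rough sketch using Python's built-in sqlite3 module. The two-table schema, the helper names, the storage locations and "Example Artist" are all assumptions made up for this illustration, not anything a real database filesystem is known to use:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE file (id INTEGER PRIMARY KEY, location TEXT NOT NULL);
CREATE TABLE attr (file_id INTEGER REFERENCES file(id), name TEXT, value TEXT);
""")

def add_file(location, **attrs):
    """Store a file's physical location once and describe it with name/value pairs."""
    cur = db.execute("INSERT INTO file (location) VALUES (?)", (location,))
    db.executemany("INSERT INTO attr VALUES (?, ?, ?)",
                   [(cur.lastrowid, name, value) for name, value in attrs.items()])
    return cur.lastrowid

def find(**criteria):
    """Return the locations of files matching every name=value pair given."""
    sql, params = "SELECT location FROM file", []
    for i, (name, value) in enumerate(criteria.items()):
        # One self-join per criterion; the alias is built from an integer only.
        sql += (f" JOIN attr a{i} ON a{i}.file_id = file.id"
                f" AND a{i}.name = ? AND a{i}.value = ?")
        params += [name, value]
    return [row[0] for row in db.execute(sql + " ORDER BY location", params)]

def authors_with_genre(genre):
    """Answer 'which authors have funny songs?' without any directory layout."""
    rows = db.execute("""
        SELECT DISTINCT a.value FROM attr a
        JOIN attr g ON g.file_id = a.file_id
        WHERE a.name = 'author' AND g.name = 'genre' AND g.value = ?
        ORDER BY a.value""", (genre,))
    return [row[0] for row in rows]

# The physical location is arbitrary; none of the queries depend on it.
add_file("/usr/foo/0001", author="Adam Kontras", title="Bread in the Freezer", genre="humor")
add_file("/mnt/other/0002", author="Adam Kontras", album="4TVs", title="Intro")
add_file("/usr/foo/0003", author="Example Artist", title="Example Song", genre="humor")
```

With this, `find(author="Adam Kontras")` and `authors_with_genre("humor")` both work off the same rows, so neither question forces a choice of directory layout.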

--

Thanks,
--
Matt

Relational table type solutions come with a price by samwhite_y · 2003-09-05 06:29 · Score: 2, Interesting

There are two types of ways that you usually access files.

1. I know precisely where one or more files are located and I want to see their contents as fast as possible. I want to move around 100s of such files easily from directory to directory at the potentially optimal speed of the disk subsystem. I might also want to do batch edits or renames of these files. Speed is the only truly important characteristic.

2. I vaguely have an idea of what type of file I am looking for. I want to find one or more files satisfying particular metadata (or full-text) criteria and manipulate one or more such files. I want all versions of my files to be maintained and there to be full auditing of interactions with these files.

Many times 1. and 2. are mutually incompatible. The typical way these days to address having both is to have an automatic spider that maintains an indexed view of the files and the file's metadata and hope that the spider is not too far behind the actual changes being made to the contents of the files or the locations of the files. If you want to have a transactionally guaranteed implementation of 2., then you have pretty much eliminated batch manipulation of files as a reasonably performing option. Database tracked file systems do not do well when you unzip a large collection of files and then start batch copying those files to different locations.

Now, I know almost nothing about the current implementation of the new "database" file system being discussed. But, I would very much want to allow a user to designate which directories or file types would be put into relational tables and which ones would not. I might also want to be able to choose whether the relational tables were interacted with using a transactional guarantee or whether a "spider" was used. If the end user had control over when the "heavier" management of the files occurred and how much of it should be applied, then it might have utility. Part of the user's file system would then be a document management library and other parts would be a normal file system. However, I would find creating a user interface that tried to make such a solution comprehensible to an end user somewhat of a nightmare.
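
For what the "spider" option amounts to, here is a minimal sketch, assuming the index is just a dict of path to (size, mtime) that is refreshed by a periodic crawl. The function names are hypothetical, and a real indexer would also extract metadata or full text from each file:

```python
import os

def snapshot(root):
    """Walk root and return {relative_path: (size, mtime_ns)} for every regular file."""
    seen = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            seen[os.path.relpath(full, root)] = (st.st_size, st.st_mtime_ns)
    return seen

def crawl(root, index):
    """One spider pass: bring `index` in line with the disk and report what changed.

    Between passes the index is simply stale; nothing ties file operations to
    index updates, which is what keeps batch copies and unzips fast.
    """
    current = snapshot(root)
    report = {
        "added": sorted(p for p in current if p not in index),
        "removed": sorted(p for p in index if p not in current),
        "changed": sorted(p for p in current if p in index and index[p] != current[p]),
    }
    index.clear()
    index.update(current)
    return report
```

The transactional alternative would update the index inside every create, rename and write, which is where the batch-copy cost mentioned above comes from.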

Mediocre implementation of a needed concept by Animats · 2003-09-05 06:33 · Score: 1

Something needs to be done in this area, but this isn't it.

First, the "storage" guy is into "natural language processing" as a front end for a search engine. That never works, although it's been tried many, many times. Try AskJeeves. AI is still far too dumb.

If you want to try out something new in the file system area, consider this:

Data comes in two basic forms - big data items which have meaning only as an entity (database people call these a BLOB), and as database entries.
Big data objects with unique contents (and thus a unique MD5 hash) are called "datums". A datum can never change; you can create new ones and release old ones, but you can't change a datum. Images, video, and software are datums. Datums are identified by their MD5 hash.
Little data objects are stored in structured forms. These can be relations (as in relational DBMSs) or trees ("object oriented databases"). Updates to these objects are handled by database-type atomic operations. Database transactions are atomic, commit and roll back cleanly, and are logged.
Indexing and control of datums is done via the database system. When no database item references a datum, the datum is released, using reference counts. In other words, the database is the directory system.

The key idea here is that "files" never get updated. Updating consists of either a well-defined database transaction or a total replacement. You never change a datum.
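
A rough in-memory sketch of the datum idea, assuming plain dicts stand in for the disk and for the database that acts as the directory. `DatumStore` and its method names are invented for illustration; MD5 appears only because the post names it, and a present-day design would likely pick a stronger hash:

```python
import hashlib

class DatumStore:
    """Immutable blobs keyed by MD5, released when no name references them."""

    def __init__(self):
        self._blobs = {}     # digest -> bytes (never modified once stored)
        self._refcount = {}  # digest -> number of names pointing at it
        self._names = {}     # the "directory": name -> digest

    def put(self, name, data):
        """Bind name to data. 'Updating' is always a total replacement."""
        digest = hashlib.md5(data).hexdigest()
        old = self._names.get(name)
        if old == digest:
            return digest  # identical content, nothing to do
        self._blobs.setdefault(digest, bytes(data))
        self._refcount[digest] = self._refcount.get(digest, 0) + 1
        self._names[name] = digest
        if old is not None:
            self._release(old)
        return digest

    def get(self, name):
        return self._blobs[self._names[name]]

    def delete(self, name):
        self._release(self._names.pop(name))

    def datum_count(self):
        return len(self._blobs)

    def _release(self, digest):
        """Drop one reference; free the datum when the count reaches zero."""
        self._refcount[digest] -= 1
        if self._refcount[digest] == 0:
            del self._refcount[digest]
            del self._blobs[digest]
```

Two names holding identical bytes share one datum, and rebinding a name never touches the bytes another name still sees.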

This is reasonable in the Unix/Linux world, which doesn't change files much. (Unix locking is so weak that shared access to files usually leads to trouble.) UCLA Locus (parts of which made it into AIX) had file semantics something like this - writes to a file allocated new pages, and when you closed the file, the new version "committed", and replaced the old one as an atomic operation.
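
The commit-on-close behaviour can be approximated on an ordinary filesystem with a temporary file plus a rename. This is only a sketch of the general pattern, not a claim about how Locus implemented it; `committed_write` is a made-up helper name:

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def committed_write(path):
    """Yield a binary file object; its contents replace `path` only on a clean exit."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so both are on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            yield tmp_file
            tmp_file.flush()
            os.fsync(tmp_file.fileno())
        # The "commit": os.replace swaps in the new version (atomic on POSIX).
        os.replace(tmp_path, path)
    except BaseException:
        # Abort: throw away the half-written version, the old file is untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Usage is `with committed_write("notes.txt") as f: f.write(b"...")`. If the body raises, readers keep seeing the previous version in full.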

Where stuff is changing in small increments, you have to have something like a database-like structure, and you may as well use a good, general-purpose one rather than some ad-hoc thing.

So that's another way to look at the problem. It's not that original; there are databases which offer such capabilities. But it's not usually thought of as a replacement for traditional file systems. Perhaps it should be.

Re:Microsoft Attempts for decade,GNOME Does in mon by gears5665 · 2003-09-05 06:50 · Score: 1

So, I think this GNOME thing will also sizzle out after a while.
Except that we're in a different time period now than we were then. Even 4 years ago, disk sizes were small and a filesystem was all that was needed.

Now, with 500GB drives and storage increasing at even more phenomenal rates, we are using it for more and more data, and that requires a different paradigm.

It all depends on the timing of the technology.

Utility by Brandybuck · 2003-09-05 06:51 · Score: 1

It's going to be part of the next generation of GNOME. But I hope to heck it won't be the standard of the next generation of GNOME.

Reading through these threads, it seems to me like there are two types of people. One type is hierarchically organized, the other completely disorganized. I know of no one who is organized relationally out in the non-computer world. Oh, I'm sure there are a few, but I've never met any.

So neither of the two types of people will get much utility out of this filesystem. The first type won't need it. And the second type won't organize their metadata to make it useful.

Thus, it shouldn't be the standard for the next generation of GNOME. Keep working on it. Make it available. But get it usable for the average bloke before you impose it on everyone.

--
Don't blame me, I didn't vote for either of them!

Single-level stores by alext · 2003-09-05 07:02 · Score: 1

Single-level stores are an important angle here, but they were around long before GEOS.

The original implementation, along with Virtual Memory itself, was in the Manchester Atlas (1962).

This led to implementations on the IBM 360/85 and then Multics. I used an implementation in Stratus VOS (Multics cousin) in '86.

What happens when... by Anonymous Coward · 2003-09-05 07:08 · Score: 0

...you realize that this file system didn't take off like you expected, isn't supported anymore, or doesn't fulfill your needs? Now you have tens of thousands of files that don't live in a hierarchy, or whatever form is popular then.

Also by TheLink · 2003-09-05 07:14 · Score: 1

It's knowing what question to ask to get a better question to ask, to get a better question to ask, to finally get the answer.

Even if a filesystem has the answer, what question do you use? In that respect most DBs aren't very much higher than a normal filesystem. You can't ask them to give you a better question. They're just indexed/structured/ordered to give you answers to a certain or wider range of questions more rapidly.

It's late so I'm probably not making as much sense as I should be, but I assume that people would be able to figure it out :).

--


Definition of meta-data by alext · 2003-09-05 07:31 · Score: 1

It's logically incorrect (and perhaps rather pretentious) to refer to categorization information in general as "meta-data".

Only information essential to the interpretation of an object, such as its Latin-1 text encoding or a .C vs. .java file extension, could conceivably be classed as meta-data in an IT system.

The author and title of a song are simply categorization information of the kind that librarians have dealt with for a long time, and should be accorded no more special treatment by the storage system than any other explicit or implicit characteristic.

Re:Definition of meta-data by overunderunderdone · 2003-09-05 08:32 · Score: 1

It's logically incorrect (and perhaps rather pretentious) to refer to categorization information in general as "meta-data".

Only information essential to the interpretation of an object, such as its Latin-1 text encoding or a .C vs. .java file extension, could conceivably be classed as meta-data in an IT system.

Your definition is incorrect (and perhaps rather pretentious?). Meta-data is data about data (thus the prefix "meta", whose relevant translation is "about"). There is no implication that the meta-data is essential or merely informative. So file type, which is essential, and modification date, which is merely useful, are both meta-data: they are not the data itself but data about the data, as are the author and title of a song. They are not the song, they are information about the song; they are the song's meta-data.
meta-data /me't*-day`t*/, or combinations of /may'-/ or (Commonwealth) /mee'-/; /-dah`t*/ (Or "meta data") Data about data. In data processing, meta-data is definitional data that provides information about or documentation of other data managed within an application or environment.

For example, meta-data would document data about data elements or attributes, (name, size, data type, etc) and data about records or data structures (length, fields, columns, etc) and data about data (where it is located, how it is associated, ownership, etc.). Meta-data may include descriptive information about the context, quality and condition, or characteristics of the data.
From Dictionary.com

Re:Definition of meta-data by alext · 2003-09-05 08:56 · Score: 1

A definition simply saying that meta-data is "data about data" would mean that any fact or statement qualifies as meta-data. "The sky is blue" is meta-data about skies and blueness at this level. Such a definition is meaningless.

The definition you offer is therefore partially "right" (i.e. useful) in that "data about data elements" could reasonably mean "describing how to interpret this object as a set of specific data elements", e.g. a database schema.

However, the definition is quite wrong (i.e. inconsistent, not useful) in describing any inferred or derived information as meta-data.

Categorization is not an activity unique to IT, and it is absurd to adopt and debase mathematical terms when established alternatives exist.

Re:Definition of meta-data by overunderunderdone · 2003-09-06 02:44 · Score: 1

"The sky is blue" is meta-data about skies and blueness at this level. Such a definition is meaningless.

The sky is not data, nor for that matter is "blueness" data (unless you believe that The Matrix was a documentary rather than a work of fiction). "The sky is blue" therefore is not meta-data about the sky but simply data about the sky. If we are storing bits of data about the sky (ie "it's blue", "it has clouds" etc.) we may also want to store data about the data about the sky. If we also store the citation "by alext" and the date "sept. 5, 2003" that is not data about the sky, it is data about the data - it's "meta-data" in our sky-facts database.

Obviously there are some people (or at least you) that don't like that broad definition of meta-data, that think it is confusing or unhelpful. I will concede that this broad definition *can* occasionally be problematic - for instance if our database is not about sky-facts but is a database of quotations then the citation "by alext" would probably be considered to change in character from "meta-data" to data itself.

Ultimately, despite that "problem", my usage is the common one, the one found in dictionaries, and it even accords with the Latin terms that make up the word. Perhaps at one time your narrower definition was the common and accepted usage (I frankly wasn't aware of it) but in a living language things change. Usage dictates the definition of a word; "awful" is no longer a synonym of "awesome" no matter how correct that would be. "Meta-data" is now commonly accepted as a broad term meaning "data about data" and is subdivided from there into "essential meta-data", "intrinsic meta-data", even "categorization meta-data" and so on. I would argue the shift (if there was one) was like almost all such shifts - because the new usage *IS* useful. It's useful to have a broad term for "data about data" that distinguishes the primary data we are concerned with in any given database from the auxiliary data about that data.

Can't we all just get along? by cfuent01 · 2003-09-05 07:37 · Score: 1

There seems to be a lack of middle ground opinions here. One side is for the benefits of a RDBMS based file system and the other is against it. Opinions are great, but it seems obvious to me that there are benefits to both and maybe the solution is to incorporate the best of both worlds into the next generation solution.

Working with large ERP software all day, I know the advantages of storing data in an RDBMS. Just think of the advantages of effective-dating files in the OS. You can install a patch and have it take effect on a certain date instead of this instant. Or have the ability to restore older versions of files without resorting to a backup tape.
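
Effective dating is easy to sketch with a table keyed on (name, effective_from). The class, table and column names below are made up for illustration, and ISO dates are stored as text so that string comparison matches date order:

```python
import sqlite3

class EffectiveDatedStore:
    """Keeps every version of a file together with the date it takes effect."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute("""
            CREATE TABLE file_version (
                name           TEXT NOT NULL,
                effective_from TEXT NOT NULL,  -- ISO date, e.g. 2003-10-01
                content        BLOB NOT NULL,
                PRIMARY KEY (name, effective_from)
            )""")

    def put(self, name, effective_from, content):
        """Add a version; it may be dated in the future (a patch not yet live)."""
        self._db.execute("INSERT INTO file_version VALUES (?, ?, ?)",
                         (name, effective_from, content))

    def read(self, name, as_of):
        """Return the version in effect on `as_of`, or None if none existed yet."""
        row = self._db.execute(
            "SELECT content FROM file_version"
            " WHERE name = ? AND effective_from <= ?"
            " ORDER BY effective_from DESC LIMIT 1",
            (name, as_of)).fetchone()
        return row[0] if row else None

    def history(self, name):
        """All effective dates recorded for a file, oldest first."""
        rows = self._db.execute(
            "SELECT effective_from FROM file_version"
            " WHERE name = ? ORDER BY effective_from", (name,))
        return [row[0] for row in rows]
```

Reading with an earlier `as_of` date is the "restore without a backup tape" case; reading with today's date ignores any patch dated later.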

Obviously, I also experience the pitfalls of RDBMS all the time as well. Could you imagine what an invalid query to the file system would do to your network or file server?

A new solution will come sooner or later. It's obvious that we are starting to see a growing trend of new ideas in how file systems should work. More and more people (a small population to be sure) need something better than a hierarchical file system. This means that change is coming. I leave it up to the rest of you to make it work. Just don't screw up my computer games.

Re:GREAT! If it is done well... by Anonymous Coward · 2003-09-05 07:51 · Score: 0

Perhaps some of us do not need one :-)

Disadvantages by Anonymous Coward · 2003-09-05 07:53 · Score: 0

Despite the fact that the job of an operating system is to juggle the computer's CPU, memory, disk drive and other resources, some object-oriented programmers believe that a non-object-oriented operating system should care about the structure and organization of his data.
Huh? Do you think the kernel has nothing better to do?

When you ask the Linux kernel for memory you don't pass it a class definition, you just tell it how much memory you want. SIZEOF(whatever). It's not the operating system's job to cater to OO pantywaists. Don't get me wrong, I am one of those pantywaists, but I don't expect the OS to either care about my particular predilection for objects or cater to it. How about this for a kernel call, TellMeTheNameOfAllTheObjectsIveCreated();
The OS was graceful enough to go find some contiguous storage, now shut the hell up and do some work for a change. Don't you have object libraries to do that for you? You want an SQL based filesystem to serialize and organize your objects? Where the hell are your foundation classes? Don't tell me they don't do that for you. Is object persistence not implemented correctly? It's not in a database? Just slice and dice some classes and it should be easy to do.

I also don't see any reason why other programs that don't particularly care about objects should have to eat a bunch of processing overhead in the filesystem when they don't even need those features.

I also think this sort of thing could get way out of hand. Much like the MS registry. If something goes wrong with the machine I can usually get in there with a text editor and fix things. With an object database for a file system and all sorts of strange and variable object structures, who knows if the damn thing is even readable. You would need an entire OO database front end just to fix it.

Unless you emulate a hierarchical unix file system on it and then the database features are just added extras. It would be better than having to use 'locate' and 'which' to scour the system.

Don't forget that the UNIX filesystem is also a holistic approach. Streams and other communications constructs are accessed through the same JBOB (just a bunch of bytes) model.

for whatever its worth.

Journalling? by mindstrm · 2003-09-05 08:09 · Score: 1

Journalling does not prevent corruption. Journalling prevents long filesystem checks on boot.

A journalled filesystem is no more or less prone to corruption than a non-journalled filesystem with an FSCK on boot.

Re:Journalling? by Anonymous Coward · 2003-09-05 10:33 · Score: 1, Informative

Journalling prevents corruption of the metadata if you only journal the metadata. If you also journal the data, then it prevents corruption of data also. On filesystems that allow data journalling, most people don't use it, complaining that it is slow.
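
A toy model of the journal idea, assuming dicts stand in for disk blocks. It is not how any real journalling filesystem lays out its log, and `JournaledStore` is a hypothetical name. It only shows the all-or-nothing replay that keeps whatever is journalled consistent after a crash:

```python
class JournaledStore:
    """Writes are first recorded in a journal and only later applied to 'disk'."""

    def __init__(self):
        self.blocks = {}   # the "disk": block id -> value
        self.journal = []  # pending transactions, oldest first

    def begin(self, writes):
        """Record the intended writes in the journal; the disk is not touched yet."""
        txn = {"writes": dict(writes), "committed": False}
        self.journal.append(txn)
        return txn

    def commit(self, txn):
        """Mark the transaction complete; from now on it must survive a crash."""
        txn["committed"] = True

    def recover(self):
        """Run after a crash (or as a checkpoint).

        Committed transactions are replayed in order; uncommitted ones are
        dropped whole, so the disk never holds half of an update. Anything
        that was never journalled gets no such protection.
        """
        for txn in self.journal:
            if txn["committed"]:
                self.blocks.update(txn["writes"])
        self.journal.clear()
```

If only metadata blocks go through `begin`/`commit`, file contents written around the journal can still come out torn, which is the distinction drawn above between metadata and data journalling.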

Re:Microsoft Attempts for decade,GNOME Does in mon by fault0 · 2003-09-05 08:22 · Score: 1

> more and more data and that requires a different paradigm.

I don't think a user has many more files than they did 1 or 3 years ago. Instead, file sizes are growing larger. I still remember the days when I used to download "various media files" encoded with QuickTime or RealPlayer that were terribly low quality but very small (20-40 MB).

Nowadays, 1.5 GB SVCDs are becoming the norm, and people often post 8 GB non-encoded DVD rips over things like BitTorrent. People are starting to use higher bitrates for things like MP3s.

As for things like documents, I would guess that the average user keeps around the same number of documents on their hard drive as they have for the last 3 years (pretty much since the downfall of the floppy and zip drives...)

Basically, putting all of your shit on a hard drive is not any more common than it was 3 years ago, but certainly is more common than 6 years ago. The problem is that 3 years ago, things like storage would not have taken off, and since the situation has remained unchanged since then, they still won't =D

And IZE already did this eons ago... by kupci · 2003-09-05 08:22 · Score: 2, Interesting

IZE was written by Persoft Corp., 465 Science Drive, Madison, Wisconsin, 53711

This was one of those revolutionary products that never really took off for some reason. It was a text-based word processor, but this was pre-Windows (perhaps that was the problem).

From what I remember, you could create a document and then save it with keywords. It was really aimed at writers and was a great way to organize a bunch of documents, create outlines, etc. Sort of like having an electronic file card system. Very very cool - the Windows Explorer is almost primitive by comparison. It could easily have been extended to support any type of file.

It's a great idea; nice to see GNOME pick it up.

Re:And IZE already did this eons ago... by Anonymous Coward · 2003-09-05 09:44 · Score: 0

Yeah, there were a number of interesting data organization apps for DOS that were swept under the rug when Windows came out.

Part of the problem was that DOS was single tasking and had a very poor level of integration between apps -- what good is a catalog system if one has to use its word processor and no other?

The other big reason is that virtually none of these apps were network/multi-user aware, and died out after "personal computing" moved into a more corporate IT mode.

One 80s product that dealt with both issues was Lotus Notes -- you have a database-based "store" with fast indexing and searching and it was sorta GUI-based. Again, however, it didn't (doesn't) integrate well with anything except itself and Lotus SmartSuite, which makes it very difficult to get metadata in or out of the system.

When you look at what could be done with products like Notes 15 years ago, versus the very lame level of organization technology built into most desktops today, it's very sad. However, rather than declaring the idea a failure, the right way forward is to integrate it into the system at a much lower level.

nice idea, but... by the_greywolf · 2003-09-05 08:41 · Score: 1

i was briefly on an Operating System project a while back (LinksOS) before the dev team split, and one of the ideas for the system was a natural-language file query system. very similar to storage, in fact. i wouldn't be surprised if one of the guys on the original Links team has something to do with this.

but from the beginning, i thought it was an unnecessary thing. the average user is more accustomed to clicking his way to a program than typing "run myprogram" on a box. likewise, they're more accustomed to finding files exclusively by name - "find files and folders" named "my file".

i still think it's unnecessary. it seems like too much bloat for the operating system for a feature that will become rather painful for some to use. as others have noted, the metadata needs to be maintained properly, and a database system might not be fast enough for the likes of some people.

IMO, the best option for something like this is incorporating a natural query language to a filesystem module that operates a filesystem not unlike the Be File System.

the thing i like about it is the fact that each file has an unlimited number of arbitrary attributes that are indexed in a "hidden" index directory, which can be queried with a boolean syntax. all of that was in the filesystem driver, and it still is one of the most trim and efficient filesystems to date. its only real overhead is the hard drive space used by the indexes.
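
a rough sketch of the attributes-plus-index idea, assuming an in-memory index and python set operators in place of a real query syntax. this is not the BeFS implementation or its query language, and `AttrIndex` is a made-up name:

```python
class AttrIndex:
    """Arbitrary per-file attributes, indexed so equality lookups need no scan."""

    def __init__(self):
        self._attrs = {}  # path -> {attribute name: value}
        self._index = {}  # (attribute name, value) -> set of paths

    def set_attr(self, path, name, value):
        """Attach or overwrite one attribute and keep the index in step."""
        attrs = self._attrs.setdefault(path, {})
        if name in attrs:
            self._index[(name, attrs[name])].discard(path)
        attrs[name] = value
        self._index.setdefault((name, value), set()).add(path)

    def eq(self, name, value):
        """Set of files whose attribute `name` equals `value`."""
        return set(self._index.get((name, value), ()))

    def everything(self):
        """Set of all files that carry at least one attribute."""
        return set(self._attrs)
```

boolean queries then fall out of the set operators: `idx.eq("type", "mp3") & idx.eq("genre", "jazz")` is AND, `|` is OR, and `idx.everything() - idx.eq("genre", "jazz")` is NOT. the only extra cost is the space the index takes, which matches the overhead described above.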

now to be honest, i do like the natural language query thing. but FWIW, keeping unix-like heierarchies (sp? :P) is MUCH better than trying to maintain a database.

i mean, Apple tried that with HFS, and look what happened! it was so limited, crippled, and slow that even HFS+ couldn't make up for it. it just couldn't SCALE.

--
grey wolf
LET FORTRAN DIE!

Re:nice idea, but... by karlbowden · 2003-09-05 11:21 · Score: 1

I loved the BeFS too. The arbitrary attributes made it easy to search through any type of file, just like we search through mp3s based on id3 tags.

There are kernel patches available to add this support to ext2, ext3, and reiser iirc. Adding support for this into nautilus would go a lot further, in less time imo. It would, though, take a long time to adapt programs to support the extra information. Eg, gthumb support for comments on images etc..

I also liked the download manager in BeFS; it too stored extra attributes about the file being downloaded, and displayed 'status icons' (directly in the file browsing window) that had a little graph of how much had been downloaded.

Anyway, just my ramblings.

Re:GREAT! If it is done well... by the_greywolf · 2003-09-05 08:51 · Score: 1

actually, take a look at the BeFS in the BeOS. it does pretty much everything you're talking about to at least some extent.

it would be nice to take the ideas that BeFS ran with and expand ReiserFS a few generations ahead that way.

--
grey wolf
LET FORTRAN DIE!

What about a virus that does... by Anonymous Coward · 2003-09-05 09:01 · Score: 0

..... Drop Table "/"

The meme that got away by podperson · 2003-09-05 09:24 · Score: 1

The idea was probably stolen from Xerox PARC in the first place, of course.

The mouse and many fundamental GUI concepts were invented at the Stanford Research Institute (SRI) by Douglas Engelbart. Xerox added non-overlapping windows, buttons, icons. Apple added dragging, overlapping windows, and many other concepts we take for granted now while improving on ideas they stole. It seems reasonable to accord Apple the credit for inventing the modern GUI along with Xerox PARC and Engelbart. But the big aha goes to Engelbart, not Xerox or Apple.

"Anyone can see a wheel and come up with the car. It takes a good science fiction writer to see a wheel and come up with traffic jams and parking tickets." (Source forgotten.)

A much simpler way is to by Anonymous Coward · 2003-09-05 09:44 · Score: 0

allow your entire hard drive ( / on down) to be accessed through http and let google index it for you

You said a bad word... by Styx · 2003-09-05 09:56 · Score: 1

Aaaaaaaargh! How can you mention netinfo? IMHO the most drain-bramaged thing to come out of NeXT. Even Apple has seen the light and is moving to LDAP (LDAP has its weak points too, but not as bad as netinfo).

--
/Styx

Sounds like a text adventure by Anonymous Coward · 2003-09-05 10:10 · Score: 0

Now that would be a cool filesystem.

> Show me some porn

You don't see any porn here

> Open the secret folder my girlfriend can't see

Opening the folder reveals some porn

> Open the porn

I'm not sure what you want to do with the porn

> Display the porn

You can't do that with the porn

> Play the porn

Are you saying you want to play with the porn?

> Read the porn

There's not much to read. It's mostly pictures of naked women. Boy do they look hot!

> Show the pictures

I don't see any pictures here

----------------
Hmmmm.... maybe not so great an idea after all.

powertoys by horace · 2003-09-05 10:21 · Score: 1

Copy that!

O....K..... by Anonymous Coward · 2003-09-05 10:22 · Score: 0

What's next, the GNOME kernel?

They've already said it, but I'll say it again.. by NekoXP · 2003-09-05 10:32 · Score: 1

It's exactly the same featureset as WinFS (as in Longhorn)

Now, whenever anyone says "I hate MS, they have that fucking shitty WinFS!!!!", we can point them at the GNOME project which is doing precisely the same thing, and they can eat that humble pie right up with their hat.

Why so cynical? by Spy+Hunter · 2003-09-05 10:38 · Score: 3, Interesting

Storage is more advanced, at least in concept, than all of these other options. That is why it is more interesting. The first interesting thing about Storage is that it uses natural language parsing instead of a predefined query language. This is essential to wide acceptance. The second interesting thing about Storage is that it goes out of its way to find and catalog useful metadata, whereas in most other systems you must input the metadata yourself (a tedious task that no one likes). An example given is using the name and length of a Divx file to search IMDB to get all the relevant information about a film. In this way, Storage solves what I see as the two main problems with database filesystems (it remains to be seen how well it works in practice, however). A third interesting thing about Storage is that it is backwards compatible with all GNOME applications through the VFS layer. A kioslave could allow it to work with KDE too.

--
main(c,r){for(r=32;r;) printf(++c>31?c=!r--,"\n":c<r?" ":~c&r?" `":" #");}

Reiser4 by SanityInAnarchy · 2003-09-05 10:47 · Score: 1

I still like the idea of Reiser4 better. Mostly for the plugins, but also because I've had good experiences with Reiser3.

--
Don't thank God, thank a doctor!

Googling by safiire · 2003-09-05 10:50 · Score: 0

Yes, next let's give up on URLs entirely too. When you want your friend to see a page you just found, why bother telling him the URL when you can just tell him to google for the 'really cool mostly purple page about penguins'?

A hybrid model would probably be best by shadow_slicer · 2003-09-05 11:24 · Score: 1

A relational system doesn't need to be like that.

What if it were more closely integrated into the filesystem, so that the different metadata categories could be navigated as if they were directories, with files listed in multiple places in the tree (wherever they fit in)? Instead of doing a clumsy select * from volume, you could just browse the categories and subcategories (which could be dynamically generated or specified, or some combination of the two).

I already use something like this for my anime (I have a bash script that indexes everything and uses symlinks to list all my files). I have them in categories by series name, fansubbing group, and whether I have burned them off onto cd yet.
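A rough sketch of that kind of symlink index, written in Python rather than bash. This is not the script mentioned above; the function name `build_symlink_views`, the category names, and the file names are invented for illustration. Each metadata category becomes a directory, each value a subdirectory, and each file shows up as a symlink wherever it fits.

```python
import os
import tempfile


def build_symlink_views(files, view_root):
    """Create view_root/<category>/<value>/<filename> symlinks for each file.

    `files` maps an absolute file path to a dict of category -> value.
    """
    for path, metadata in files.items():
        for category, value in metadata.items():
            directory = os.path.join(view_root, category, str(value))
            os.makedirs(directory, exist_ok=True)
            link = os.path.join(directory, os.path.basename(path))
            # lexists() is also true for dangling links, so reruns are safe.
            if not os.path.lexists(link):
                os.symlink(path, link)


# Demo in a throwaway directory; all names below are made up.
base = tempfile.mkdtemp()
episode = os.path.join(base, "episode01.avi")
open(episode, "w").close()

build_symlink_views(
    {episode: {"series": "ExampleSeries", "group": "ExampleSubs", "burned": "no"}},
    os.path.join(base, "views"),
)
print(sorted(os.listdir(os.path.join(base, "views"))))
```

Because the views are only symlinks, the real files stay in one place and the whole tree can be deleted and rebuilt whenever the metadata changes.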

I don't see anything wrong with replacing the traditional filesystem with something more relational as long as it's done intelligently.

Other posts I read claimed that the hierarchical filesystem model will last forever because people think hierarchically. They gave an example of one person telling another where to find something. They claimed that the person described the location hierarchically: at Stanford, in dorm room ##, on the desk, next to the computer.

Really, that description isn't that hierarchical. It's more of a meld of hierarchical and relational. First of all, people may describe that position in different ways: instead of beside the computer it may be "next to my stack of warez cds" or 5m north-west of the room entrance. In addition, the description is in relation to other objects: "next to my computer", "on top of my desk".

Really this is just the intersection of that metadata.

What is really needed for relational systems of data storage to take off is a better way to view the data (I like my filesystem idea personally). Until then there's no way this will ever be widely used. But when it does it will be one of those "how did I ever do anything without this" things..

Apply commands to a selection! by Clod9 · 2003-09-05 12:18 · Score: 1

Whatever else they do, I hope they make a way for me to select a group of icons visually (i.e. with the mouse) and then operate on them with a series of shell commands, and use globbing in GUI tools (like selecting multiple files in a file-selection dialog).

Re:GREAT! If it is done well... by evilviper · 2003-09-05 12:18 · Score: 1

I'm thinking, after I make my first million, I'll probably just retire to some private, tropical island, and leave everything electrically and chemically powered, behind.

Until that day comes, anything that makes things easier and quicker is most welcome.

--
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant

Comments from Seth (aka Storage's designer) by nullity · 2003-09-05 12:25 · Score: 5, Informative

I suppose it is probably too late to inject comments and have them moderated to the point of visibility as the madness has largely subsided... but here's to futile acts ;-) I was not really intending Storage to make a big splash right now, I wanted to keep it low-key, but I guess the damage is done so I might as well comment. I'm sorry that I didn't have time to put up a more technically-oriented exposition of Storage. *shrug*

Slashdot has focused almost exclusively on the "database backing". Guys, this is an implementation detail. It's an important one, but I didn't start off this design thinking "let's write a database backed filesystem store". A set of design goals was established (largely mirrored in the features page). Storage is a lot more than just a database backed XML store. Please read the features page. The "searchable" stuff is nice, but equally important is providing persistent objects, uniform access (the same URI for a local storage node works globally assuming your computer has a publicly accessible IP address), an improved model for revision and "saving", the ability to localize filesystem resources, and due to a standard object format greater transparency of filesystem resources to the OS which will be useful in weakening the barrier between "apps" and "desktop" found in PCs (and not so much in, say, cell phones and PDAs). This is also a key piece in an overall design of the desktop's interaction structure which I haven't had time to write up for the web.
I'm not trying to make any claims to being the first or being highly innovative, but I am happy to make claims about improving the user experience. That said, contrary to what people are saying, to my knowledge other than the superficial layer of database backing, Storage's features do not have a "one to one" correspondence with any existing system, BFS and the only vaguely specified Windows Future Filesystem included. Most importantly these components do not seem to be a part of the same overall interaction design model that Storage is intended to support. Storage is just a stepping stone, albeit a pretty disruptive one.
I've been quiet about this project, even inside GNOME. Storage as written today was primarily written by a team of Stanford students as their CS senior project. I've since been working with a few good GNOME developers including the person working on Medusa (Curtis) and the Epiphany maintainer (Marco). They were independently developing a metadata system for GNOME, which it looks like we may implement on top of Storage as a first major test of its capabilities. But nothing is certain right now. The short story is that although Storage is being developed by GNOME developers and I serve as usability project lead, it's not an official GNOME module at this point. GNOME developers would need to corporately buy into both the Storage vision and the overall desktop design. This may never happen, and if it does, it's going to be very slow in coming.

Some technical notes... that site is sparse on technical information so I'll fill in some for the curious.

The data store is backed by Postgresql. Postgresql rocks, though some of the features like instant notification of object changes and live queries do not fit well with existing SQL. We have ways to do all of this using Postgresql extensions, but sometimes it's a little tricky and/or hackish.
A lot of the proposed interface will rise and fall based on the quality of the NL processing. Storage is currently using some pretty cutting edge linguistics theories and tools... notably working within the basic LinGO framework. This includes using theories/systems like HPSG (Head-driven Phrase Structure Grammar), MRS (minimal recursion semantics), and being able to use a set of existing wide-coverage grammars such as the ERG (English Resource Gramm

Re:Comments from Seth (aka Storage's designer) by smallpaul · 2003-09-05 13:59 · Score: 1

I am very enthusiastic about the fact that you are combining the global and local addressing schemes. I find the impedance mismatch between remote, URL-addressable objects and local, path-addressable objects to be quite unfortunate. Of course you can use file:// URLs but then they are system specific. A group of files connected using file:// URLs cannot even be reliably shared on a network share because the mount points might be different!
Re:Comments from Seth (aka Storage's designer) by mattr · 2003-09-06 01:18 · Score: 1

This is fabulous work! As it happens I've been looking for good NLP tools for a while now and wading through the concepts in computational linguistics.. but you see while interested in it I am not a computational linguist, having only absorbed enough to be able to read most of Ulrich Callmeier's thesis which I managed to find though I probably don't get the jokes..
It strikes me that your implementation of these systems in the real world is of immense help to people who would like to mess around with these linguistic tools but don't have a degree in the field. It would be very interesting if you would consider separating your distillation of language processing algorithms and code into a module which could be used by other applications. Presumably (though it is a trivial system in comparison) you are already considering this vis a vis the RDF store you mentioned too?
The intersection, I think, came to mind at a meeting I went to last night for blog-related developers in Tokyo. A developer next to me asked which morphological analyzer I used (chasen and kakasi), which is an important algorithm for a Japanese search engine, but it blew my mind that it was on *his* mind! Anyway, recently I have been doing more research in this area to determine what is possible for an upcoming project. I sense the timing is absolutely right for this stuff.
One question I have, will you attempt to automatically do document clustering and try to determine the names of new document categories based on the text?
Well, it seems that with your mind, it would be a quantum leap for linux if you provided access to these tweaked and digested algorithms. (Which can also be seen as carrying on in the tradition of Ulrich's integration work, with a dose of open source and real world issues.) And it would be nice to see a section on your site about the tools you use. So I guess this is asking about whether your computational linguistics work will remain buried in your system or be provided as a general-purpose parsing/analysis service. Oh, I also just added a whole set of links to my NLP bookmark list after reading your posts. Thank you and good luck!
Re:Comments from Seth (aka Storage's designer) by Anonymous Coward · 2003-09-08 07:54 · Score: 0

This thing is trying to fix the lack of a file manager in GNOME (GNOME does not have a file manager: it has a stupid file browser that does some file managing, but this is not its main intent).

Instead of letting users manage their files, this "cool" thing will stupidomatically manage the files for the stupid users. I'm a stupid for still using and developing GNOME. This sucks.

GNOME developers prefer to create new problems instead of fixing current ones. If GNOME had a simple file manager, Storage would be worthless.

But we must accept one more stupid feature of GNOME. After all, this guy is the "god of usability". Shit.

Re:or simply because it's a chicken and egg proble by Daniel+Phillips · 2003-09-05 13:21 · Score: 1

a DB-backed filesystem is a genius idea until someone asks: where should the database write its data files? ah, yes, the filesystem!

While that is true of most simple database systems, big iron DBs like Oracle typically work directly with raw volumes.

--
Have you got your LWN subscription yet?

Re:BeFS - further BeFS reading by The_Blind_Priest · 2003-09-05 15:26 · Score: 2, Informative

If anyone would like a nice read on filesystem implementation and/or Dominic's approach to the fs redesign, check out:

Practical File System Design with the Be File System
by Dominic Giampaolo, Be Inc.

1. Nice overview of various filesystems in use.
2. Quick and to the point.
3. Enough detail to go about rolling one up yourself.
4. Being written by Dominic it provides nice BeFS insights.

Comments from Seth-Semantic Web by Anonymous Coward · 2003-09-05 15:27 · Score: 0

"We are discussing making storage an RDF store instead of what is now vaguely an XML store (with concessions to handling binary data). If you are experienced with RDF and related standards, we would be interested in your help."

How come you're not using XML Topic Maps?

A filesystem *IS* a database! by Tracy+Reed · 2003-09-05 18:10 · Score: 2, Insightful

Not really a comment on "storage" but just a comment on something that has constantly bugged me when someone says "let's put it in a database!"

A filesystem is a special case of a database. So it is perfectly acceptable to store your data in a filesystem. Some people seem to think everything has to be put into a relational database or that it is somehow cool to do so. I have seen people store loads of graphics as BLOBs in databases. Someone once suggested storing a ton of MP3s in a database. Most recently someone said (and this isn't the first time) that we should store all of the emails in a database. It's just another unnecessary layer of complication, especially when you are going to be referencing the email/graphic/mp3 by name all the time anyway (fs's like reiserfs index on name so it's blazingly fast) and not by a bunch of other pieces of meta-data. And if you are going to need to do lookups by various bits of meta-data, then store the meta-data in a db and also store a record pointing to the actual file on disk. I have done that lots of times and it works great.
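A minimal sketch of that last arrangement, using Python's standard `sqlite3` module. The table name, column names, and file paths are assumptions made up for the example; the point is only that the database holds metadata plus a path, while the bytes stay in ordinary files.

```python
import sqlite3

# Metadata lives in the database; the files themselves stay on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tracks (path TEXT PRIMARY KEY, artist TEXT, title TEXT)"
)
conn.execute("CREATE INDEX idx_tracks_artist ON tracks (artist)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?)",
    [
        ("/srv/mp3/0001.mp3", "Pink Floyd", "Time"),
        ("/srv/mp3/0002.mp3", "Pink Floyd", "Money"),
        ("/srv/mp3/0003.mp3", "Miles Davis", "So What"),
    ],
)


def paths_by_artist(connection, artist):
    """Look up on-disk paths by a metadata field instead of by filename."""
    rows = connection.execute(
        "SELECT path FROM tracks WHERE artist = ? ORDER BY path", (artist,)
    )
    return [row[0] for row in rows]


print(paths_by_artist(conn, "Pink Floyd"))
```

Opening a result is then a plain filesystem operation on the returned path, so lookups by name never touch the database at all.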

Neither trees nor relational is the best, perhaps by Tablizer · 2003-09-05 18:28 · Score: 1

The relational zealots are quick to point out that a relational system can model any sort of data. Indeed, it can do this. This does not, however, mean that it's always good at doing this. Sometimes it's the right tool for the job, and sometimes it's not......You can't navigate a relational system, which will prove to be the downfall of any all-relational....

I am not sure what your definition of "navigate" is. However, as someone who can be called a "relational zealot" (see my handle?), I actually agree that relational may not be the best paradigm for this job. However, neither are trees, by any stretch.

There are two separate issues here. One is "trainability", and the other is potential power, once trained. I agree that trees are probably the easiest to learn, but also the least powerful. It is just like designing user interfaces: do you focus on newbies or power users? Power users do most of the work, so we can't just ignore them.

Anyhow, back to the ideal paradigm for this job. I have been kicking around something that is a cross between relational and object databases. Yes, this is coming from an object basher. However, we will toss inheritance out the door. Basically each file would be like a record in a relational table that has UNLIMITED potential fields (columns, but I will call them "attributes".) Attributes could be dynamically added. There is no "central schema", and this is where violation of true relational comes in. However, one could treat the table just like a relational table in queries.

For faster searches, perhaps one could designate "static" columns, which each record (file) automatically has. These could be indexed in a regular RDBMS way. Outside of this, operations on dynamic columns (attributes) would be slower than traditional RDBMS columns. One would have to static-ify often-used columns to speed things up.

Actually, I think the dynamic column idea is closer to set theory than relational.
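Here is one way the static/dynamic split could look, as a small Python sketch. The `FileTable` class and its methods are hypothetical names invented for this example, not an existing system. Static columns get an index and are answered by lookup; dynamic attributes can be added to any record at any time, with no central schema, and are answered by a full scan.

```python
from collections import defaultdict


class FileTable:
    """One record per file: static columns are indexed, dynamic ones are not."""

    def __init__(self, static_columns):
        self.static_columns = set(static_columns)
        self.records = {}  # file id -> dict of attribute -> value
        self._indexes = {c: defaultdict(set) for c in self.static_columns}

    def add(self, file_id, **attributes):
        """Store a new record; any attribute names are allowed (no schema)."""
        self.records[file_id] = dict(attributes)
        for column in self.static_columns:
            self._indexes[column][attributes.get(column)].add(file_id)

    def select(self, attribute, value):
        """Return ids of files whose `attribute` equals `value`."""
        if attribute in self.static_columns:
            # Fast path: index lookup on a static column.
            return set(self._indexes[attribute].get(value, ()))
        # Slow path: scan every record for a dynamic attribute.
        return {
            fid
            for fid, rec in self.records.items()
            if attribute in rec and rec[attribute] == value
        }


table = FileTable(static_columns=["name", "owner"])
table.add(1, name="report.txt", owner="alice", project="apollo")
table.add(2, name="photo.jpg", owner="bob", camera="ExampleCam")
table.add(3, name="notes.txt", owner="alice")

print(table.select("owner", "alice"))     # indexed static column
print(table.select("project", "apollo"))  # scanned dynamic attribute
```

"Static-ifying" an often-used attribute would amount to building an index for it after the fact, which this sketch leaves out.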

--
Table-ized A.I.

P2P storage? by HanzoSan · 2003-09-05 18:58 · Score: 1

It says it uses centralized servers, but if we had a way to make it P2P, or perhaps a P2P plugin, this could allow many many more features, you could have a database which learns about files and automatically corrects bad filenames etc.

"An example given is using the name and length of a Divx file to search IMDB to get all the relevant information about a film."

Why have a centralized internet movie database when you can have a decentralized P2P database? I mean everyone will essentially have a database filesystem right? It will be easy to use right? Why not make it P2P and let it self heal?

--
If you use Linux, please help development of Autopac

Re:GREAT! If it is done well... by Anonymous Coward · 2003-09-05 19:37 · Score: 0

With this, you could do more than just, say, "tar -xjf 'newest version of something'", you could just query for the newest version of something, have the db filesystem determine it's a tar after it sees that metadata and then take appropriate action by looking up what the helper app should be.
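As a sketch of that flow in Python: pick the newest record matching a name, then use its type metadata to choose a helper command. The record fields, the type labels, and the `HELPERS` table are all assumptions for illustration; nothing here reflects how Storage actually does it, and the command is only built, not executed.

```python
# Hypothetical metadata records, as a database filesystem might return them.
RECORDS = [
    {"name": "something", "version": (1, 2, 0), "type": "tar+bzip2",
     "path": "/dl/something-1.2.0.tar.bz2"},
    {"name": "something", "version": (1, 10, 0), "type": "tar+bzip2",
     "path": "/dl/something-1.10.0.tar.bz2"},
    {"name": "other", "version": (3, 0, 0), "type": "zip",
     "path": "/dl/other-3.0.0.zip"},
]

# Type label -> helper command prefix (labels are made up for this example).
HELPERS = {
    "tar+bzip2": ["tar", "-xjf"],
    "zip": ["unzip"],
}


def newest(records, name):
    """Return the record with the highest version tuple, or None."""
    candidates = [r for r in records if r["name"] == name]
    return max(candidates, key=lambda r: r["version"]) if candidates else None


def helper_command(record):
    """Build (but do not run) the command line for a record's file type."""
    helper = HELPERS.get(record["type"])
    if helper is None:
        raise LookupError("no helper registered for type %r" % record["type"])
    return helper + [record["path"]]


print(helper_command(newest(RECORDS, "something")))
```

Comparing versions as tuples matters here: as plain strings, "1.2.0" would sort after "1.10.0".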

Nifty.

Then again, I can imagine this opening up all kinds of cans of worms. Need I say Outlook?

The Trials and Tribulations of HanzoSan by Anonymous Coward · 2003-09-06 01:32 · Score: 0

Dana Edwards was feeling a little disheartened. It had been nearly a week since he'd contacted Peacecorp and applied for a tour of duty in the Congo. He had hoped all week that his weight problem, chronic acne and asthma would not discount him from the program. Dana had been in some financial strife for a couple years now, with those tuition fees from Massachusets Bay Community College piling up. This was particularly stressful for him because, despite having taught himself to read and posessing an impressive intellect, he could not find a decent slack-off job with internet connection that would support his slashdot posting habit. Dana belched while he tapped his cordless phone and stuffed his hand into a bag of Cheetos. Dana, a Jack of All Trades had also been unsuccessful for several years in his attempts to get a night DJ position at a local AM radio station within walking distance of his mother's house. This distressed him, because being a DJ would be such a natural part-time job for him, being a skilled musician on the side. Alas, he waited still and finished the last fluid ounce of his Mountain Dew.

Peacecorp was going to change that. Where his business sense would have failed him in the Merchant Marines and his poor physical condition were not up to snuff for the military, he felt Peacecorp would welcome him with open arms and take his student loan burden off his hands.

"Education equals genius. Genius is good for society. I'll show them, I'm going to buck the status quo. I'm going to make a difference, I'll show them what a poor kid from the ghetto is capable of." Dana thought to himself.

Dana had not shaven for five days, but his greasy facial hair never became very thick, even after weeks of neglect. It grew in a thin, spotty Fu Manchu pattern. Best described, his whiskers resembled soot smeared on his greasy jowls. He scratched at his armpit and pulled the tightening fabric of his pajama pants out of his groin and sighed with relief.

"Aaaah."

Dana was glad that the weekend had finally come around. His Computer Repair Fundamentals and Sociology classes were starting to really dig in. He blamed the teacher for sucking, and was utterly convinced that his superior intellect would reward him with first in his graduating class of 40. He was certain that the same outcome would happen if he got into MIT, but that would never happen. The rich bastards would never give him a fair chance on a level playing field. The MIT bastards hate nerds, just like everybody else. That was alright though, Dana already knew he was superior to most of them anyway. Their facilities were only useful to the superficial.

Dana loosened up a bit by putting some music on the 'juke. He got a free MP3 jukebox from his mother and slapped an "RIAA SUCKS" bumper sticker on the side of it. Dana was vehemently opposed to the ownership and licensing of intellectual property, especially music. Dana downloaded all his favourite Pink Floyd tracks off the internet and onto the jukebox, and this brought a small amount of joy to his empty life.

"Damn the man!" he exclaimed, raising a fist as his gut flopped out of his oil-stained ThinkGeek t-shirt.

Ice T and Fred Durst alone had practically paved the way to justified downloads of all music ever created and served up on KaZaa. And so, Dana sat in in front of his monitor listening to The Wall, waiting for a reply from Peacecorp.

His mother slipped into his room briefly to set down a bologna and cheese sandwich in front of him while he fired up a beta version of Transgaming on his Pentium 166 with MMX.

"Mom, why don't you hate the RIAA?"

She shrugged, rolled her eyes and closed the door to his room on the way out.

"She forgot to cut off the crusts." Dana held back the tears and ate the sandwich anyway.

[montemplar] wuzzup hanz0?

A privmsg came up on his IRC client. Dana had adopted the "handle" HanzoSan after his Japanese

599 comments