MySQL FS
xcyber writes "Developers, database admins and users: MySQL is developing a MySQL filesystem for Linux, to mount a database on
Linux as an fs. This is still in the development stage, and the development
team would like to receive comments on this. So please let us know.
" "Because you can," dammit. That's just plain awesome.
Amongst all these other examples, it's probably worth noting that SQL is a declarative language. Basically, it allows you to express the results -- without worrying about the procedure used to generate the results.
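To make the declarative point concrete, here is a minimal sketch in Python using the standard sqlite3 module (SQLite stands in for MySQL here; the table, columns and sample rows are invented for illustration). The first function spells out the procedure; the second only states which rows are wanted and leaves the "how" to the engine.

```python
import sqlite3

# Hypothetical sample data: (file name, size in bytes).
FILES = [("notes.txt", 1200), ("song.mp3", 4_500_000), ("photo.jpg", 900_000)]


def large_files_procedural(files, min_size):
    """Procedural style: spell out how to walk the data and filter it."""
    result = []
    for name, size in files:
        if size >= min_size:
            result.append(name)
    return sorted(result)


def large_files_declarative(files, min_size):
    """Declarative style: describe the wanted rows; the engine picks the plan."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (name TEXT, size INTEGER)")
    conn.executemany("INSERT INTO files VALUES (?, ?)", files)
    cur = conn.execute(
        "SELECT name FROM files WHERE size >= ? ORDER BY name", (min_size,)
    )
    names = [row[0] for row in cur]
    conn.close()
    return names
```

Both functions should return the same names for the same input; only the second one would benefit if the engine later gained an index on size.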
pooptruck
You're absolutely right that a prototype could be built using current file systems, but said prototype would be SLOW and eat a LOT of space. It's better to use appropriate data structures and algorithms.
And yes, file systems are databases; they're merely inflexible databases using ANCIENT technology. Not all databases are created equal.
-Billy
phexro!pyramid:~$ SELECT * from pr0n WHERE sex='f' AND species='goat';
--
Nice. However, first things first: any replacement for the current system has to start by doing all the things the current system does, at least as simply. This is the main reason I think 'cd' is a good command to include.
It's BAD to try for too much with the first release. If you'd like an 'object system', by all means prototype one using conventional directories; you'll decide quickly that it's little different from modern Unix (remember ioctl!). In other words, an overly complex solution.
We need a true file system, one in which ioctl isn't needed. See the latest plan9 OS for details.
-Billy
A while back (a year maybe?) Oracle announced their iFS product. Dubbed the Internet File System, it gave file system, IMAP, POP, FTP, and web access to the database through one common piece of software. I haven't had the chance to work with it, and it may still not even be available, but with files stored in the database and integrity enforced, it becomes extremely easy to track revisions, maintain lists, and perform searches and reports. It seems like wonderful technology that should be a part of every OS, but I'm curious about performance. Has anyone had any experience with iFS?
LOAD "SIG",8,1
LOADING...
READY.
RUN
Sorry for my bad formatting :-)
Well, the story about this topic was at /.
and the flames I got are still at /.
Look it up for yourself.
My point was: 90% of the people running Linux do not care about the GPL, OSD etc. They like free-as-in-beer software, especially if it's stable and the source is included.
You can forget my addition; I only wanted to show the attitude several people expressed.
a'o's
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
A filesystem is also an interface to a flat file. /dev/hda1 is a file with
a fixed size and FS is a method of organizing different files inside it.
--
Escher was the first MC and Giger invented the HR department.
You can almost always access BLOBs (or equivalent fields) from different languages and environments; the problem is that they're all different. DBI, JDBC, ODBC, etc... each one has its own gotchas. Being able to access these fields as part of a filesystem gives it the ultimate portability. Even shell scripts can quickly and easily access the database!
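A rough sketch of what "fields as files" could look like, in Python with sqlite3 standing in for MySQL. Note the assumptions: this is a one-shot export, not a mounted filesystem, and the `<root>/<table>/<key>/<column>` layout and the `export_table` helper are hypothetical, not anything the MySQL FS project has documented.

```python
import sqlite3
import tempfile
from pathlib import Path


def export_table(conn, table, key_column, root):
    """Write every row of `table` as a directory with one file per column.

    Hypothetical layout (an assumption, not MySQL FS's actual scheme):
    <root>/<table>/<key value>/<column name>
    """
    # Table names cannot be bound as parameters, so this sketch trusts `table`.
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    key_index = columns.index(key_column)
    for row in cur:
        row_dir = Path(root) / table / str(row[key_index])
        row_dir.mkdir(parents=True, exist_ok=True)
        for column, value in zip(columns, row):
            (row_dir / column).write_text("" if value is None else str(value))


def demo():
    """Export a one-row table and read a field back with plain file I/O."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users VALUES (1, 'doug')")
    with tempfile.TemporaryDirectory() as root:
        export_table(conn, "users", "id", root)
        # Equivalent to `cat <root>/users/1/name` in a shell script.
        content = (Path(root) / "users" / "1" / "name").read_text()
    conn.close()
    return content
```

Once the data sits in files like this, anything that can open a file can read it, which is the portability argument in a nutshell.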
Doug
Venn ist das nurnstuck git und Slotermeyer? Ya! Beigerhund das oder die Flipperwaldt gersput!
The part I found to be eminently cool was to think of things the OTHER way around -- don't just think of seeing your DB as files, think of seeing your files as a DB.
Imagine if you could move a bunch of Word 2000 documents (they ARE XML files, mind you -- you just need an OLE stream decoder a la wv to decode them) into a DBFS directory and provide the DTD for that datatype in a way that you could make SQL calls against the files (records) using the DTD as a table definition.
You could then move files TRANSPARENTLY in and out of the DB, using the DBFS and automatically index against them.
A few years ago, I was responsible for getting a bunch of documents on a website to be searched and sorted. I had directory upon directory full of procedure and process documents. One day, I found a program called Xerox DocuShare, (written in Python, BTW) and it used some features that were very similar to this idea.
Cheers,
Ken Crandall
The latest releases of MySQL *DO* support transactions...
All one has to do is program for a dynamic, RDBMS-driven website, storing (and retrieving) images on-the-fly to see what I mean, adding your point, above. Of course, websites aren't the only place this is an issue.
The "ultimate portability" and "Even shell scripts" points are really good.
AS/400, Palm-OS, Be-OS, Reiser-FS, etc., all do something like this now, as has been pointed out.
Of course, this now has to be implemented.
I think they said they'd never land a man on the moon, or something like that. Then, there was this group at NASA.
- Does MySQL have the same directed and well defined core of development that linux development had and has?[think kernel here]
- Is the current base of MySQL well written enough, with enough source infrastructure to survive eventual restructuring during concurrent feature enhancement?
- Is there, as there was with linux, little competition in similar projects offering a similar feature set that might attract more followers or be a better candidate than MySQL for development attention?
I'm not saying that MySQL lacks any of these. But there are tons of open-source projects that just needed to get a bit better and never did, because they never really were good enough at the source level. Linux is a lucky case, but take heart: if there hadn't been Linux, you still could have run most of the GNU system on BSD thanks to GCC. Lastly, I'm largely unaware of any Linux-only apps that actually make or break a user's choice to use Linux vs. any other Unix. I think what really makes or breaks the choice is price point and perceived momentum. Pauvre, pauvre NetBSD.
-Daniel
Well, no. It was too daunting for me. But, I'd like to.
I recently started making a database where I could keep track of all my photos - a "photo database," if you will. (It's here, if you are curious.) I didn't store the photos in the database - primarily because there isn't enough room on the database server. I numbered all the images by hand and serve them from my personal computer - using MySQL and PHP on the database server to access them.
Anyway, I want to organize my photos into groups - and maybe even subgroups. And I want the groups to be able to overlap. I haven't done this yet, because I don't have the time to (re)implement a file system inside the database! However, a "dbfs" seems to be exactly what I need for this task. It's close to what I envisioned.
"Pinky, you've left the lens cap of your mind on again." - P&TB
"I can see my house from here!" - ST:
I like! I like! Theoretically, you could get faster data access, since it would write to a raw device the way Oracle or MS SQL do.
..and that is why beos is just awesome.
I've recently been evaluating high end NAS/SAN (network attached storage arrays / storage area networks). The rep from netapp (makes high end NFS based devices) said Oracle is using their devices as the backing for their new E(whatever) service to compete with MS .NET. ... AND in the process, Oracle has forgone using their raw block access methods for, you guessed it, NFS-based Oracle database (to connect to netapp's netfilers) powering their own site. He said Oracle found they could map to netapp's high performance custom NFS file system and start to free up a huge engineering group devoted to optimized raw block access architectures.
Sounds pretty f'd to me. And not surprising to hear from netapp, given their whole universe seems to revolve around NAS (NFS) as opposed to SAN (fiber). Anyone out there actually running Oracle on an array via NFS ?
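How about:
char buf[] = "foo,bar,bally";
/* write() returns -1 on failure and sets errno; skip the trailing NUL */
if (write(fd, buf, sizeof(buf) - 1) == -1) {
    /* EINVALDATA is a hypothetical errno for rejected data, not a real POSIX code */
    if (errno == EINVALDATA) {
        fprintf(stderr, "Invalid data\n");
    }
}
The standard error codes (as specified by man 2 write: EBADF, EINVAL, EFAULT, EPIPE, EAGAIN, EINTR, ENOSPC, and EIO) don't really cover that scenario, but a return value of -1 from write indicates an error.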
My question is how to form a path to a row/column (and are you forming paths to rows or columns?) - would it be something like /db_name/table_name/column_name/value or /db_name/table_name/columns_primary_key?
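One possible answer, sketched in Python. This assumes the second layout from the question, extended with an optional column component: /db_name/table_name/primary_key/column_name. The `RowPath` type and `parse_db_path` function are hypothetical names for illustration; nothing here reflects how MySQL FS actually forms paths.

```python
from typing import NamedTuple, Optional


class RowPath(NamedTuple):
    database: str
    table: str
    key: Optional[str] = None      # primary-key value, if the path names a row
    column: Optional[str] = None   # column name, if the path names one field


def parse_db_path(path: str) -> RowPath:
    """Split /db/table[/key[/column]] into its components."""
    parts = [p for p in path.split("/") if p]
    if not 2 <= len(parts) <= 4:
        raise ValueError(f"expected 2 to 4 path components, got {len(parts)}")
    return RowPath(*parts)
```

With this scheme a table is a directory, a row is a subdirectory named by its key, and each column is a file inside it. Putting the key before the column keeps all fields of a row together, which is probably what a user browsing with ls would expect.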
You are in a maze of twisty little relative jumps, all alike.
As well, this would be fantastic for configurations (in particular the complex ones of Gnome and KDE) since large amounts of data could be elegantly compartmentalized in a standard way. I find this nifty with the growing complexity of filestructures in these config sets, they would be open to editing and updating through the standard filesystem method, or through a standard SQL query system.
You still cannot provide anything universal or that can be done by an end-user. Only having fs access to a db allows for this. First of all, name one universal BLOB that works exactly the same for all db's that support BLOBs (there aren't any). Name one standard SQL command that does all this. Name one standard piece of source code that works in all languages, all the time, for all OSes. Name one totally standard API/interface/protocol/whatever. None of the above can be done.
Unfortunately, BLOBs are not universal. Nothing works exactly the same way everywhere, all the time. And, let's just assume that you'll be using one DB on one OS all the time in one programming language, just to make things as easy as you claim it is. Things still aren't clean, since you will have to include code repeatedly in all your apps. Possibly, you may have to change your code if your tables change. Assuming you do everything the "right way" and use an interface such as ODBC, JDBC, or DBI/DBD, and assuming you write good OOP that is generic, you still cannot take that code everywhere to all apps all the time, even in the same programming language/OS. There will always be porting, adaptations, and recoding to get this to work everywhere with all your apps. In fact, everything needs to be planned so that all apps that will be using your code should follow the same conventions all the time.
To avoid this mess and to make life easier for end-users, we could mount the DB as a file system. This gives apps, APIs, libraries, OSes, end-users, etc. the ability to query, read, write, and modify data, even if the platform doesn't even support SQL! By mounting the DB as an FS, you give (nearly) all apps the ability to work on your data (where db data==files), just by being able to open and close files! This is the ULTIMATE layer of abstraction, making access truly UNIVERSAL. (Security restrictions/permissions still should apply, of course.)
There is absolutely NOTHING universal about what you suggest. Nor do all BLOBs work, even in theory, as you suggest. Nor do end-users benefit. Nor do all apps automagically get access to your data just because you wrote something in language a for database b for OS c to support program d.
Oracle 8i's IFS, Informix's data blades, MySQL-FS, PGFS, etc. all have been written by the db experts to address these deficiencies. Why do they disagree with most of the people on this post?
It's because they're right.
Section 14.2.2 has a discussion of File Systems versus Databases written by M. Satyanarayanan (of AFS and CODA fame). He says that although file systems and databases have much in common, there are several areas in which they differ conceptually, including encapsulation, naming, and the ratio of search time to usage time. Basically, file systems are appropriate when there is high temporal locality, while databases are used in situations where there is little locality and concurrent read and write sharing of data at a fine-grained level is required.
trollalicious.
Nothing about flickering, though.
Stating on Slashdot that I like cheese since 1997.
BeOS specs... scroll down. Basically (as anyone that's used BeOS knows), file information is stored in file attributes. I.e., an email is just a text document with sender, recipient, date, etc. attributes, with the body as the content of the text document. This is really useful for MP3s: every bit of an ID3 tag can be given its own attribute, and then you can search based on the attributes and such.
All of the high end RDBMSs use raw file system access. By not using a "regular" file system, you gain a huge performance jump. If MySQL is doing its own locking and recovery, then the overhead from the file system is wasteful.
Oracle has been doing this for years.
This is just another important hurdle that MySQL needs to clear before it is considered for high end applications.
-b
All it does is let you use filesystem calls to access the database. I mean, MySQL uses files that are managed by the Host System's filesystem. So, basically, all it's providing you with is another API that can access the database. That is nice, but I wouldn't necessarily expect anyone to use this for actual data storage. It would seem to me that to use this for the same purposes as a filesystem, it would only add overhead to the process, and provide limited - if any - added data storage benefits.
-------
Oh shit! I forgot to click "Post Anonymously"...
They even have no foreign key support, no subqueries... In my opinion they have some better things to do than this (although it might be useful).
Select * from /dev/hda1 where filetype="mp3" AND artist="moby" and bpm>120
:)
that would be SO cool
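For the curious, a toy version of that query in Python with sqlite3. Everything here is assumed for illustration: the metadata table, its columns, and the sample rows are made up, and a real implementation would have to pull filetype, artist and bpm out of the files themselves.

```python
import sqlite3

# Hypothetical file metadata: (path, filetype, artist, bpm).
SAMPLE_FILES = [
    ("/music/track1.mp3", "mp3", "moby", 96),
    ("/music/track2.mp3", "mp3", "moby", 128),
    ("/music/track3.mp3", "mp3", "someone else", 140),
    ("/docs/readme.txt", "txt", None, None),
]


def find_files(rows, filetype, artist, min_bpm):
    """Return paths whose metadata matches, sorted by path."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files (path TEXT, filetype TEXT, artist TEXT, bpm INTEGER)"
    )
    conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", rows)
    cur = conn.execute(
        "SELECT path FROM files "
        "WHERE filetype = ? AND artist = ? AND bpm > ? ORDER BY path",
        (filetype, artist, min_bpm),
    )
    paths = [row[0] for row in cur]
    conn.close()
    return paths
```

Rows with no artist or bpm simply never match, since comparisons against NULL are not true in SQL.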
---
And, I'm sure somebody out there probably thinks the whole directory structure is too relational, or db-like, already. Give me a break!
Yes, a file system puts flat files inside a flat file; drives are more logically mapped anymore than they are physically mapped; RAID, LVM, and partitioning in general divvy that up into pieces; databases put relational data into a bunch of indexed flat files; BLOBs are nothing more than flat files stored in databases.
So, what is a drive or a partition or a file or a database? The lines are already so blurry you can't tell the difference, anymore, and you wouldn't want to, unless you want to go backward!
Hey, that's MySQL we're talking about here. Why do you even care about relational integrity?
I strongly believe that trying to be clever is detrimental to your health. -- Linus Torvalds
http://www.etoyoc.com/odie
Don't Panic
This sounds vaguely similar to how PalmOS apps work.
The entire filesystem is based around the idea of a database, where memory chunks are accessed based on the name of the app, etc....
Warning: I'm human. Sometimes stuff I post here is wrong. Use your head. Question authority
this must be one of those times. without some way of querying your file system (except for ls), you lose the relational aspect of the database. then you just have a filesystem that stores metadata for fast recovery. this is good from a fs point of view, but it does nothing to help you find the files you are looking for. it provides no relations between files, and does not store file descriptions (that are useful to a human).
eg. this is an image file that can be categorized under political, humor, bill clinton, letch, etc.
use LaTeX? want an online reference manager that
-- john
OK.. but which is it? You first say if you were to measure it to the "micro second, some difference might be noticed". Then three sentences later you say there is "no difference". You seem confused, and you were right the first time.
There is a difference (however minute) and it's due to BeOS not actually having the ability to create fields at the FS level. It's trivial to create file examples that show off the flaws in their method and make performance suffer. There are filesystems that do it better, through better architecture. Realise this and you'll see my point - that BeOS can improve its so-called meta FS.
ps. I've used BeOS for several years now on PPC and x86, programmed for it, and you're probably using some of my apps (here's a hint - Doublin). Oh, and there's much more anger and arrogance without fact or links or proof than anything I'm putting out, believe me.
-- Eat your greens or I'll hit you!
In my opinion it would make more sense to define a general mapping of (R|OO|XML)DBMSs into the filesystem.
If MySQL is only a prototype for that, this is fine! However, if this is meant as an improvement for MySQL, I doubt it is the right step currently, as there are a lot of more demanding features, like locking, caching and general performance in multiuser environments.
And of course it would make sense to be able to describe table mappings from existing standard Unix configuration files like /etc/passwd into a meta level (the description) to examine/query normal file content with SQL.
e.g.: select user, shell from /etc/passwd where shell != /usr/bin/sh
or better: echo "why don't you use bash?" | mail `select user, "," from /etc/passwd where shell != "/usr/bin/bash"`
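A minimal sketch of that idea in Python, with sqlite3 as the query engine. It loads passwd-style text (seven colon-separated fields per line) into a table and runs the "who doesn't use bash" query against it. The sample text is invented, the helper names are hypothetical, and the real /etc/passwd is never touched.

```python
import sqlite3

# Invented sample in the usual passwd layout:
# user:password:uid:gid:gecos:home:shell
SAMPLE_PASSWD = """\
root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice:/home/alice:/bin/zsh
bob:x:1001:1001:Bob:/home/bob:/bin/sh
"""


def load_passwd(text):
    """Map passwd-style text onto an in-memory table named passwd."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE passwd "
        "(user TEXT, password TEXT, uid INTEGER, gid INTEGER, "
        "gecos TEXT, home TEXT, shell TEXT)"
    )
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) == 7:  # silently skip malformed lines in this sketch
            conn.execute("INSERT INTO passwd VALUES (?, ?, ?, ?, ?, ?, ?)", fields)
    return conn


def users_without_shell(conn, shell):
    """Equivalent of: select user from passwd where shell != <shell>."""
    cur = conn.execute(
        "SELECT user FROM passwd WHERE shell != ? ORDER BY user", (shell,)
    )
    return [row[0] for row in cur]
```

The "table description" here is hard-coded in the CREATE TABLE statement; the meta level proposed above would presumably let you declare it once per config-file format.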
Regards,
angel'o'sphere
I've seen presentations about Placeless Documents and it's really cool.
A: Flat files stored in a database!
Q: What is a database?
A: Data stored in indexed flat files!
Q: What is an index?
A: More data stored in flat files, or a database of metadata that relates to the order of data!
Q: What is a file system -- Unix/Linux?
A: Flat files stored in a database (that's why it's called a file system) -- with at least one flat file, such as /dev/hda1!
Q: What is a hard drive (present day terms).
A: A device that logically maps data on a physical medium, or a database of sectors!
Q: What is version control, such as CVS?
A: A database of changes to flat files.
Q: Why don't (many) people get this?
A: Maybe they haven't been a DBA, haven't taken a class about modern operating systems, haven't developed a dynamically-generated website with images, haven't been a frustrated programmer dealing with dynamic and static data simultaneously, don't organize data well, haven't read about new FS'es (such as Reiser-FS), haven't used asset management software, think there really is a distinction between filesystems and db's, store all data in flat files, etc.
The BeOS filesystem has a meta field along with the conventional name/date/etc. fields for the FS. I'm not talking about high-level constructs that abstract and hide what's actually taking place - I'm talking about the FS and how it saves data. It has to parse the meta field. It can't actually add fields to the FS - although it acts that way.
Congratulations on the +4 though - good work.
-- Eat your greens or I'll hit you!
Could someone do me a favor, and explain exactly what an SQL is?
I'm kind of confused at this point, and haven't been able to figure it out from readers' comments.
Thanks
Hmmm. Interesting concept. Not sure what the use would be. 'where' would be easy to implement in SQL, though.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
That sounds similar to the inherent capabilities of the BeOS filesystem.
WTF? The was closed on preview! Grr, I hate Slash.
You are in a maze of twisty little relative jumps, all alike.
Dude, he didn't say that he didn't want _any_ way to query a file system. He just said that some might not want a full-blown SQL interface to a file system.
"UNIX" is never having to say you're sorry.
This is a great idea if it's implemented well. The AS/400 is an example of a system that was entirely implemented around the idea of a full-featured DB implemented as a filesystem.
...and since it still has one of the best uptime records in the industry, and transaction processing times that consistently rank in the best-of-the-best lists, it's a good platform to imitate. Too often it's overlooked because of the green-screen terminals, but at its core, the AS/400 is easily one of the most advanced implementations of computer technology available to the general public.
Well, SQL won't work -- I just don't need (or want) the slowdown from interpreting and abstracting SQL commands -- and yes, I want *all* the speed I can get.
I am looking for something way lower level. ReiserFS isn't a bad solution; I just don't believe that their plugin API is mature enough to base another project on top of -- or am I wrong?
I don't understand how you can say this without knowing the benefits of this venture. I have not seen very many cases (although there are some) where someone takes a good working program and decides to spend a huge amount of time and effort working on something that is detrimental to the project. I don't think that companies like Oracle would go ahead and pursue a "brain-dead" project just for kicks. Perhaps vast leaps to judgement should be withheld until a somewhat informed decision can be made.
mfkap
http://foldoc.efnet.org/cgi-bin/foldoc.cgi?SQL
quote "According to Allen G. Taylor, SQL does *not* stand for "Structured Query Language". That, like "SEQUEL" (and its pronunciation /see'kw*l/), was just another unofficial name for a precursor of SQL. "
Vermifax
Good point -- didn't think about being able to use the "Find" function of an OS.
However, how is this better than a dedicated web app for HR flacks?
Find your favorite non-computer-literate person and see if they even KNOW that there's a "Find Files and Folders" in their Start menu? (I'm assuming a Windows-centric office here)
Like I said, I think it's cool and potentially useful, but probably not as useful for non-nerds
Potato chips are a by-yourself food.
I'm not sure what you mean when you say 'statically' or 'dynamically'. I suspect, however, that you're assuming that the foundation of a fluid file system is a set of files, directories, and links. It's not. It's almost certainly a relational database, one optimised for the task of getting a set of items (objects, files, whatever) which are categorised under a given set of categories.
I also don't know what you mean by 'number of possible categories'. I think you're mistaking 'categories' for 'sets of categories'. In my example, "/etc/wtanksle" is a set of two categories; "etc" is a category. I could see some reason to cache the results of category queries; that's an optimization concern, and not my specialty. I don't see any reason to try to precache all possible queries, as you seem to imply.
Its speed will be almost irrelevant; I predict that it'll be about as fast as the current system, but even if it's hundreds of times slower it'll still be fast enough, since caching is trivial and looking up a file based on a full filespec is almost never done.
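To illustrate the "a path is a set of categories" reading, here is a small in-memory sketch in Python. The `CategoryStore` class is hypothetical, and a plain dictionary of sets stands in for the relational database described above; the point is only that "/etc/wtanksle" resolves to the items filed under both categories, in whatever order they are written.

```python
from collections import defaultdict


class CategoryStore:
    """Toy category index: each category maps to the set of items filed under it."""

    def __init__(self):
        self._by_category = defaultdict(set)

    def tag(self, item, *categories):
        """File `item` under every given category."""
        for category in categories:
            self._by_category[category].add(item)

    def lookup(self, path):
        """Treat each path component as a category and intersect the matches."""
        categories = [p for p in path.split("/") if p]
        if not categories:
            return set()
        # .get() avoids creating empty entries for unknown categories.
        sets = [self._by_category.get(c, set()) for c in categories]
        return set.intersection(*sets)
```

Caching a result would amount to remembering the output of lookup for a given set of categories, which fits the remark that precomputing every possible combination is unnecessary.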
-Billy
And still up to today I do not use any.
License issues never were a reason for me to use Linux.
Most people who currently use Linux, especially those who pay $50 for a CD distribution, really do not care about the license.
They care about:
you can install it company-wide for no extra charge (well, the only licensing issue)
The only reason for me was: it's UNIX and it is free in terms of beer, and it runs on 486 architecture, and the GNU file utils ran on it as well as RCS.
Regards,
angel'o'sphere
BTW: once there was a huge rant about Troll Tech releasing a new version of Qt, and for Linux also one under the GPL. Troll Tech wrote somewhere: "... we did not emulate the slow and flickery refresh of GTK" (or was it GNOME?)
This led to a huge controversy.
All I can say, you geeks: you must have very expensive machines.
IT FLICKERS. IT'S SLOW. Like hell.
Whether it's a good application I do not know, as I found the user interface VERY confusing.
At the moment the proposed filesystem is read-only, and that means locking is not an issue. Besides, the primary purpose of the proposal seems to be to provide a more convenient interface for very simple databases. So it will fit perfectly with MySQL, making it even simpler to use and also even more feature-poor. In my opinion it's a better interface to MySQL than its existing brain-dead subset of SQL.
Are you the party soliciting feedback for the MySQL File system?
If so, I have an interesting file structure scheme that may help you notationally translate tables to files and directories in a logical fashion.
I call it "the Woods Hypertree". In it you have two separate delimiters for two different parent/child relationships.
One relationship is hierarchical, like a standard file system: B is a subordinate of A
The second denotes a structural relationship, like a set: B is a component of A
There's more info, and a C and TCL implementation on my website: http://www.etoyoc.com/odie
Thanks if I tagged the right person, Sorry if not, and could you please forward it to the right folks.
Sean Woods
Senior Network Engineer
The Franklin Institute Science Museum
swoods@fi.edu
xcyber: "Complexity for the sake of complexity is not a solution, neither is simplicity for the sake of simplicity"
Or ask your mommy.
Blar.
I attended an Oracle conference earlier this year, and saw a presentation of their iFS. Let me tell you, I had been thinking of something along the lines of it for quite some time and it was very cool. While I'm not so sure I would want to use Oracle (expense, size, etc.), they have put quite a bit of work into this and it ROCKS.
All access to files is accomplished through a middleware "plugin" which exports the data out in different formats. (FTP, SMB, etc) All your data is stored in tables in the DB. Every file can keep revisions of itself, and every file can have arbitrary attributes attached to it. It is also quite fast for BLOBs and CLOBs (the demo streamed a 100mb MPEG from the FS via SMB).
It was a little shaky still at that point, but I would imagine that they have worked out a lot of those bugs in the last 4 months. If I remember correctly, they will ship it with 9i or something.
The copper bosses killed you, Joe. 'I never died', said he.
Would the kernel be anywhere near where it is today if people hadn't gotten others interested by writing intriguing, linux-only apps? Probably not.
Your analysis is wrong.
Most anything shipped on linux distro is nothing more than a Unix program PORTED.
Unless GCC, X and others are 'linux only' apps.
If it was said on slashdot, it MUST be true!
Aren't all file systems truly just databases? FAT is just a table of file pointers and physical locations, isn't it?
Oracle is doing (or has already done?) this with an implementation of their database server.
Heck, I think that MS should run NTFS (and probably Active Directory) off SQL in some way.
Makes sense.
I'd like to encourage people to think more along the lines of "what if" than the "that's ridiculous" attitude... :)
-Chris
Early versions of BeOS did use a database FS for the entire system, but they dropped it by R4 (I think that's the right version) because of performance issues.
-- Eat your greens or I'll hit you!
Please explain what commit and rollback are.
Commit is what you do to a felon. Rollback is what your car does when you forget to turn the wheels and pull the handbrake on a hill. Rollbacks can lead to commits if your car accidentally hurts someone.
In a transactional database, they do what they sound like. Commit "commits" data to the table and "rollback" throws away your changes. Saved my bacon a few times. And possibly my job.
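For anyone who wants to see it happen, a minimal sketch in Python using the standard sqlite3 module (SQLite rather than MySQL, and the accounts table is made up for the example). With sqlite3's default settings, a transaction is opened implicitly before the UPDATE, so rollback() can undo it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100)")
conn.commit()  # make the starting balance permanent


def balance(name):
    """Read the current balance as seen by this connection."""
    row = conn.execute(
        "SELECT balance FROM accounts WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


# A mistaken update: visible inside the open transaction...
conn.execute("UPDATE accounts SET balance = balance - 500 WHERE name = 'alice'")
after_update = balance("alice")

# ...until rollback throws the uncommitted change away.
conn.rollback()
after_rollback = balance("alice")

# A committed change survives a later rollback, which has nothing left to undo.
conn.execute("UPDATE accounts SET balance = balance + 25 WHERE name = 'alice'")
conn.commit()
conn.rollback()
after_commit = balance("alice")
```

The middle step is the bacon-saving one: the bad update never reaches the table unless someone commits it.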
Without you I'm one step closer to happiness without violence.
Much of the ultimate point of ReiserFS is the marriage of databases and filesystems (filesystems are really just a limited sort of database anyway). This is the reason for all the commercial funding; there are people out there who really want this.
See Hans Reiser's White Paper for information on where he's going with this.
For what it's worth, database filesystems are not a new thing at all. Hans is just planning on accomplishing this in a way that completely preserves the Unix file metaphor and related concepts.
DNA just wants to be free...
The 3.23.xx releases of MySQL do support commit and rollback. What's more interesting is the ease with which you can run MySQL DBs over multiple servers - this could make a great foundation for a simple, fast distributed filesystem...
A database is about data. The data is partitioned into tables and columns, with a large number of additional constraints (unique, primary key, foreign key and check clauses, for example) to limit the values of the data. Additionally, the data is strongly typed. In order to access this data, SQL supports very high level commands, like SELECT, INSERT, UPDATE and DELETE.
The power of a database is most basically in its very high level nature. You, as the user/programmer, do not care where the data is, who else is using it, how it is stored, or what the old values were. The database management system takes care of all of that. Other powerful features of databases include indexes, joins, subselects, real NULLs, aggregate (set) functions, and GROUP BYs (sub-setting).
Now, contrast this with the low level file/directory structure. In this, you have a hierarchy of directories, each of which contains one or more files. A file is nothing more than a stream of bytes, and the only constraint files have is that they be uniquely named within their directory. Also, a single file can be in more than one directory.
In order to use a file, the programmer must know where the file is, possibly who else is using it (with lock files, for example), what format the data is stored in and, if they want to be able to undo their actions, the old values. The advantages of files are the plethora of tools for manipulating them (at least in the case of text), and lower startup cost (e.g. it takes less time to make a stupid file format than a SQL schema).
This project is therefore brain-dead as an application development platform. 'But,' I can hear the reply, 'it's useful for users who want to change the data in the database.' Reply: every database accepts SQL, which modifies the data. Some SQL APIs I've seen only take two lines of code to retrieve some data. And SQL won't shit on your data if you accidentally type it in in the wrong format; it'll complain, but your data will be safe and secure.
This is quite possibly the worst idea I've ever heard. Worse than Linux as an Internet Explorer plugin, worse than Napster as a family tree generator, worse than Quake III as a spreadsheet, and even worse than Apache as a VMS shell.
Not that I have anything personal against it.
Yes, I'm still a junky. Are you still a bitch?
I've used a system that sounds a lot like this. It's called "ClearCase" and it's a configuration management system.
It manages files in a big database and tags them with "versions". To modify a file, you check it out and to save it to the database you check it in. Nothing particularly innovative here. The cool part is how you "see" the files you care about.
First you create a "view". You can have as many of these as you want. Each view is configured to select files based on a set of rules. The rules are simple, like LATEST (most recently checked in) or VERSION 5 or DATE(1/16/2001). The rules can be applied to all files in the system or to specific files or directories.
What this means is that you can have multiple open views, list a specific directory in each of them, and receive a different listing for each view simultaneously. The performance is very good across a LAN (the file system is server based).
You are able to attach other metadata to each file or version of a file. Attributes can be created and values assigned.
The configuration language is not SQL and is not designed to be very dynamic. But it is extremely useful for development project management.
See Praedictus' ERDB product. It is not free (libre or gratis). It is a 'relational database' that resides in memory (the entire DB is in RAM at all times), and it is very fast. I presently have a DB used for Statistical Process Control in a production facility. So far I'm very pleased with its stability and the support from the developers...
They publish a C API for interaction w/ the 'DB'...
I know you said 'Open Source' - but I thought I'd bring this to your attention...
This sounds really cool, but it seems there could be some problems with implementation. If you build category listings dynamically, this drastically slows down tasks like a simple directory listing (or even locating a file by name), because you start having to do searches. Of course you can speed this up with good indexing, but you still have to pull those indices off the disk and do a fair amount of processing.
You might be able to build some of the categories statically, but if your fs is truly fluid, then the number of possible categories is gonna be too huge to build and maintain statically. Maybe it needs to be a little less liquid, or maybe you can find a way to identify commonly accessed files/categories and build that stuff statically, then do everything else dynamically.
I also think this needs to integrate with rather than replace a traditional fs. I doubt this method will ever be as efficient in terms of looking up, creating, and deleting files as a traditional fs, so it would be bad for system stuff like /bin, /temp, etc. OTOH, it would be great for home directories where the user is mostly storing documents and a relatively minor performance hit isn't noticeable.
Ken, you can do a lot of this right now using the Exchange 2000 server. You'll be able to do a lot more of it with the upcoming Sharepoint server (beta Tahoe). It's wicked cool. Beats the stuffing out of DocuShare.
The early versions of BeOS used a separate database (not very complex) and filesystem, which wound up being very difficult to work with, so eventually they merged the two. The "database" aspects of the BeOS filesystem are more of being able to add (relatively) arbitrary data to particular filetypes, and do searching based on those criteria. It isn't a formal database in any sense of the word.
Versions of BeOS prior to the Preview Release had a file system and a separate database. Because it was difficult to keep the data in the two separate systems consistent, it was decided that they should merge. This happened in Preview Release 1, and BFS remains relatively unchanged today.
At the time there was a lot of enthusiasm for the merged design to be a database-based file system, but after a lot of research, Dominic Giampaolo, the engineer doing the design and coding, determined that wasn't going to work. The reason is it becomes too difficult to filter out the files you aren't interested in. There is a lot of organizational value in a hierarchical, structured, traditional file system.
The design for BFS that was implemented is best described as an "attribute-adorned file system," with a query engine that can search against the attributes, and some indexing to make common queries fast. There's a fairly simple query language (along with simple GUI tools), but it's not as complex or capable as SQL (nor would you really want it to be). You can execute those queries from the command line if you want, which can be pretty useful when piped to another program (much as find is in Unix, but simpler to work with).
You... are very twisted, as is the creator of that website. When will people grow up and stop pasting that SHIT, it's not funny.
ascii 0 => \0
ascii ' => \'
ascii \ => \\
Then you just insert the damn thing into the blob-field like any other data:
insert into pictures values ('bill', 150,60,'actual-jpg-file-here')
So sure. Big deal. If you think this is complicated and/or hard you should find some /even/ easier job.
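For the curious, the three escapes listed above can be sketched in a few lines of Python. The helper names `escape_blob` and `build_insert` are made up for this illustration, and the `pictures` table is just the one from the example statement. Most client libraries these days take the binary data as a bound parameter and do the quoting for you, so treat this as a sketch of the manual approach and not as the recommended one.

```python
def escape_blob(data: bytes) -> bytes:
    """Apply the three escapes from the comment: backslash, NUL, single quote."""
    # Backslash must be handled first, otherwise the backslashes added
    # by the next two replacements would themselves get doubled.
    data = data.replace(b"\\", b"\\\\")
    data = data.replace(b"\0", b"\\0")
    data = data.replace(b"'", b"\\'")
    return data


def build_insert(name: str, width: int, height: int, image: bytes) -> bytes:
    """Build the raw INSERT statement (hypothetical 'pictures' table)."""
    return b"insert into pictures values ('%s', %d, %d, '%s')" % (
        escape_blob(name.encode("utf-8")),
        width,
        height,
        escape_blob(image),
    )
```

The statement comes back as bytes because the image data is not text; it would be sent to the server as-is.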
NTFS5 is a real database, though it may not be a real management system.
--
It's called the desktop database. It allows one to do cool things such as change the icon for a single program document (no messy MIME/extensions), store comments on files, and all other sorts of GUI goodness. And it makes searches really fast. Of course, it's probably a little crufty (being over 16 years old; I think it's a flat database), but I think it's definitely time linux had a feature such as this.
Damn, you...both of you stole *my* idea! ;)
For a long time now I've been thinking about filesystem-as-database concept. We've passed the point where computing is about optimizing hardware resources. It is now about optimizing *user* and *information* resources. If your hardware is blazingly fast, but you are lost in a sea of irrelevant information, you can't do anything. I think that's where the database/meta-filesystem comes in.
With all this rich content around, we should not be searching for files based on some arbitrary linear categorical name. We should be searching on *attributes*. We should be searching on *association*. E.g., "List all files relating to my work that I have stored on my home computer", "Now, of those, show me all files that pertain to status reports". Or "List all data I have on the artists and bands in my music collection". etc.
This is where plain, flat, hierarchical file systems fail. We need basically a data "repository", and various ways of obtaining information from that repository, based on attributes, categories, mime types, relation to *other* files, etc.
It's 10 PM. Do you know if you're un-American?
NTFS 5.0 already does this as well, with alternate data streams and native property sets (COM object attributes like author or title, or even custom properties).
The fact that it can be indexed so both content AND properties are searchable sounds a lot like a database to me.
Bruce
Bruce Perens.
In my vision, 'documents' would be categorised, and the categories could be viewed in a manner very similar to how we now view directories, except that a file is in more than one folder at a time. A file which is named /etc/wtanksle/ppp.conf could also be referred to as /wtanksle/etc/ppp.conf, or if it's unambiguous, /etc/ppp.conf. /dev/removable gives the list of all removable devices; /dev/scsi gives the SCSI devices (including the removable ones).
The potential uses are many -- I think it would make a lot of common computer tasks a lot easier.
Oh well -- anyhow. :-)
-Billy
Creating the whole FS around the DB could make the whole thing a lot more consistent. If you know that the FS will be used for DB only, why not choose file management methods that fit? This would enable you to have records as your basic file unit as a built-in structure. Sounds like a great idea to me.
~ "When I'm of that age I'm just going to live up a tree."
Instead of doing it manually, use RipEnc to rip the CD and fill in all the attributes (and ID3 tags, too), based upon a query to FreeDDB.
Bill Clinton: Pimp we can believe in. - The Shirt!!!
Take the linux kernel. Would the kernel be anywhere near where it is today if people hadn't gotten others interested by writing intriguing, linux-only apps? Probably not. Perhaps one day MySQL will evolve to the point where this will be useful, perhaps due to developers attracted by this project.
"The question of whether a computer can think is no more interesting than that of whether a submarine can swim" -EWD
Exposing the Table Structure as directories and data as files (or some mix of that) would be really nice. Really Really Nice...
:-)
Esp. if you can use things like find, grep, sed and all on them.
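A rough Python sketch of what "tables as directories, data as files" could look like, using the standard-library `sqlite3` module as a stand-in for MySQL. The layout (`<table>/<first column value>/<column name>`) and the function name `export_table` are assumptions made for this example, not the layout the MySQL FS project uses, and this is a one-way snapshot, not a mounted filesystem.

```python
import os
import sqlite3


def export_table(conn: sqlite3.Connection, table: str, root: str) -> str:
    """Write every row of `table` as a directory with one file per column.

    Assumes the first column identifies the row (e.g. a primary key)
    and that `table` is a trusted identifier.
    """
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [desc[0] for desc in cur.description]
    table_dir = os.path.join(root, table)
    for row in cur:
        row_dir = os.path.join(table_dir, str(row[0]))
        os.makedirs(row_dir, exist_ok=True)
        for column, value in zip(columns, row):
            with open(os.path.join(row_dir, column), "w", encoding="utf-8") as fh:
                fh.write("" if value is None else str(value))
    return table_dir
```

Once exported, something like `grep -r Smith users/` or `find users -name LAST_NAME` works on the tree like on any other directory.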
Palin...
While it isn't a true database, it has the most practical features. Yes, new attributes can be added, but it takes a couple of command line manipulations, which take like a minute, then bang!, new, fully searchable attributes.
"One man's "magic" is another man's engineering."-- Robert A. Heinlein
The everyday user won't exactly go nuts over it, though.
The site gives the example "imagine marketroids browsing through the directories to directly access columns and entries" (or words to that effect)
No way. Hey, don't get me wrong, I LIKE that idea, and it gives me a pretty cool idea for a couple of projects that I'm working on, but think carefully about it: any sufficiently useful database for a large company is also sufficiently large that a directory tree is absolutely the slowest and most confusing way to access data held within a database.
For example, let's look at two examples:
It's not bad, but it's not as good. Plus, with good programmers (and good communication between programmers and management), the SQL is so abstracted out, it makes no difference. It gets condensed to a list of names and a checkbox next to the names. Those that get "checked" get a raise to $100,000.
To be truly useful to non-programmers (or non-analytical thinkers, if you will), the MySQL-FS would have to abstract out so much of the Database, you're back to a filesystem and a set of scripts to update a MySQL database.
It's cool, but it's not for your regular joe. Beyond a couple of levels, the average computer user gets lost in a hierarchical filesystem -- assuming they don't fill it up with "Untitled Folders" and such.
Potato chips are a by-yourself food.
The Exchange 2000 server does this now. Out of the box it supports sharing the entire server tree of folders and mailboxes as network mountable drive volumes. You can directly read/write/delete to any of the folders (or items) in the Exchange server. Provided you have permissions, of course.
This is because NT supports Installable File Systems. The Exchange server links into this and thus allows anything that can access files to see the data. It's based on WebDAV.
--
Thanks
Bruce
Bruce Perens.
You can compile your SQL statements at the beginning of your program, so that they aren't reinterpreted later. Thus, unless load time is essential to you, you may be better off with an existing database.
Regarding the ReiserFS plugin API, you're probably right. However, you don't necessarily need plugins if your project is simple enough. That is to say, if all you're doing is associating a set of data with a key, you make a file (named by the key) and put the data in. Need multiple keys? Use symlinks.
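The file-per-key idea with symlinks as extra keys fits in a few lines of Python. This is a minimal sketch on an ordinary filesystem, nothing ReiserFS-specific; the function names `put`, `alias` and `get` are invented for the example, and it assumes a platform where unprivileged symlinks work (Linux, basically).

```python
import os


def put(root: str, key: str, data: str) -> None:
    """Store `data` in a file named after its primary key."""
    with open(os.path.join(root, key), "w", encoding="utf-8") as fh:
        fh.write(data)


def alias(root: str, key: str, other_key: str) -> None:
    """Make `other_key` a second name for the same record via a symlink."""
    # The link target is relative, so the whole directory can be moved.
    os.symlink(key, os.path.join(root, other_key))


def get(root: str, key: str) -> str:
    """Look a record up by either its primary key or an alias."""
    with open(os.path.join(root, key), encoding="utf-8") as fh:
        return fh.read()
```

The obvious catch is that nothing keeps the aliases consistent: delete the primary file and the symlinks dangle.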
If your project is of some size, lightweight file support will likely be done before you are (it certainly will if you throw Reiser some money -- he funds his team that way).
Really, though, I think SQL is almost certainly your best option. The hashing and caching done by most modern databases more than makes up for whatever speed is lost to SQL support -- and once again, that speed loss is a load-time thing only if you write your app correctly.
Unix: "Everything is a file" Linux: "Everything is a file except for the files, which are records."
Go mySQL, go! Having said that, such a filesystem/operating system exists in OS/400, the native operating system for the IBM AS/400 (now known as the i-series e-server400 or some such crap). It's probably the most bad-ass box ever made that's hardly known (user satisfaction and loyalty are so high that IBM doesn't seem to feel the need to market it much at all). Hundreds of thousands are in use at businesses around the world, wherever reliability, scalability and heavy database work (DB2) are needed (e.g. Las Vegas, hotel chains, reservation systems, financial systems, etc.). It's been around since approximately 1988, with the IBM System/34 and System/36 as its precursors. It's constantly updated to meet the demands of a technologically changing world. It's got one of the fastest implementations of Java around. It can potentially be one of the most secure boxes around. It's even going to be running Linux (and will be able to do so in many, many partitions on a single box). I fondly remember when I went from a 32-bit CISC-model AS/400 to a 64-bit RISC-model, well before any other major systems were offering a 64-bit world -- ALL WITHOUT ANY CHANGES TO THE ORIGINAL SOFTWARE RUNNING ON THE BOX! I know there are other AS/400 users who read Slashdot (I've even talked to a few). We often chuckle about stuff like this. Sometimes we just gotta release.... :-)
Wow.. i'm convinced. Certainly much evidence be had there!
Furthermore, when he says you can define your own fields in the FS, that was from an option in a menu (this is far too high-level). The fact remains that the filesystem doesn't have definable fields, though it pretends to.
What's the difference if it acts like it does and is transparent to the applications? Easy. In simple situations (such as putting an MP3's ID3 tags in the FS) this carries little performance hit. But if you were to put 100 meta attributes in the FS and store various sized chunks of data, the performance degrades very quickly.
You just cannot read the meta as fast as the other types - oh gee, and I wonder why?
I mean it's an interesting hack but that's all it is. Genuinely adding fields to a filesystem would be impressive and as fast as any other FS request and.. well, it's been done. But not by BeOS.
-- Eat your greens or I'll hit you!
Or D3 or whatever it's called now...
Why is there only one Monopolies commission?
I don't really understand what you mean when you talk about how hard it is to handle BLOB fields...
Now I have actually never done this with _MySQL_... but I have written a bunch of perl scripts (running on Linux machines) accessing Microsoft SQL Servers (using DBD::Sybase with FreeTDS libraries if you're interested) and it's really easy to insert or select data from BLOB fields (or IMAGE fields or whatever Micros~1 feels like calling them).
I can't really see why this would be any harder with MySQL (which I also work with, but on other projects, where we currently do not need to store "non-text data", or whatever the contents of a blob field should be called)...
--
Ner lbh sebz gur HFN? Gura lbh'ir whfg ivbyngrq gur QZPN!
Hmm... yes, Reiser may be a way to go. Some benchmarks are in order. But, alas, SQL-based DBs are still too slow for what I am planning. SQL commands/queries etc. may be interpreted to some intermediate language/bytecode, *but* the real slowdown comes from the abstraction layers needed to support SQL queries and the like.
:-)... That's why most high-end datamining applications don't use RDBMSs...
Again, for a normal application you're absolutely right. But if you want to push/crunch a few GBs around in a coupla minutes, every little slowdown counts.
Well, I am glad you're happy, but just about anything implementing a b-tree or skip-list implementation exclusively in RAM will get blazing speeds. The problem of course is, what happens when your application's needs exceed practical RAM sizes (say 7-8GBs these days)?
I think a well-balanced solution with cache and FS-level access (ReiserFS maybe, in a coupla years from now) will do better. Although, I am really more impressed with SGI's XFS.
The AS/400 uses a relational database as a universal data store for all system, application, and user data resources. The database is protected with very fine-grained access privileges and managed with well-defined administrative tools, which dramatically boosts security (since there is only one global security mechanism to manage all system and application resources).
This approach also simplifies development, which helps to make the AS/400 such a powerful application engine.
No, not quite that low-level :-)... B/B* tree implementation and the ability to handle well over 2GB of data comfortably (speed wise) is also a must --say around the neighborhood of ~1TB. Multi-user capabilities are also good, and ACID would be cool, but not a must.
It is far from a given, that all big commercial databases run without filesystems.
The Oracle DBAs here have learned the hard way the value of having something like Veritas filesystem under their databases.
Even on AIX, the Sybase DBAs prefer to run over LVM.
It's a different world when databases approach 1TB.
The abstraction layers on *your* end or that of the database? The former don't need to exist (one word: "inline") and the latter have been optimized very, very heavily.
Not all SQL-based databases are alike. If you have the hardware budget for a {SMP,clustering,mainframe} system, a good RDBMS will take advantage of it -- something which might not be said of solutions optimized to perform well on lower-end hardware.
So, yes -- do your benchmarks, on hardware comparable to what you'll be using for your actual production system. And don't count SQL-based DBs out yet; I would be entirely unsurprised if the overhead which makes them flexible is more than made up for by the heavy optimizations done elsewhere.
And... as for the thing with "cat 'SELECT blah FROM blahtable' >/mnt/mysql/queries/testquery"... first of all I don't understand your use of the cat command... but whatever...
Here's what I've been doing now and then for a long time:
"echo 'SELECT blah FROM blahtable' | mysql blahdatabase -p >/mnt/mysql/queries/testquery"...
(Isn't that about what your thing is supposed to do?)
--
Ner lbh sebz gur HFN? Gura lbh'ir whfg ivbyngrq gur QZPN!
- ReiserFS provides a way that you should be able to efficiently build a DB hierarchically as a set of directories and files, where files are the "leaf nodes" that contain field data, and where you might use symbolic links to represent secondary indices.
- In contrast, MySQL provides a way of representing "structured data," with "strongly typed fields." And the filesystem view provides a convenient way of looking at that data.
In effect the ReiserFS approach is to provide a way of building "weakly-typed" hierarchical databases; MySQLFS provides a way of putting a conveniently-browsable hierarchy on top of a strongly-typed relational database. It would provide pretty "weak typing" of a sort of TCLish style where "everything is a string, sort-of."
There are probably a lot of useful applications out there that wouldn't care much about the distinctions. That probably parallels the way that a lot of applications out there don't really care that MySQL does not satisfy the ACID properties or offer triggers, foreign keys, or other such things.
It also might be regarded as parallelling the way that Lisp-like languages have "strongly-typed data" with dynamic typing, which is a bit the way ReiserFS might be used, whilst "MySQLFS" looks a bit more like the "static strong typing" of ML/Haskell. Which is a rather weaker analogy...
In any case, the distinctions between ReiserFS-as-DB and MySQLFS are fairly strong. MySQLFS looks a lot, by the way, like the NameSpace concept in Casbah.
If you're not part of the solution, you're part of the precipitate.
Ok, ease of use I totally buy. It would be cool to open a file and have it stored in the DB.
BUT: The dangers seem really really high. What about foreign keys? If I'm just blasting out a byte stream who is checking the constraints and triggers? If I do something illegal, there's no way to let my program know?
try{
db.println("foo,bar,bally");
}
catch(DBFSInvalidColumnException){}
??? How do you do that in C or shell script?
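To make the point concrete: through the plain file API, the only thing a failed write hands back is an error code (`errno` in C, a non-zero exit status in shell). The Python sketch below shows that channel on an ordinary filesystem. The function name `try_write` is made up for the example, and the idea that a DB-backed filesystem would map a constraint violation onto some code like `EINVAL` is a guess, not something the MySQL FS page states.

```python
import errno
import os


def try_write(path: str, data: bytes) -> int:
    """Return 0 on success, otherwise the errno reported by the OS.

    This single integer is all a caller of the file API gets back;
    there is no room for "column 3 violates a foreign key".
    """
    try:
        with open(path, "wb") as fh:
            fh.write(data)
        return 0
    except OSError as exc:
        return exc.errno
```

A C program would check the return value of `write()` and read `errno`; a shell script would test `$?`. Either way the caller learns that something failed, not which constraint objected.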
As for people worried about performance, don't get so excited. Performance is vital when tools are going to be built on top of other tools, but if you're building software specifically for people who are tools (i.e. marketdroids that are unwilling to manipulate a DB directly) then they can pay a cost in performance in exchange for the added simplicity of interface.
I believe that is one of the goals of ReiserFS as well -- that database vendors use file systems to store data instead of having to use raw disk partitions, or deal with file system overhead plus database overhead...
Matt Barnson
Matthew P. Barnson
I learn what I think when I read what I write
I am a relatively ignorant user of linux, I only know what I need to know to do the job at hand. But the way I see it this would seem to be a good way to centralise and standardise configurations. By allowing programs to access their configurations via sql calls and the user to change them through the fs then it allows for easier setups. Not to mention writing third party configuration utilities would be a lot easier it would seem. I may be way off the mark here but I see it as a boon to the administrator if the programmers choose to take advantage of it.
Lets see...
goober:$ cd /mnt/sqldb
goober:$ ls
USER_ID FIRST_NAME LAST_NAME TIMESTAMP
goober:$ mkdir AGE
goober:$ echo "Oh crap, there goes my schema!"
"Oh crap, there goes my schema!"
goober:$ cd USER_ID
goober:$ ls
11023 11025 11044 11055 11092
goober:$ rm 11023
goober:$ echo "Wow! I hope that wasnt relational!"
"Wow! I hope that wasnt relational!"
goober:$ exit
Seriously, what type of integrity checking will be enforced in this filesystem?
I am betting that you either have robust integrity, which would give a completely counterintuitive file system, or lax integrity which would open the doors for all sorts of mischievous errors and data corruption.
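For comparison, here is what the "robust integrity" side looks like when the database is allowed to say no. This is a small sketch using Python's standard `sqlite3` module as a stand-in, since it needs no server; the `users`/`orders` tables and the `delete_user` helper are invented for the example and have nothing to do with the MySQL FS code.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a tiny schema where orders reference users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, first_name TEXT)")
    conn.execute(
        "CREATE TABLE orders ("
        " order_id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL REFERENCES users(user_id))"
    )
    conn.execute("INSERT INTO users VALUES (11023, 'Ann')")
    conn.execute("INSERT INTO users VALUES (11025, 'Bob')")
    conn.execute("INSERT INTO orders VALUES (1, 11023)")
    conn.commit()
    return conn


def delete_user(conn: sqlite3.Connection, user_id: int) -> bool:
    """The SQL equivalent of 'rm 11023'; returns False if the DB refuses."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute("DELETE FROM users WHERE user_id = ?", (user_id,))
        return True
    except sqlite3.IntegrityError:
        return False
```

Deleting user 11025 goes through; deleting 11023 is refused because an order still points at it. A filesystem view would have to turn that refusal into a failed `rm`, which is exactly the counterintuitive behaviour being bet on above.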
The real strength of something like this would be in a corporate environment, where having a dbfs would simplify file management a great deal.
Umm... Berkeley DB and your favorite C compiler? ;)
This database filesystem might have a real advantage in terms of keeping the database records in a consistent state. Remember that in Unix files have arbitrary data. So a journal filesystem tends to keep the meta-data in a consistent state. They don't do much about the application data written into the files. However if mySQLfs had knowledge of the records being written, presumably it could do a lot of the cool stuff done in main frame OS's to ensure integrity. I don't know the VFS interfaces though, so I'm not sure if this is implementable under the current linux framework.
SQL is a standard language used to get information from ANY kind of database that supports it.
It allows you to get information from any database using a standard language. (Although many vendors like to tag on "Special" commands that make life easy if you're using THEIR database product...)
Don't be confused by Microsoft's "SQL Server" product, which is basically Sybase's database with MS additions. Although even it can use SQL.
Now if only Oracle would release an ODBC driver for linux......
This is what happens when you discover that your co-worker has been posting crap as cyb0rq_m0nk3y, and then they feel that it would be funny to post their inane rant on my computer while in the restroom.
Makes us (the tewwetruggur contingency) look by far dumber than normal.
again, my apologies.
Hi! This is the Sig, blatantly attached to the end of this comment.
i just got used to doing it by hand, back on Be 4.
"One man's "magic" is another man's engineering."-- Robert A. Heinlein
I've never been thrilled with the performance of storing LOBs in any kind of DB -- Oracle, PostgreSQL, or MySQL. The plain-old filesystem tends to do it better and faster. I usually store the path to an object in the DB instead.
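A minimal Python sketch of the "store the path, not the LOB" arrangement, with the standard `sqlite3` module standing in for whichever database you use. The `objects` table and the helper names are assumptions made for the example.

```python
import os
import sqlite3


def open_catalog(db_path: str = ":memory:") -> sqlite3.Connection:
    """Create the catalog table that maps object names to file paths."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS objects ("
        " name TEXT PRIMARY KEY, path TEXT NOT NULL, size INTEGER NOT NULL)"
    )
    return conn


def store_object(conn: sqlite3.Connection, root: str, name: str, data: bytes) -> str:
    """Write the bytes to the plain filesystem and record only the path."""
    path = os.path.join(root, name)
    with open(path, "wb") as fh:
        fh.write(data)
    with conn:  # commit the catalog row, or roll back on error
        conn.execute(
            "INSERT INTO objects (name, path, size) VALUES (?, ?, ?)",
            (name, path, len(data)),
        )
    return path


def load_object(conn: sqlite3.Connection, name: str) -> bytes:
    """Look the path up in the DB, then read the file directly."""
    row = conn.execute("SELECT path FROM objects WHERE name = ?", (name,)).fetchone()
    if row is None:
        raise KeyError(name)
    with open(row[0], "rb") as fh:
        return fh.read()
```

The trade-off is that the file and its catalog row are no longer updated atomically: a crash between the write and the insert leaves an orphaned file behind.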
That being said, I have used the LOB storage in Postgres to implement a versioning system for in-house work (and it worked well enough to prove to me that it's do-able, but not well enough to actually use). The concept is sound, but the implementation needs some work.
However, using a DB's LOB is a helluva lot better than using CVS for binary objects. CVS seems afflicted with unseemly memory bloat when checking in/out large binary objects...
Potato chips are a by-yourself food.
Would this MySQL-based file system be more like BeOS's file system, where files can have arbitrary attributes at the FS level, and you can query based on them?
Thanks
Bruce
Bruce Perens.
I was really happy to hear that; it's a very nice idea. But now I've read some comments here which made me think that maybe some programs will start to save their configuration on this file system, just like windoze's damned registry.
I really hope it won't happen; text configuration files are da best imo.
For "real" db purposes it can be just great.
MySQL is basically an sql-interface to a flat file. A filesystem on top of a flat file? Gimme a break...
No offense to MySQL, but is it ready for such a task? Last I heard, MySQL didn't have record-level-locking except in some experimental forks. Are there any features lacking from MySQL that might make another database more appropriate (ignoring for the moment the license of them).
See IFS for an overview of Oracle's iFS.
or look at MS Vaporware Presentation to learn about ms plans in this area (powerpoint presentation unfortunately..)
The idea definitely has merit, and as others have pointed out, the possibilities are quite intriguing..
"Please do not feed the trolls."
Hell, I will anyway. WTF are you talking about??? GTK *flickers*? Since when? I used to have GTK apps on an old 40MHz 486 (and it was a DLC machine at that...a Cyrix 486 that plugged into a 386 mobo) and I didn't see said flickers *unless the app was poorly written, was doing animation, and didn't double-buffer.*
Bah, Troll Tech wrote somewhere? The PR department wrote an official release that said something along the lines of "we did not emulate the slow and flickery refresh of GTK(or was it gnome?)" Bullshit. Show me the link. Why would Troll Tech have a position on GNOME, anyway? They don't compete with GNOME in any way. They write a toolkit. I've seen some poorly-written QT programs that display flicker like all hell, and I've seen some GTK apps with decent animation. I've also seen well-written QT apps that display no flickering, and bad GTK apps that do. It depends on the app, I suppose.
Stating on Slashdot that I like cheese since 1997.
[*] really the class has other data structures besides the actual file data: e.g. file name, a field for comments about the file, etc., which may vary from class to class
There are also a variety of classes which serve as containers. The most obvious are what traditionally are directories or desktops. Another container class is "query", which has typical database search methods associated. These can be saved, copied, etc.
Imagine this: your command line should not be associated with a particular directory location, but rather a particular query. On the command line you most frequently use "cq" ("change query"), "rq" ("restrict query"), and "eq" ("expand query"). So to view the penguin image I know lurks somewhere on my drive, the sequence would be something like
% cq type=image
5037 files selected
% rq *penguin*
2 files selected
% ls
penguin_57.jpg
penguin.gif
% ./penguin.gif
No default action for type "gif"; performing default action for type "image": opening penguin.gif with gimp...
(And, of course, there are obvious database sorts of features that any sensible graphical file explorer should have...)
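The cq/rq/eq idea is easy to mock up in user space, which supports the point that much of this is an interface question. Below is a toy Python sketch: files are plain dicts with a `name` and a `type` attribute, and the class name `QueryShell` and its exact semantics (in particular what `eq` matches against) are guesses made for the example, not a description of any existing shell.

```python
from fnmatch import fnmatchcase


class QueryShell:
    """Toy 'current query' in place of a 'current directory'."""

    def __init__(self, files):
        self.files = list(files)      # every file known to the system
        self.selected = list(files)   # result of the current query

    def cq(self, **attrs) -> int:
        """Change query: start over, keeping files whose attributes match."""
        self.selected = [
            f for f in self.files
            if all(f.get(key) == value for key, value in attrs.items())
        ]
        return len(self.selected)

    def rq(self, pattern: str) -> int:
        """Restrict query: narrow the current selection by name pattern."""
        self.selected = [f for f in self.selected if fnmatchcase(f["name"], pattern)]
        return len(self.selected)

    def eq(self, pattern: str) -> int:
        """Expand query: add files from the whole system that match by name."""
        names = {f["name"] for f in self.selected}
        self.selected += [
            f for f in self.files
            if fnmatchcase(f["name"], pattern) and f["name"] not in names
        ]
        return len(self.selected)

    def ls(self):
        """List the names in the current selection."""
        return sorted(f["name"] for f in self.selected)
```

With four files (two penguin images, two text files), `cq(type="image")` followed by `rq("*penguin*")` leaves the two images, and `eq("*.txt")` pulls the text files back in.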
To summarize:
(1) YES!!! Regardless of how exactly the system implements it, the filesystem should be interfaced as a database.
(2) Furthermore, don't view files just as RECORDS -- view them as active OBJECTS that are instances within a hierarchical class structure.
Finally, I think a lot of this can be done just with user interface, without having it explicitly in the filesystem. In fact, things have definitely been moving this direction, at least for graphical file explorers. Has anyone added this sort of thing to a command shell?
All those complications that MySQL eschews are the sorts of things that would muss up the idea of viewing "database as FS hierarchy."
And as for the "locking" and "transactional" issues, the point is not terribly different. Filesystems generally don't provide ACID properties; neither does MySQL; that fits together well.
Mind you, it's quite possible that there's a much bigger controversy concerning stability; based on the MySQLFS web page, it appears that they're passing a CORBA IOR into the kernel. What can that possibly mean other than that they're assuming the presence of the "kORBit" implementation in the kernel? The flaming that surrounded "Why don't we try putting an ORB in the Linux kernel?" was much more vigorous than any flaming about MySQL lacking some ACID features! :-).
If you're not part of the solution, you're part of the precipitate.
I always thought a DB's indexing technique is the whole power of it. Mounting a DB as a fs just doesn't seem useful (as far as speed is concerned) to me.
--
42 cows on a 42km road on their way to 42.org
This posting is a bit incorrect. On the web site, if you examine further this software will allow you to mount the database and get performance information (like /proc). Not use a database as a filesystem.
This is probably useful if you were writing software to analyze database performance or to get statistics.
First off, if you want something with serious DB features but without using SQL, you'd do well to just write a wrapper which adds/looks up entries in an SQL database but can be accessed without SQL. I don't know of anything like this existing right now simply because people who want serious database features (or who are writing a serious database) use SQL.
Well, almost.
You can also use ReiserFS -- particularly in a little while, after it implements lightweight files (thus reducing the amount of overhead for each record). Yup, ReiserFS has low-level support for relational storage, and lots of Other Cool Stuff. I understand that Squid has accelerated support for it; I've also seen a system for indexing newsgroup articles that uses Reiserfs as its backend. Roughly put, this is possible because of reiserfs's blazing speed when working with small files; it also has a plugin API (in-progress?) and Assorted Other Good Stuff.
The BeOS has had a database-like, journaling filesystem since its very first release. It's the best of both a database and a file system. I don't know what I would do without it. It makes sorting my thousands upon thousands of mp3s a snap. Add a CD to the collection, fill in the attributes for genre, album, year of release, and so on, and I have a fully searchable collection.
"One man's "magic" is another man's engineering."-- Robert A. Heinlein
I assume that since this is using a corba interface one could just as easily mount a postgresql database or anything else that is exportable via corba.
Somehow this all reminds me of how Plan9 does things, where *everything* is a file, that includes TCP connections etc, etc...
Now what would be more useful is using corba as a replacement for NFS and SUN RPC. Now that might be of interest...
Well, but maybe you should.
Sure - a DB accessed as a filesystem doesn't present the full power of the DB through the filesystem API. And sure, a DB filesystem doesn't necessarily have the same performance characteristics as a standard filesystem.
But there are some very significant applications where a DB presented as a filesystem makes brilliant sense. Here's two simple ones off the top of my head.
Configuration management. Systems like CVS go to great trouble to get transactional behavior, so that you can't lose code if the program crashes in the middle of an update. If you're using a DBFS, you've got transactionality and rollback for free.
Micro-applications. There are a lot of simple applications which really need transactionality/rollback facilities, but which can't (either for portability or for size reasons) make use of a complete transactional database facility. Write it to access files, and let the database take care of transactions.
I don't have anything to do with this project, but I think it's a great idea, because I'm doing almost the same thing with DB2. (Why DB2? Because I work for IBM Research..) I'm building an SCM system, and I don't want the higher layers of my system to need to understand the database or the particular table layout that I'm using. So they access it as a filesystem; down below, it's a rock-solid database.
Of course, all of the above assumes transactionality - which is not yet fully supported by MySQL. So I'd be a little paranoid before using this, to make certain that they're using the transactional tables!
-Mark
Many people like to store binaries in the mysql databases, such as images. This would really help improve their ability to code this.
As PHP is used in conjunction with MySQL a lot, the functions like move_uploaded_file could be used to store blobs in the database rather than an insert into a blob field making your code much easier to read, but, of course, making the server setup a lot more complicated.
Without row level locking, however, you will face bottlenecks if you try to do anything besides a mostly read-only file system.
On the way home yesterday I commented to my friend that we should use PHP and MySQL, and maybe a Tk interface for drag and drop, to develop an encrypted filesystem, for storage only, so that it was password protected and such. You could put files into it and store them in BLOBs. You get the point. Anyway, it is really funny to look here today and see something very similar. They're coming to take me away, haha!
IRNI
SexCow Airlines
Let me go OT for just a second here: does anybody know of any open-source systems out there that can do large-scale data storage *without* SQL? I am thinking of a simple C/C++ API that you can use to retrieve and write data from/to tables/fields, nothing much fancier than that. So far, my best bet seems to be ColdStore. Any other pointers?
This would be great for text based files and spreadsheets. The possibilities for searching and updating your files would be greatly enhanced by having them maintained by a database.
I don't think a database would be appropriate for graphics or music files (other than storing pointers to those files), but certainly any text based file would be ideal.
Given my thoughts on how a database enabled filesystem would work, I don't think very many joins or triggers would be necessary. Most things could be handled by single tables.
Besides, there is the matter that MySQL doesn't support foreign keys or triggers anyway, and last I checked those features weren't on the to-do list. :)
No, Thursday's out. How about never - is never good for you?
Not just that, but for a robust DB, the commit and rollback operations are atomic. It either happens or it doesn't. No half-way measures. So if your disk crashes during a commit operation, you are guaranteed that either the operation went through or not. No mangling of the data.
Is where MySQL will use the raw hard drive/partition instead of one with a file system on it
I mean, it's an interesting idea, but if you need an FS, use an FS, and if you need a DB, use a DB. Mixing and matching, so you're accessing your DB with code that expects an FS, is a recipe for disaster. Especially when you're using a database like MySQL without commit or rollback (which database experts wouldn't even call a database, but there you go).
Still, reminds me somewhat of the DOOM FS I wrote many years ago (for another OS entirely) which accessed the WAD file using standard OS calls. That was cool ;)
Eeeew, it wasn't really THAT exciting...
Can someone compare this to the MySQL filesystem, or perhaps point me to a place where pgfs can still be downloaded?
I've been writing about something like this for about 9 months and I'd like to give a big thumbs up to the MySQL team! Go to http://atoms.htmlplanet.com to find out why using a database as a filesystem could be the next hype in the IT world!!
While this doesn't really seem very useful to me (SQL is after all good at what it does..), it seems silly to make it for just one database. It's easier to use common APIs (ODBC?), or at least something custom-made but generic, and try to keep the SQL generic too (nothing fancy is needed for this sort of thing anyway) from the start. It's so much harder to change after the fact. (Not that they said anything about this, but I assume that means it's as MySQL-specific as it can be..)
This is fantastic. I suppose it makes the previous article, on mSQL, even less important (not to take anything away from the minisql people).
The real highlights from the article are these:
-Currently any new product which needs access to some data over the network supports some protocols and possibly some way to access filesystems over the net. Data in a MySQL table can be accessed on such systems even when MySQL is not ported to the named platform.
-Backup and version control - ordinary filesystems can be backed up using any backup software. This data can be compared using diff and revision controlled with cvs.
-Much shorter programming. People sometimes use a database for holding very simple data like the current date or site name. This is a single-record, single-column table which changes rarely.
Most people who have had to use MySQL from time to time can see instantly where one or all of these features come into play. I'm excited.
well, yeah.
s/cat/echo/ in my comment.
no, what i was saying was that you would "create a file" on the filesystem called "textquery". the SQL filesystem would then replace that with a directory of files representing records. or something. there are lots of different ways to do the details...
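One of the "lots of different ways" could look roughly like the sketch below. It only computes the mapping from a result set to a directory of per-record files; it does not mount anything. SQLite stands in for the database, and the function name query_to_directory, the table blahtable, and the "column=value" file layout are all assumptions made up for illustration.

```python
import sqlite3

# Toy table to query; SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blahtable (id INTEGER PRIMARY KEY, blah TEXT)")
conn.executemany(
    "INSERT INTO blahtable (blah) VALUES (?)", [("first",), ("second",)]
)


def query_to_directory(conn, name, sql):
    """Map a result set to {relative path: file content}, one file per record."""
    cur = conn.execute(sql)
    columns = [d[0] for d in cur.description]
    files = {}
    for n, row in enumerate(cur.fetchall(), start=1):
        # Each record becomes a small text file of "column=value" lines.
        lines = [f"{col}={val}" for col, val in zip(columns, row)]
        files[f"{name}/{n}"] = "\n".join(lines) + "\n"
    return files


# "Creating the file" textquery with this SQL would yield this directory.
listing = query_to_directory(
    conn, "textquery", "SELECT id, blah FROM blahtable ORDER BY id"
)
```

A real filesystem driver would hand these entries back from its readdir and read calls; the hard part, as noted elsewhere in the thread, is the write side.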
Perhaps it's on the list of free available databases?
Jacco
---
# cd /var/log
-------
Warning: Slashdot may contain traces of nuts.
Ehm?
Well.. it should always be possible to do it using the same SQL command... (in some cases some manual fiddling with the data may be required though)
INSERT INTO sometable (blobfield) VALUES(<sumfin>)
Where <sumfin> is somewhat database dependent, but usually the interfaces take care of whatever difference there may be...
So... if your point is to be valid you should be against accessing ANY kind of database field using one of these interfaces... not?
('Cus accessing BLOB fields is really no more different from interface to interface than accessing any other kind of field!)
--
Ner lbh sebz gur HFN? Gura lbh'ir whfg ivbyngrq gur QZPN!
I have stored and retrieved data from BLOB fields... so I know that you can just "INSERT INTO sometable (blobfield) VALUES(<sumfin>)" to insert data into a BLOB... (<sumfin> can be somewhat different depending on database and interface, but this is always the (imho easiest) way to handle blob data...)
With this in mind it should be very easy to write a script that'll retrieve images from a database using just a SELECT... (the table would be like "id (counter / primary key), imagedata (blob), mimetype (varchar(x))" and the script could be accessed as sumfin like getimage.pl?id=56, just making sure the correct mimetype is sent and then pumping out the image data...)
I still don't see what is SO hard about BLOBs... they're really just as easy to handle as any other kind of field...
(please take a look at my other, fairly similar, comment at: /comments.pl?sid=01/01/16/1855253&cid=278 too)
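For what it's worth, here is a rough sketch of that table and lookup. It uses Python's sqlite3 module rather than Perl and MySQL so that it runs on its own; the images table follows the layout described above, while store_image and get_image are hypothetical helper names. Parameter placeholders take the place of <sumfin>, so the driver handles the binary data.

```python
import sqlite3

# SQLite stands in for the real database; the layout follows the comment:
# id (counter / primary key), imagedata (blob), mimetype (varchar).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images ("
    "id INTEGER PRIMARY KEY, imagedata BLOB, mimetype VARCHAR(64))"
)


def store_image(conn, data, mimetype):
    """Insert the raw bytes like any other field and return the new id."""
    cur = conn.execute(
        "INSERT INTO images (imagedata, mimetype) VALUES (?, ?)",
        (data, mimetype),
    )
    conn.commit()
    return cur.lastrowid


def get_image(conn, image_id):
    """Return (headers, body) for one image, or None if the id is unknown."""
    row = conn.execute(
        "SELECT mimetype, imagedata FROM images WHERE id = ?", (image_id,)
    ).fetchone()
    if row is None:
        return None
    mimetype, data = row
    return {"Content-Type": mimetype}, bytes(data)


fake_gif = b"GIF89a" + bytes(16)  # placeholder bytes, not a real picture
image_id = store_image(conn, fake_gif, "image/gif")
headers, body = get_image(conn, image_id)
```

A script like getimage.pl?id=56 would do the same lookup, send the Content-Type header, and then write the body to the client.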
--
Ner lbh sebz gur HFN? Gura lbh'ir whfg ivbyngrq gur QZPN!
I'm probably being trolled here, but I must ask you, did you even read the article? They're not talking about replacing the file system with a database, or eliminating SQL access to the database. The idea actually is kind of interesting to me (it remains to be seen exactly how practical it is in real life). One of the cool things is that it would provide access to data to applications that only know the file system.
Honestly, I don't really know what you're complaining about.
-- It only takes 20 minutes for a liberal to become a conservative thanks to our new outpatient surgical procedure!
If a filesystem is a database is a filesystem and getting BLOBs out of a database is so hard, why don't you just store the pathname in the database and the image in the filesystem? I'm having a hard time envisioning storing images in a database as being "often necessary". But what do I know? :-)
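The pathname approach is easy to sketch. The version below assumes SQLite through Python's sqlite3 module and a temporary directory as the image store; add_image and read_image are made-up names for illustration only.

```python
import os
import sqlite3
import tempfile

# The database keeps only the pathname and MIME type.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT, mimetype TEXT)"
)

# The image bytes live in an ordinary directory on the filesystem.
image_dir = tempfile.mkdtemp()


def add_image(conn, filename, data, mimetype):
    """Write the bytes to disk, then record where they went."""
    path = os.path.join(image_dir, filename)
    with open(path, "wb") as f:
        f.write(data)
    cur = conn.execute(
        "INSERT INTO images (path, mimetype) VALUES (?, ?)", (path, mimetype)
    )
    conn.commit()
    return cur.lastrowid


def read_image(conn, image_id):
    """Look up the pathname in the database and read the file it names."""
    row = conn.execute(
        "SELECT path, mimetype FROM images WHERE id = ?", (image_id,)
    ).fetchone()
    if row is None:
        return None
    path, mimetype = row
    with open(path, "rb") as f:
        return mimetype, f.read()


pic_id = add_image(conn, "logo.gif", b"GIF89a" + bytes(8), "image/gif")
mimetype, data = read_image(conn, pic_id)
```

The catch is that the file write and the INSERT are two separate steps, so a crash between them can leave a file with no row, which is exactly what the transaction crowd in this thread is worried about.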
I think it's wrong to treat this direction of information storage as having anything to do with a file system. I prefer to call it an object system, as that paves the way for management of things other than files. I'm working on implementing something along these lines myself called a Meta Object Manager (MOM).
It takes a different view than you're saying (although it's very much in line with what you're thinking) by simply, for file objects, assigning them methods and attributes which can be used to organize them. It does this at the object level, not with categories or classes or any other hierarchical restrictions. To reference an object, you specify the attributes necessary to bring it into focus (an object system "change focus" as opposed to a file system "change directory").
The command line implementation (working on the GUI now) has come pretty far, and you can access the same object with /etc/wtanksle/ppp.conf and /wtanksle/etc/ppp.conf or even just ppp.conf, if that focus resolves to one object. In the future, getting a list of all configuration files on your system could be as simple as "cf conf; ls". Once the abstraction of a file system to an object system is made, though, I don't see any going back.
You can give just about anything a filesystem interface, it's just a matter of how good the implementation is and how useful it is.
There have already been even FTP and HTTP filesystems for several operating systems if memory serves, and I know there have been a couple other odd ones for BeOS. I nearly did a database FS for Be a few years ago myself.
Speaking of which, this would be much easier to implement (filesystems are simple to write) and more useful IMHO (because there are already standard APIs to query filesystems and support any number of attributes for files at the OS-Filesystem level) to do for BeOS.
I'm sure it can be done for Linux too, but I have doubts as to the usefulness of it under any OS, much less one where you don't have the luxury of being able to utilize existing attribute support.
It might give some shortcuts for reading, but writing will likely be very complicated. I don't see a good way to do anything along the lines of joins either. The idea of using "." files/directories will help provide some of that I suppose. Permissions will also be a problem, though I guess you could just go by login name.
A good reason to have filesystem interfaces to complex resources (like FTP, HTTP, databases, etc) is that it is easy to access things on a filesystem from within just about every programming language on every platform. However, by forcing the normal interfaces to those resources down into what can be done to a filesystem, some things also become very complicated. To do those more complicated things will either mean complicated interfaces or programs that give the filesystem information through some other means (ioctl?) or perhaps writing commands to a file within that filesystem ("cat 'SELECT blah FROM blahtable' > /mnt/mysql/queries/testquery" and then looking in /mnt/mysql/queries/testquery/ for the result set, for example).
In short, I'm sure it'll be very fun to implement and be an interesting toy which may even have some uses...
Imagine a dynamic web site that uses this! You could simply copy files (especially graphics files) to/from a table easily and look them up via SQL queries! My goodness, the usefulness is extreme, people.
Have any of you (fs!=db) nay-sayers ever tried to store/retrieve GIFs and JPEGs in a relational database for a web site -- an often daunting, but often necessary task? There are whole articles on how to store/retrieve pics as BLOBs via MySQL/PHP on PHPBuilder.com: http://www.phpbuilder.com/columns/florian19991014.php3 and (sorta) http://www.phpbuilder.com/columns/bealers20000904.php3
So, for those of you who can't get over this idea, try doing sites that store images in databases sometime. An idea like this (one being done by the big RDBMs -- and I work for one of those) is a BOON for websites. It also has many other applications.
A layer of abstraction is often a good thing for filesystems, and it's where things are headed. IMHO, I think db's could provide BETTER security and make things more distributed, rather than current filesystems. Imagine whole new networked filesystems that are distributed databases. Open your mind. Think about it hard before brushing it aside.
Besides, a db is an fs is a db. It depends on how you look at it, your definition, etc. Is a filesystem relational? Does a db use local storage, often RAW storage? The true computer definition of the two is not all that different. And, SQL is not the only query language out there. Haven't you heard of CLI, which uses commands like cat, ls, echo, rm, mv to handle data? What about those relationships called directories?
I say, what's the real issue? Raw speed? Oh, wah! Grow up and join the enterprise! Oops, I guess the AS/400 must not be a viable platform; they've been doing this HOW LONG?!?!?
Q: When are we Linux/Open Source people going to get enterprise-level file and storage management?
A: When we get to the point that we implement at least a JFS (if not a full-fledged logged filesystem), good logical volume management, real uninterruptible power, truly fault-tolerant hardware/software clustering, better security, and fully distributed storage management that backs up and versions data automagically.
On a lighter note, MySQL now implements a filesystem. :-)
Oracle has a similar project going. Having a DB as a filesystem is just way cool. The speaker from Oracle (at LinuxExpo 2000 Toronto) said that writing files to a dbfs(?) is slower, but retrieval is much quicker.
The real strength of something like this would be in a corporate environment, where having a dbfs would simplify file management a great deal.
my 2c
DOS is dead, and no one cares...
If there's a Bourne Shell, I'll see you there