While performance is something that should always be kept in mind, we are a long way away from the days of the original Macintosh where a desk accessory had to weigh in at 600 bytes in order to make the cut and fit into both memory and on a floppy disk. As current desktop machines outperform the high end servers of a few years ago, it would be nice to put a lot of that muscle to use in improving the user experience. I'm not excusing bloated and slow code here, but we don't really need to be counting bytes.
In any case, database-based operating systems have been around for decades, from OS/400 to the BeOS. Many BeOS users claimed it was hands down faster than any other shipping OS at the time, and it featured a journaling, database-styled file system. One of the primary developers of that file system is now working at Apple on Mac OS X 10.4's Spotlight functionality.
The thing is - as our desktop storage continues to grow at the pace that it does, and as we curiously find ways to fill it up, new ways of looking at and finding the information we store are going to be needed.
DBFS, Gnome Storage, Apple's Spotlight, and WinFS all take different routes to get there. It's worth looking at all of them, at what they offer and where they differ. WinFS is a new storage layer that combines file system resources with more structured data in a relational/XML hybrid system, with the aim (from what I gather) of turning the file system into a global "soup" of data. That sort of soup can be seen in office suites or PDA-style applications, and in older operating systems like the Newton OS, where everything is a shared resource that is stored and made available through common structures. Spotlight, on the other hand, combines file system searches and indexes (think 'locate') with full content indexes and a metadata index, which uses 'importers' to parse out other file formats. Spotlight is not a new file system, but an indexing system that acts on files in the file system. From what I remember of Gnome Storage, it is similar, using the VFS layer and Postgres triggers and callbacks, along with plug-ins, to parse and extract relevant metadata and contents out of files. DBFS looks to be like WinFS in that it purely wants to be a new kind of information store. I don't know which style will win out. My theory is that technologies like Spotlight will eventually evolve into a new kind of storage system, while remaining familiar and file-based for today's users and developers. But this is an idea whose time has more than come. It's something that's been promised for the desktop for at least a decade, and has been shown to work, albeit in targeted OSes (the Newton) or ones that never achieved mass market penetration (BeOS).
So I think performance isn't that big of a concern, so long as (like all development) there are good people working on the solution.
I've tried looking at Ruby. Sadly, the latest version doesn't compile easily on Mac OS X and I haven't taken the time to figure out why. But from my brief looks at the Programming Ruby book and at some of the minimal documentation available in English, these are my thoughts:
Cons
Lack of English Documentation -- Ever since I got into Python (Py 1.3), the tutorials and library/module references have been great. They're pretty well organized and I can usually find an answer to a question pretty quickly. With Ruby, I had a harder time navigating the web site and finding comprehensive, up-to-date information in English. Note: as I was writing this, I found some better documentation, but getting to it was still non-obvious, and much of it was still pretty "light" on fuller descriptions of some cool looking features like Singleton Classes.
Some Python 'issues' are 10 times worse for Ruby -- Some of the issues raised in this thread - lack of CPAN, etc... - are even bigger for Ruby. I don't know how long Ruby's been around, but it feels like a much younger language. They claim, however, that it's bigger in Japan than Python is. The number of built-in and available modules for Ruby is still substantially smaller than for Python.
Too perlish in some areas -- Whether this is a pro or con depends on where you stand on string literals. :) Ruby has a fair number of string modifiers to affect quoting of strings and regular expressions. Python has some string modifiers (i.e., "raw" strings like foo = r'\bbank\b'), but things like regular expressions are made as objects. As for functionality of regexes - Python, Perl, and Ruby all offer basically the same features, but in Python you might have to be a bit more wordy (not in the regex itself, but in how you use match results, do substitution, etc). Personally, I prefer Python's way because I work on large systems and frameworks. The Perl and Ruby way works nicely for massive text processing/shell scripting, but I've had a hard time maintaining such code. Another example of Ruby as a shell scripting language similar to Perl is the use of backticks to spawn a sub-process, or its own %x/STRING/ expression (i.e., currdate = `date`). While this is a powerful feature, I think it makes the language in general more awkward for building large systems. In Python, this sort of functionality is handled by modules like os, a generally platform-agnostic module around native system calls (on Unix, this is usually the posix module).
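To make the "regexes as objects" and "shell access through modules" points concrete, here is a minimal sketch in present-day Python. The function names (count_bank, redact_bank, first_bank_span, current_date) are made up for illustration, and current_date assumes a Unix-like system with a date command on the path.

```python
import os
import re

# A raw string keeps the backslashes intact for the regex engine.
# The compiled pattern is an ordinary object with its own methods.
BANK = re.compile(r'\bbank\b')


def count_bank(text):
    """Count occurrences of the whole word 'bank' in text."""
    return len(BANK.findall(text))


def redact_bank(text):
    """Substitution is a method call on the pattern object."""
    return BANK.sub('[redacted]', text)


def first_bank_span(text):
    """Match results are objects too; this is the 'wordy' part."""
    match = BANK.search(text)
    if match is None:
        return None
    return match.span()


def current_date():
    """Rough counterpart of Ruby's currdate = `date`, via the os module.

    Assumes a Unix-like system where a 'date' command exists.
    """
    with os.popen('date') as pipe:
        return pipe.read().strip()
```

It is wordier than a Perl or Ruby one-liner, but every step is a named call on an object, which is the property I find easier to maintain in a large code base.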
Pros
Closer to OO Purity -- Ruby is really closer to Smalltalk than it is to Python, without the stranger Smalltalk syntax and VM needs. Ruby's object system boasts the following features over Python's:
No type/class dichotomy -- this will soon be done away with in the Python core, but currently there are still differences in Python between the object-based Types (strings, lists, ints, etc.) and Classes. You still can't subclass from String in Python, for example. (However, in Python 2.0, Strings finally got their own methods). Python does put considerable effort into letting you write a class that smells like a string, list, dictionary, etc. Note: Zope has always used something called Extension Classes, allowing C-based extension classes to be subclassable. This has allowed Zope to add some strong features to the object model at a core level, like Persistence and Acquisition. So Ruby, like Smalltalk and (most of) Java, has a unified object tree. Python doesn't (presently).
Better encapsulation/access control -- Python doesn't do strong enforcement of private class members, although Python 1.4 came out with a pretty decent solution to let you make pseudo-private class members.
super! -- As I said above, I work with large systems and frameworks in Python. In the current generation of the system, use of multiple inheritance has gone unchecked and ultimately has run wild. This makes doing things like letting your object handle an event tricky, because you should only really be handling what your subclass is responsible for, and then passing the buck along. In Python, this is usually done by calling a method on a parent class directly and passing self in as the argument. Well, this is all well and good if you know what class to call. Sometimes in large frameworks where mixing in using multiple inheritance has run wild, you don't know which class really does the behavior you need to do. Ruby gives you super. Python doesn't.
And speaking of mix-ins... -- I gather that Ruby is a single inheritance language. Now, normally I love multiple inheritance but lately it's been really biting me in the ass. Usually, one wants to really subclass from one major object, and then mix in functionality. It looks like Ruby allows this by somehow mixing in module-level functionality into a Class or instance.
Accessors -- This may fall into syntactic sugar territory... But maybe not. In Python, writing an accessor means writing methods like setAge(age) and age() for a class, and then using them accordingly like bob.setAge(25) and bob.age(). But in Ruby, you can define a method like age=(a), and another one called simply age (done via a def age statement that returns the instance variable age). This doesn't look like anything fancy. In Python (and I assume Ruby), it's just as easy to arbitrarily set attributes on an object and read them back without accessors. BUT! In larger systems/frameworks where things like persistent modules may change, or rules are associated with getting/setting an attribute (i.e., you might want to raise an exception if someone tries to set age greater than 150), you have to use methods to get/set values. This is no big deal in Python, but I like how Ruby makes it look more natural. (In Python, I could have also overridden __getattr__() and __setattr__() to provide similar behavior - the freedom exists in Python to do what I want here, but it's not the default.)
Yield, Retry, and Blocks -- There are non-standard versions of Python (i.e., stackless) that have some of these features. But Python doesn't have blocks. Blocks are cool, and a very powerful feature in Smalltalk. They're basically anonymous functions. Python has lambda:, but it's limited to what can be held in a Python expression - you can't use full statements. Python 2.x, however, makes some nice inroads on this bullet point with list comprehensions (2.0) and the new scoping rules in 2.1 (finally!).
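Two of the Python idioms mentioned in the list above can be sketched briefly: passing the buck by calling a parent class's method directly with self, and enforcing a rule on an attribute either through setAge()/age() style accessors or by overriding __setattr__(). All class names here (Widget, Button, Person, CheckedPerson) are hypothetical, and the sketch is written so that it runs on current Python.

```python
class Widget:
    """Hypothetical base class in an event-handling framework."""

    def handle_event(self, event):
        return ['Widget saw ' + event]


class Button(Widget):
    def handle_event(self, event):
        # Handle only what this subclass is responsible for...
        handled = ['Button saw ' + event]
        # ...then pass the buck by naming the parent class explicitly
        # and passing self in as the first argument.
        return handled + Widget.handle_event(self, event)


class Person:
    """Accessor methods in the setAge()/age() style."""

    def __init__(self):
        self._age = 0

    def age(self):
        return self._age

    def setAge(self, age):
        # The rule lives in the setter, so callers must go through it.
        if age > 150:
            raise ValueError('age must not be greater than 150')
        self._age = age


class CheckedPerson:
    """Same rule, enforced on plain assignment via __setattr__."""

    def __setattr__(self, name, value):
        if name == 'age' and value > 150:
            raise ValueError('age must not be greater than 150')
        # Write to the instance dictionary directly to avoid recursion.
        self.__dict__[name] = value


bob = Person()
bob.setAge(25)

alice = CheckedPerson()
alice.age = 30  # looks like a bare attribute, but the check still runs
```

The explicit Widget.handle_event(self, event) call only works because Button knows exactly which parent to name, which is the part that gets hard once multiple inheritance has run wild.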
So, ultimately, my take on Ruby is that I really like its object model. Basically, everything that it borrows from Smalltalk has been done cleanly and elegantly in a syntax that's less foreign than Smalltalk's. On the other hand, it's also really trying to be a bizarre marriage of Perl and Smalltalk. And I don't know how well that works with large systems, or for embedding - there are too many shell-scripting level games. I think Python does the right thing by giving access to shell-scripting level features through modules, allowing an embedded Python to more selectively enable and disable features by choosing which modules to expose. If you think of the term "scripting language" to mean something like shell scripts or awk/perl, Ruby is a very nice language with a lot of so-called "elegant" features built in. At the core, it's a very simple language. But if you think of "scripting language" to mean "glue language", I think Python is way ahead here. While Python is definitely usable for shell scripting, it made the right choice in making that functionality available in modules instead of being a core language feature. As a result, Python's being embedded in UML tools (Object Domain), 3D animation environments (Caligari Truespace), flight simulators (Fly! II uses Python to script "scenarios", allowing people to write their own custom ones), image processing (used as a replacement for a more cryptic command language used to process astronomy pictures), etc. Python's also used heavily in many 3D/effects labs like Lucasarts, and whoever did the effects for Alien Resurrection. I don't know how easy it would be to allow Ruby to fit in many of those environments. Python has also proven itself many times as a valuable rapid development language - the main features of Google were explored first in Python before being migrated to C for speed. I imagine Ruby could offer similar benefits here though.
For small things (i.e., shell scripts), I think things are pretty balanced between the two languages. Ruby has a lot of features familiar to Perl users in what can be done with strings, shell calls, etc. Python on the other hand comes with a pretty large set of well-documented modules and objects that can be used "out of the box".
For medium-sized programs, I think Python is the better language - you don't have to worry about encapsulation and other OO features that much as you move into using modules and classes, and you get something very usable very quickly. It's really easy to have a program start its life out as a script in Python and move it up to a full package of modules and classes by applying some simple refactoring rules. In Ruby, I have a harder time seeing the lines of where to do this.
For very large systems that can still be achieved in languages like this (you'd be surprised at the size of some of the private Python programs in use), Python starts to break down. A lot of this is due to the lack of formalized interfaces and contracts (there's a PEP on this, and some of these features might show up in the next release of Python), the class/type dichotomy, and the lack of enforcement on encapsulation. In some cases, it's really cool that by default you can stick any arbitrary attribute on an object in Python. But eventually, this catches up to you. Also, multiple inheritance trees that grow unchecked can be quite a pain to deal with. Ruby's object model looks like it would be a better fit here. But Python is improving in this area. Things like unit tests (pyunit now ships with the Python core), interfaces (if that change proposal gets accepted and integrated), and hopefully some sort of extension-class-like behavior are all there now or are coming. And they should help with the integrity issues that can plague very large Python projects.
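As a small illustration of how "stick any arbitrary attribute on an object" can catch up with you, here is a sketch using a hypothetical Account class. Nothing in the class declares which attributes are legal, so a misspelled assignment is accepted without complaint.

```python
class Account:
    """Hypothetical class; it does not restrict its attributes."""

    def __init__(self, balance):
        self.balance = balance


acct = Account(100)

# Typo: this does not update balance. It quietly creates a second,
# unrelated attribute on the instance, and no error is raised.
acct.balanse = 250

# The instance now carries both names side by side.
attribute_names = sorted(vars(acct))
```

In a small script a slip like this is found quickly; in a very large system it can sit unnoticed until some distant piece of code reads the stale balance.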