Why IBM Open Sourced Cloudscape
An anonymous reader writes "A common and a consistent framework for accessing information enables developers to do more things with more people more often. This article shares how Derby fits into IBM's developer strategy, the Java application stack, its intention to drive more innovation around Java on Linux, and why they want to make the Derby database become as ubiquitous as the Apache HTTP server." (Derby is the new name for the project based on the formerly commercial Cloudscape database.)
With more and more small form factor devices that run Java (Sharp Zaurus PDA, HomePod media player, various set top boxes) and even Java processors (aJile, for example), a lightweight database presents some nice application opportunities.
I've played with Cloudscape before and it's not as speedy as MySQL or as rugged as Oracle, but it does get the job done. And having a relational database right in the set top box or PDA means independence from a more heavy duty machine on the LAN, WiFi, etc.
Open source is just icing on the cake.
IBM cannot just open-source OS/2. There are technologies and copyrights in OS/2 that belong to third parties (such as Microsoft).
OS/2 is still available and developed as eComStation (http://www.ecomstation.com/). I have to say that I think it is very expensive, but on the other hand it is far from dead.
Derby seems to be more of a traditional database, in comparison.
Microsoft largely wrote OS/2 1.x according to IBM's specifications. IBM was responsible for writing OS/2 2.x. This includes the entire Workplace Shell, which started life as a shell for 1.x, actually: I saw it demo'ed in 1998 running on 1.x. This was an early demo: even window resize did not work! :)
While IBM was working on 2.x (mainly WPS stuff), Microsoft was tasked with writing the next version of OS/2: 3.0. It was about this time that Windows 3.0 became such a success. Microsoft then took their OS/2 3.0 code and decided to make Windows NT.
That is why an *amazing* number of Win32 (as in NT, *not* 95) calls are merely renamed OS/2 calls. In fact, IBM ported Lotus SmartSuite to OS/2 by creating a Win32 (again, NT, not 95) to OS/2 translation layer that allowed them to port like 85% of the SmartSuite code without rewriting.
Windows 95 was not even *thought* of at this time. We're talking 1991 timeframe. Windows 95 was never supposed to exist: NT (NT 3.1, that is) was supposed to be the 32-bit OS that the world moved to. But it was too big, too bloated, too unusable.
However, there is (well, was) a *ton* of code that started life as OS/2 3.0 in Windows NT. That's why during the divorce, IBM was given the rights to a source license of Windows 3.1. Which is also why shortly afterward Microsoft released Windows 3.11! :) In IBM's "Blue Spine" version of OS/2 (the one that included Windows 3.1), IBM's copy of Windows 3.1 ran 10% faster than Microsoft's. Why? They recompiled Windows with the Watcom C compiler instead of MSVC! :)
However, it's all kind of moot, anyway. The Win32 API is now quite a bit different (Windows 95's 'Win32' API was quite a bit different from NT 3.5's, and Windows 2000 and XP have moved in new directions, too), and OS/2 isn't going anywhere.
We use both HSQL and Derby and our experience was that while HSQL was great for small databases, it started to become impractical for medium-to-large databases. Just doing a SELECT Count(*) FROM Foo (which should be instant) can take 30 seconds or more on a large table. Also, if you do a lot of updating (incrementing statistics records, for instance) the table size can get out of hand quickly since each update effectively adds a new record to the table file (until you compact it).
Here are some preliminary notes one of our engineers compiled while investigating adding Derby to our project. They were just preliminary notes so I make no guarantees as to accuracy but they might be helpful...
CHAR/VARCHAR/LONG VARCHAR
Derby strictly enforces the size specification in CHAR and VARCHAR fields. CHAR fields are space extended; non-space data that does not fit in the field raises an exception on insert or update. LONG VARCHAR data cannot be ordered, grouped, or indexed. (Really!) I believe that SQL Server (and possibly MySQL) has these stupid limitations, too. It may go all the way back to the SQL-92 spec. HSQLDB, on the other hand, ignores all size specifications, treating CHAR/VARCHAR/LONG VARCHAR as synonyms for java.lang.String.
TOP/LIMIT
Derby does not support the TOP or LIMIT syntax. There appears to be a "FETCH FIRST n ROWS ONLY" syntax that was added to DB2 that never found its way into Cloudscape.
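For what it's worth, later Derby releases (10.5 and up, as far as I know) accept the SQL:2008 row-limiting clause, which covers the usual TOP/LIMIT cases. A minimal sketch, assuming a hypothetical table Foo with a column id; it would not work on the Cloudscape-era code these notes were written against:

```sql
-- Hypothetical table and column names. Needs a Derby release that
-- supports OFFSET / FETCH FIRST; the original Cloudscape code does not.
SELECT id
FROM Foo
ORDER BY id
FETCH FIRST 10 ROWS ONLY;

-- Skip the first 20 rows, then return the next 10.
SELECT id
FROM Foo
ORDER BY id
OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY;
```

On older releases the common workaround is to cap the result from the JDBC side with Statement.setMaxRows(n) instead of in the SQL text.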
Case sensitivity
Derby appears to treat all columns as case sensitive, and there appears to be no way to change this. HSQLDB, on the other hand, can be configured on a field-by-field basis. (SET IGNORECASE is used for the database default; and VARCHAR_IGNORECASE is used as the data declaration.)
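One way to live with case-sensitive columns is to fold both sides of the comparison to the same case in the query. A small sketch, assuming a hypothetical table Contacts with a VARCHAR column email (neither name comes from our schema):

```sql
-- Hypothetical table Contacts with a VARCHAR column email.
-- Upper-casing both sides makes the match case-insensitive.
SELECT email
FROM Contacts
WHERE UPPER(email) = UPPER('Someone@Example.com');
```

The trade-off is that a predicate on UPPER(email) generally cannot use an ordinary index on email, so this is probably best kept to small tables or to a separately stored, pre-folded column.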
IDENTITY fields
Derby uses the bizarre syntax GENERATED ALWAYS AS IDENTITY. This also does not imply that the field is a primary key. So, "IDENTITY" in HSQLDB becomes "GENERATED ALWAYS AS IDENTITY PRIMARY KEY". Derby allows specification of initial value and increment.
GENERATED ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 2)
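Put together in a full table definition, the identity declaration might look like the sketch below. The table and column names are made up for illustration, and the explicit NOT NULL is there only as a precaution, on the assumption that some Derby versions want primary key columns declared NOT NULL:

```sql
-- Hypothetical table. The identity clause and PRIMARY KEY are separate
-- declarations; one does not imply the other.
CREATE TABLE Messages (
    id      INTEGER NOT NULL
            GENERATED ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 1)
            PRIMARY KEY,
    subject VARCHAR(200)
);

-- With GENERATED ALWAYS the id column is left out of the INSERT;
-- Derby assigns the value itself.
INSERT INTO Messages (subject) VALUES ('first message');
```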
Performance
Derby is nearly instantaneous for COUNT(*) queries on tables with a large number of rows. HSQLDB appears to count the rows one by one, resulting in very poor performance. Derby appears to have a better architecture for large databases. Queries seem to run in time proportional to the size of the result set. Many simple HSQLDB queries run in time proportional to the size of the database.
CHECK constraints
Derby supports CHECK constraints, e.g.,
size INTEGER DEFAULT 0 NOT NULL CHECK (size >= 0)
disposition CHAR(1) DEFAULT '+' NOT NULL CHECK (disposition IN ('+', '-', 'B', 'M', 'Q'))
FOREIGN KEY constraints
Derby supports inline foreign key declarations with implied column matching, e.g.,
smtpID CHAR(17) NOT NULL REFERENCES InboxEvents ON DELETE CASCADE
HSQLDB requires table-level constraints with explicit column matching:
FOREIGN KEY (smtpID) REFERENCES InboxEvents (smtpID) ON DELETE CASCADE
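To show where these fragments sit in complete statements, here is a rough sketch. InboxEvents and smtpID are the names used above; putting size and disposition in that same table, and the child table InboxRecipients with its recipient column, are assumptions made only for the example:

```sql
-- Parent table. The inline REFERENCES in the child table resolves to
-- this primary key because no column list is given.
CREATE TABLE InboxEvents (
    smtpID      CHAR(17) NOT NULL PRIMARY KEY,
    size        INTEGER DEFAULT 0 NOT NULL CHECK (size >= 0),
    disposition CHAR(1) DEFAULT '+' NOT NULL
                CHECK (disposition IN ('+', '-', 'B', 'M', 'Q'))
);

-- Hypothetical child table using the inline (Derby-style) foreign key.
CREATE TABLE InboxRecipients (
    smtpID    CHAR(17) NOT NULL REFERENCES InboxEvents ON DELETE CASCADE,
    recipient VARCHAR(255) NOT NULL
);

-- The table-level form with explicit column matching, which both
-- engines should accept, would replace the inline REFERENCES with:
--   FOREIGN KEY (smtpID) REFERENCES InboxEvents (smtpID) ON DELETE CASCADE
```

Sticking to the table-level form is probably the simplest choice if the same DDL has to run on both HSQLDB and Derby.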
Cheers,
Brien Voorhees
Red Condor