Slashdot Mirror


Why IBM Open Sourced Cloudscape

An anonymous reader writes "A common and a consistent framework for accessing information enables developers to do more things with more people more often. This article shares how Derby fits into IBM's developer strategy, the Java application stack, its intention to drive more innovation around Java on Linux, and why they want to make the Derby database become as ubiquitous as the Apache HTTP server." (Derby is the new name for the project based on the formerly commercial Cloudscape database.)

21 of 108 comments

  1. Re:All this talk... by madman101 · · Score: 3, Informative

    Because too much of the underlying code is owned by Microsoft.

  2. Re:All this talk... by pix · · Score: 5, Informative

    IBM cannot just open-source OS/2. There are technologies and copyrights in OS/2 that belong to third-parties (such as Microsoft).

    OS/2 is still available and developed as eComStation (http://www.ecomstation.com/). I have to say that I think it is very expensive; on the other hand, it is far from dead.

  3. I Know One Bug That Needs Fixed ASAP... by Black-Man · · Score: 4, Informative

    Sometimes on re-start the db process just hangs and you can't connect.

    You have to blow away the dbcache directory to get it to start-up. It doesn't occur frequently, but it has happened more than once in an otherwise stable environment.

    1. Re:I Know One Bug That Needs Fixed ASAP... by Cobron · · Score: 2, Informative

      Perusing the docs of Derby it says somewhere that no garbage collection on the connection will be performed until all references to the connection are gone. You shouldn't close the connection once opened (it says), so perhaps the gc isn't run yet after a reboot? Perhaps try running the gc manually on startup? (I don't know if "perusing" is spelled or even used correctly, but I couldn't miss the opportunity to try that word out ;) )

  4. The real reason they did it by codepunk · · Score: 3, Informative

    Remember that little deal a while ago about IBM building some offline web technology that auto-syncs when you regain a connection?

    The technology we are talking about is called App Play, and guess what it uses for data synchronization?

    It does not matter if they open sourced it, since they were going to be putting it on tons of clients anyhow.

    --


    Got Code?
  5. Re:compatibility? by orasio · · Score: 2, Informative

    It requires an implementation of the open (by now) Java specification.
    Whether you use a free implementation or a proprietary one is your problem. There could be trouble finding a complete free Java implementation, but the GCJ team is working on it.

  6. Re:compatibility? by pix · · Score: 2, Informative

    Actually, being IBM code it was probably developed using IBM's JVM not Sun's.

  7. Re:Cloudscape, er Derby, is good stuff by spookymonster · · Score: 5, Informative

    SQLite isn't written in Java; it's C. The code may be platform-independent, but the binaries it produces aren't.

    A fairer comparison would be Hypersonic SQL, a free, open-source small (~100K) database server.

    --
    - Despite popular opinion, I am not perfect.
  8. Gump is starting to do nightly builds on OSS by steve_l · · Score: 3, Informative
    If you look at Gump, you will see that Apache is starting to do nightly builds of all the main OSS Java projects on the Kaffe/classpath/gcj toolchain.

    Cloudscape is a long way down the dependency graph, and you shouldn't expect it for a while. We need to get Ant to boot first, which is seemingly a compiler problem.

  9. Hmm by Bill,+Shooter+of+Bul · · Score: 3, Informative

    Due to this, much of OS/2 is in NT and much of NT is in OS/2, which is why OS/2 could run Windows 3.1 apps natively without any user intervention. OS/2 had a Win3.1 VM that worked so well Microsoft had to implement Win95/NT 4.0 style APIs to break the compatibility.

    No, none of NT was in OS2. Nor is any OS2 in NT. That was one of the reasons for creating NT. There is Win 3.1 in OS2 in the VM that you mentioned, but I hardly think that played much of a part in the decision to create a 32-bit Win API. After the success of Win 3.1, Microsoft realised that it could succeed without OS2 or IBM. So it made Win 3.1 32-bit and created Win 95 until NT was ready for mainstream use.

    --
    Well.. maybe. Or Maybe not. But Definitely not sort of.
    1. Re:Hmm by tmasssey · · Score: 5, Informative
      This is incorrect.

      Microsoft largely wrote OS/2 1.x according to IBM's specifications. IBM was responsible for writing OS/2 2.x. This includes the entire Workplace Shell, which started life as a shell for 1.x, actually: I saw it demo'ed in 1998 running on 1.x. This was an early demo: even window resize did not work! :)

      While IBM was working on 2.x (mainly WPS stuff), Microsoft was tasked with writing the next version of OS/2: 3.0. It was about this time that Windows 3.0 became such a success. Microsoft then took their OS/2 3.0 code and decided to make Windows NT.

      That is why an *amazing* number of Win32 (as in NT, *not* 95) calls are merely renamed OS/2 calls. In fact, IBM ported Lotus SmartSuite to OS/2 by creating a Win32 (again, NT, not 95) to OS/2 translation layer that allowed them to port like 85% of the SmartSuite code without rewriting.

      Windows 95 was not even *thought* of at this time. We're talking 1991 timeframe. Windows 95 was never supposed to exist: NT (NT 3.1, that is) was supposed to be the 32-bit OS that the world moved to. But it was too big, too bloated, too unusable.

      However, there is (well, was) a *ton* of code that started life as OS/2 3.0 in Windows NT. That's why during the divorce, IBM was given the rights to a source license of Windows 3.1. Which is also why shortly afterward Microsoft released Windows 3.11! :) In IBM's "Blue Spine" version of OS/2 (the one that included Windows 3.1), IBM's copy of Windows 3.1 ran 10% faster than Microsoft's. Why? They recompiled Windows with the Watcom C compiler instead of MSVC! :)

      However, it's all kind of moot, anyway. The Win32 API is now quite a bit different (Windows 95's 'Win32' API was quite a bit different from NT 3.5's, and Windows 2000 and XP have moved in new directions, too), and OS/2 isn't going anywhere.

  10. Re:vs HSQL? by Anonymous Coward · · Score: 2, Informative


    How does Cloudscape/Derby compare with the other open source Java database engine, HSQL?


    The big feature that Cloudscape has that I don't see on the HSQL page is XA support. Uninteresting unless you are working with a TM, but when you are XA can be the difference between "this could be made to work" and "this is a non-starter"

  11. Re:My reason for not using it by the+quick+brown+fox · · Score: 2, Informative
    You don't need to pay for commercial usage. It uses the Apache license.

    explanation in plain english

  12. Re:vs HSQL? by Anonymous Coward · · Score: 5, Informative
    I've found that Derby is much slower than hsqldb. However, that depends on which way you use hsqldb. hsqldb is able to run completely in memory, with the on-disk database being just a list of the SQL statements you issued. When I run hsqldb this way, it runs very fast, but that doesn't seem too scalable to me.


    Derby seems to be more of a traditional database, in comparison.
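The "list of SQL statements" persistence model described above can be pictured with a toy sketch. This is not HSQLDB code: the class `StatementLogStore`, its `PUT`/`DEL` log format, and the in-memory list standing in for the on-disk file are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: data lives in memory, and "durable" state is a replayable log.
final class StatementLogStore {
    private final Map<String, String> rows = new HashMap<>();
    private final List<String> log = new ArrayList<>(); // stands in for the on-disk script

    void put(String key, String value) {
        log.add("PUT " + key + " " + value); // every change appends one log line
        rows.put(key, value);
    }

    void delete(String key) {
        log.add("DEL " + key);
        rows.remove(key);
    }

    String get(String key) {
        return rows.get(key);
    }

    int rowCount() {
        return rows.size();
    }

    List<String> log() {
        return new ArrayList<>(log);
    }

    // Startup: rebuild the in-memory state by re-running every logged statement.
    static StatementLogStore replay(List<String> savedLog) {
        StatementLogStore store = new StatementLogStore();
        for (String line : savedLog) {
            String[] parts = line.split(" ", 3); // keys must not contain spaces
            if (parts[0].equals("PUT")) {
                store.put(parts[1], parts[2]);
            } else {
                store.delete(parts[1]);
            }
        }
        return store;
    }

    public static void main(String[] args) {
        StatementLogStore store = new StatementLogStore();
        store.put("hits", "1");
        store.put("hits", "2");
        store.put("hits", "3");
        StatementLogStore restored = replay(store.log());
        System.out.println(restored.get("hits") + " from " + store.log().size() + " log lines");
    }
}
```

Reads never touch the log, which is why this style is fast, but the log grows with every update and startup has to replay all of it, which is one way to read the scalability worry.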

  13. Re:It is DB2 -- in a way ... by Eric+Giguere · · Score: 2, Informative

    True, the SQL syntax for Cloudscape 10.0 is apparently a subset of the DB2 syntax. So there's definitely a migration path there, which is good for DB2. So the focus is still on DB2. This is one way to move customers over to DB2 if Derby doesn't meet their requirements. Similar to what Microsoft has done with MSDE (now SQL Server Express), except that the IBM way is arguably much friendlier since they've open sourced the codebase instead of just allowing free redistribution of Windows-only binaries.

    Eric

  14. Re:All this talk... by tmasssey · · Score: 4, Informative
    None of this has been true since 1998 at the *latest*, and some of these haven't been true since 1992!

    OS/2 2.0 was a fully 32-bit, reentrant, fully preemptive multitasking kernel in 1992. Linux still has issues with a preemptive kernel! The graphics interface went 32-bit in OS/2 2.1. It is a single-user system, so there is (or was, anyway) little focus on multi-user style security, at least for local users (the HPFS, and especially HPFS386 filesystems were excellent for multi-user security, including full support for extended attributes).

    As for the single program locking up the entire system, that was a design decision in the Presentation Manager (the GUI API and program). It had a single input queue: all window messages went through a single queue. This has performance and usability advantages, especially when one window must modify or handle the messages for another.

    However, yes, a single program that did not respond to messages could lock the GUI. The computer would run, but the GUI would be locked until you killed it.

    That was changed in Warp 4.0. There were a number of user selectable ways that this could be addressed, depending on how much you might need the features of SIQ.

    I'm not saying that OS/2 is perfect, or even valuable in the year 2004, but give me a break. You're talking about issues that were addressed between 6 and *12* years ago!

    And the Workplace Shell features a level of object orientedness I have never experienced anyplace else, one that worked *extremely* well. The GUI was not pretty, but it was extremely robust, with a collection of very powerful features.

  15. Derby seemed like a step back... by Muad'Dave · · Score: 3, Informative

    ...compared to hsqldb for my purposes. Hsql supports persisting Java objects directly into an Object-type column [preparedStatement.setObject(obj)]. Derby requires that you persist your object manually and stuff it into a (statically-sized) BLOB by manipulating streams - ick!

    Also, hsql allowed ps.setObject(1, null) as a shortcut to ps.setNull(1, Types.). This was really handy.

    It _looks_ like derby 10 claims JDBC 2.0 support; shouldn't it have the OBJECT data type?
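For anyone wondering what "persist your object manually" amounts to, here is a minimal sketch using plain Java serialization. The class name `BlobCodec` and its two helpers are hypothetical; the JDBC calls are only mentioned in comments, and nothing here has been run against Derby.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical helper: turns a Serializable object into bytes for a BLOB column and back.
final class BlobCodec {
    static byte[] toBytes(Serializable value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    static Object fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<String> original = new ArrayList<>(Arrays.asList("a", "b"));
        byte[] payload = toBytes(original);
        // Writing, assuming ps is a PreparedStatement with a BLOB parameter:
        //   ps.setBinaryStream(1, new ByteArrayInputStream(payload), payload.length);
        // Reading, assuming rs is a ResultSet positioned on the row:
        //   Object restored = fromBytes(rs.getBytes(1));
        Object restored = fromBytes(payload);
        System.out.println(original.equals(restored));
    }
}
```

The cost compared with a single setObject call is that the application owns the encoding, so class changes between write and read become its problem.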

    --
    Tiller's Rule: Never use a word in written form that you've only heard and never read. You will end up looking foolish.
  16. Re:This makes sense... by supersnail · · Score: 2, Informative

    IBM picked this up when they grabbed Informix.

    It is used extensively within IBM Java-based projects. (WSAD, the WebSphere IDE, comes with Cloudscape and works with Cloudscape by default.)

    But it's quite difficult to sell, for two reasons.

    One, IBM's database brand is DB2, which these days scales down to small hardware.

    Two, Cloudscape's biggest plus is that it is implemented as a single jar file, but how do you collect license fees when anyone can copy and use your jar file?

    --
    Old COBOL programmers never die. They just code in C.
  17. "Fair" comparison by vlad_petric · · Score: 3, Informative
    The problem with HSQL is that it lives completely in memory; consequently it just can't possibly meet the Durability requirement from ACID (IOW, once you have committed a transaction, it is not guaranteed to be on permanent storage). Disk is one of the biggest bottlenecks with databases, so I would expect HSQL to actually be significantly faster with update transactions.

    So, no, the comparison isn't fair at all.

    --

    The Raven

    1. Re:"Fair" comparison by Anonymous Coward · · Score: 1, Informative

      You are mistaken: HSQL (http://hsql.sourceforge.net) is defunct; the DB being discussed here is HSQLDB (http://hsqldb.sourceforge.net/), which can indeed persist data to disk (as others have mentioned).

  18. Re:vs HSQL? by brienv · · Score: 5, Informative

    We use both HSQL and Derby and our experience was that while HSQL was great for small databases, it started to become impractical for medium-to-large databases. Just doing a SELECT Count(*) FROM Foo (which should be instant) can take 30 seconds or more on a large table. Also, if you do a lot of updating (incrementing statistics records, for instance) the table size can get out of hand quickly since each update effectively adds a new record to the table file (until you compact it).

    Here are some preliminary notes one of our engineers compiled while investigating adding Derby to our project. They were just preliminary notes so I make no guarantees as to accuracy but they might be helpful... :

    CHAR/VARCHAR/LONG VARCHAR
    Derby strictly enforces the size specification in CHAR and VARCHAR fields. CHAR fields are space extended; non-space data that does not fit in the field raises an exception on insert or update. LONG VARCHAR data cannot be ordered, grouped, or indexed. (Really!) I believe that SQL Server (and possibly MySQL) has these stupid limitations, too. It may go all the way back to the SQL-92 spec. HSQLDB, on the other hand, ignores all size specifications, treating CHAR/VARCHAR/LONG VARCHAR as synonyms for java.lang.String.

    TOP/LIMIT
    Derby does not support the TOP or LIMIT syntax. There appears to be a "FETCH FIRST n ROWS ONLY" syntax that was added to DB2 that never found its way into Cloudscape.

    Case sensitivity
    Derby appears to treat all columns as case sensitive, and there appears to be no way to change this. HSQLDB, on the other hand, can be configured on a field-by-field basis. (SET IGNORECASE is used for the database default; and VARCHAR_IGNORECASE is used as the data declaration.)

    IDENTITY fields
    Derby uses the bizarre syntax GENERATED ALWAYS AS IDENTITY. This also does not imply that the field is a primary key. So, "IDENTITY" in HSQLDB becomes "GENERATED ALWAYS AS IDENTITY PRIMARY KEY". Derby allows specification of initial value and increment.

    GENERATED ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 2)
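To show where the identity clause sits in a full column definition, here is a sketch of a Derby-style CREATE TABLE wrapped in a small JDBC helper. The table `Events`, its columns, and the class `IdentityColumnExample` are hypothetical, and the statement has not been run against Derby here.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

final class IdentityColumnExample {
    // Hypothetical table: the identity clause follows the data type,
    // and PRIMARY KEY has to be stated separately.
    static final String CREATE_EVENTS =
        "CREATE TABLE Events ("
        + "id INTEGER NOT NULL GENERATED ALWAYS AS IDENTITY "
        + "(START WITH 1, INCREMENT BY 2) PRIMARY KEY, "
        + "label VARCHAR(64) NOT NULL)";

    // Runs the DDL on any open JDBC connection supplied by the caller.
    static void createTable(Connection connection) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.executeUpdate(CREATE_EVENTS);
        }
    }

    public static void main(String[] args) {
        System.out.println(CREATE_EVENTS);
    }
}
```

With GENERATED ALWAYS, inserts leave the column out, for example INSERT INTO Events (label) VALUES ('first').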

    Performance
    Derby is nearly instantaneous for COUNT(*) queries on databases with a large number of rows. HSQLDB appears to count the rows, resulting in very poor performance. Derby appears to have a better architecture for large databases. Queries seem to run in time proportional to the size of the result set. Many simple HSQLDB queries run in time proportional to the size of the database.

    CHECK constraints
    Derby supports CHECK constraints, e.g.,
    size INTEGER DEFAULT 0 NOT NULL CHECK (size >= 0)
    disposition CHAR(1) DEFAULT '+' NOT NULL CHECK (disposition IN ('+', '-', 'B', 'M', 'Q'))

    FOREIGN KEY constraints
    Derby supports inline foreign key declarations with implied column matching, e.g.,
    smtpID CHAR(17) NOT NULL REFERENCES InboxEvents ON DELETE CASCADE
    HSQLDB requires table-level constraints with explicit column matching:
    FOREIGN KEY (smtpID) REFERENCES InboxEvents (smtpID) ON DELETE CASCADE
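The CHECK and FOREIGN KEY fragments above can be put into complete statements to compare the two styles side by side. This is a sketch only: the child table `Deliveries` and the class `ForeignKeyDdlExample` are hypothetical, the column layout is guessed from the fragments, and neither variant has been run against Derby or HSQLDB here.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

final class ForeignKeyDdlExample {
    // Parent table; the referenced column must be a primary key for implied matching.
    static final String CREATE_PARENT =
        "CREATE TABLE InboxEvents ("
        + "smtpID CHAR(17) NOT NULL PRIMARY KEY, "
        + "size INTEGER DEFAULT 0 NOT NULL CHECK (size >= 0))";

    // Derby style: inline REFERENCES, target column implied by the parent's primary key.
    static final String CREATE_CHILD_DERBY =
        "CREATE TABLE Deliveries ("
        + "smtpID CHAR(17) NOT NULL REFERENCES InboxEvents ON DELETE CASCADE, "
        + "disposition CHAR(1) DEFAULT '+' NOT NULL "
        + "CHECK (disposition IN ('+', '-', 'B', 'M', 'Q')))";

    // HSQLDB style per the notes: table-level constraint naming both columns.
    static final String CREATE_CHILD_HSQLDB =
        "CREATE TABLE Deliveries ("
        + "smtpID CHAR(17) NOT NULL, "
        + "FOREIGN KEY (smtpID) REFERENCES InboxEvents (smtpID) ON DELETE CASCADE)";

    // Executes the given statements in order on a caller-supplied connection.
    static void apply(Connection connection, String... statements) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            for (String sql : statements) {
                statement.executeUpdate(sql);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(CREATE_PARENT);
        System.out.println(CREATE_CHILD_DERBY);
        System.out.println(CREATE_CHILD_HSQLDB);
    }
}
```

A caller would pass the parent statement first, for example apply(connection, CREATE_PARENT, CREATE_CHILD_DERBY), since the child references it.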

    Cheers,
    Brien Voorhees
    Red Condor
    Corporate anti-spam gateway service for less than $2/user/month