IBM Donates Java Database App. to Apache Foundation
the_pooh_experience writes "IBM has announced that it will open up Cloudscape by giving it to the Apache Software Foundation. Cloudscape, a small-footprint Java database, is primarily used for small-scale websites and point-of-sale systems. Its new, open source name will be 'Derby.' Cloudscape (originally created by Informix, and purchased by IBM in 2001) has been valued by IBM at $85M."
Geez, is the NYT dumb. Putting something in the "public domain" means you relinquish control. It's owned equally by everyone. Choosing an "open source" license means you keep control. If you're careful about how you do it, you can even change the license terms a bit later. MySQL is constantly tweaking their terms because they're the sole copyright owner. Sure, it's available under the GPL, but they can tweak the terms for preferred customers. And they do! That's still their prerogative because the code is NOT in the public domain.
"...has been valued by IBM at $85M."
Now, it's free, so it's worthless.
In any case it's cool they donated it. Being a database developer myself, I'm extremely wary of the "you don't need a DBA" claim, but regardless of the hype it looks like an interesting product that will fit in well with the Apache lineup.
This Like That - fun with words!
Leave it to the NYT to misinform people. The article says that IBM put the code "in the public domain". The license by which the Apache Foundation will distribute this is certainly NOT public domain. It later says "Apache will hold the licensing and intellectual property rights to the Cloudscape code."
I wish people would stop mixing these things with public domain. Apache's license, GPL, etc., are forms of copyright, and are NOT public domain.
Cloudscape is hardly dead - it shows up prominently in WebSphere Application Developer as the default embedded DB for EJB data. It feels a lot like MS Access - simple, quick, and dirty.
The Cloudscape homepage: Cloudscape
And more details with links to PDF documents: Features and Benefits
I would guess that MySQL would be faster for simple stuff, but Cloudscape could give it a run for its money with support for more complex SQL.
Wouldn't know how it compares against PostgreSQL...
//TheToon
Not really sure how it compares to MySQL or Postgres, but I loaded a 50+ million row table with a non-indexed timestamp field into Cloudscape and MSSQL. Both took about 3 seconds to answer a query returning a unique row (i.e. a row updated on a specific date and time) on this field, on a 2 GHz Intel machine with 1 GB RAM.
Firebird SQL was about the same. Next I'm going to try HSQL.
I would be interested in anybody else's experiments.
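For anyone who wants to repeat this kind of test on a small scale, here is a minimal sketch of why a lookup on a non-indexed timestamp column is expensive. It uses Python's built-in sqlite3 module purely as a stand-in (it is not Cloudscape, MSSQL, or Firebird), and the table `events`, the column `updated_at`, the index name, and the generated data are all made up for illustration. It asks the engine for its query plan before and after adding an index.

```python
import sqlite3

def plan_for_lookup(conn, ts):
    """Return the query-plan text SQLite reports for a lookup on updated_at."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM events WHERE updated_at = ?", (ts,)
    ).fetchall()
    # The human-readable plan text is the last column of each row.
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")  # in-process database, no server involved
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, updated_at TEXT, payload TEXT)"
)
conn.executemany(
    "INSERT INTO events (updated_at, payload) VALUES (?, ?)",
    # Hypothetical data: one row per second of a single day.
    (
        (f"2000-01-01 {s // 3600:02d}:{s % 3600 // 60:02d}:{s % 60:02d}", "x")
        for s in range(86400)
    ),
)
conn.commit()

target = "2000-01-01 12:34:56"

# Without an index the engine has to examine every row.
plan_without_index = plan_for_lookup(conn, target)

# With an index on the timestamp it can jump straight to the matching row.
conn.execute("CREATE INDEX idx_events_updated_at ON events (updated_at)")
plan_with_index = plan_for_lookup(conn, target)

matches = conn.execute(
    "SELECT COUNT(*) FROM events WHERE updated_at = ?", (target,)
).fetchone()[0]

print(plan_without_index)
print(plan_with_index)
print(matches)
```

The exact plan wording varies between SQLite versions, but the first plan should report a scan of the table and the second a search using the index. Timings from a toy like this say nothing about how the real engines compare on 50 million rows.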
On the other hand, it's still a (relative) memory hog.
While I haven't used Cloudscape in a very long time, I imagine this is more competition to other Java open source databases like HSQL, Axion, or McKoi.
Most of these databases are used by "embedding" them into an application (something not uncommon in Java programming), not as a standalone database server like Oracle or Postgres. Of course, like I said, it's been a long time since I looked at Cloudscape so it could have changed to be more of a standalone server.
I'm also surprised I haven't heard more about this in Apache, but I imagine it will first go through the Apache Incubator to sort out any legal issues and then end up somewhere in the Apache Database project. If anyone has more info, I'm interested to know.
Who said Freedom was Fair?
I think that Slashdot runs MySQL. So saying that MySQL cannot run big webapps is a bit of an underestimation :-)
Anyway, personally, I see it more as a competitor to hsqldb, which is also an embedded Java DBMS. Or SQLite, although the latter is written in C. It has the potential to become popular as a DBMS embedded in applications, but I don't think it is usable as a real stand-alone DBMS, such as MySQL.
Great!!!
:)
Maybe Slashdot can use it to stop the 503 errors
Of anything out there I think Cloudscape is most similar to Berkeley DB for Java (an in-process DB). The comment about it being a stepping-stone to DB2 could be made about any JDBC-compliant DBMS...IBM just happens to favor theirs ;-)
Good thing it's Derby and not Firebird.
Anyway - why bother renaming and what is "Open Source Name"?
MySQL is definitely ready for heavy loads
You're right. After all, it performs so well for Slashdot...
Cloudscape was originally created by Cloudscape, Inc. (I contracted for them at one time), which was later acquired by Informix.
At the time, it was a fairly complete and well-performing database with some nifty multi-database synchronization features, so even though I'm not involved in Java programming anymore this could turn out to be quite an interesting addition to Joe A. Opensourcecoder's toolkit.
Why is it that the proponents of "one nation under God" are so eager to get rid of "liberty and justice for all"?
You obviously haven't used Java within the past 3 years or so... Its speed on every platform I've developed on is no different than native speed, and in many scenarios it's faster because of the many optimizations that Java makes to your code and also from years of optimizing their own algorithms. This could be debated, well... forever, but these days if you need something faster than Java then you should probably be using assembly. Speed and overall performance have only gotten better with the new 1.5 VM as well (it's now known as 5.0, and it's still in beta but very usable).
Regards,
Steve
I've been using Cloudscape 4.0 in a web environment for a couple of years now, with no database failures of any sort. Cloudscape has a good selection of utilities (bulk loader, CLI and GUI, etc.). It's picky about ANSI SQL, and it supports most of the SQL that I'm interested in, like nested queries, stored procedures, etc. I'm using it as an embedded database (just presenting data, not writing anything while in production), so I can't speak to the speed in an OLTP environment, but for my purposes, I'm absolutely delighted with it.
I guess I don't see any problem with the fact that they benefit from this decision, nor am I surprised. The most powerful argument for open source is not a political one, but a business one: cheaper, more secure, fewer bugs. Since IBM has moved its business into service, open-sourcing their tools makes good sense. And they only get "free geek labor" if the tool is actually useful to people, in which case it's a win-win.
Funny how the word Apache in the article is linked to the stock ticker for APA. (Or maybe not so funny.) For the record: the Apache Software Foundation is a registered not-for-profit 501(c)(3) corporation incorporated in Delaware, and as such it does not have stock but rather can hand out membership to make one a stakeholder.
It's hardly a dead technology. Cloudscape is to be used as the local data store in the next generation of IBM's messaging products (e.g. Workplace, which is built on Eclipse RCP - see www.lotus.com).
Lotus say that the Notes client will 'converge' with the Workplace client in the version 8 release timeframe, so that'll put an Eclipse runtime and Cloudscape DB on most every corporate Notes desktop in the next 2-3 years.
What you're seeing is IBM seeding the developer marketplace with technology (Eclipse, Cloudscape) in order to reap dividends in the form of an established base of technologists familiar with the underpinnings of their commercial products.
You getting the picture?
Add to this some context.
* Most web applications are not written in C++/C
* More and more client-applications are being written in Java/.Net due to maintainability
* There is an impedance mismatch between OO systems and RDBMS systems
** Bridging this gap often involves very non-performant abstractions:
*** wrapping bean-objects around rows
*** storing intermediate copies of beans for caching
*** making copies of beans for transactional purposes.
*** redundantly applying data-constraint rules
Essentially re-inventing the RDBMS wheel.
Thus, if you're already going to write the application in Java, then there is a tremendous advantage to avoiding the performance bottlenecks of the impedance mismatch.
Think of what a C/C++ database does in the best case. It compiles a SQL script, loads internal relationships to columns/rows, accesses the in-memory indexes, and then formats/serializes the in-memory rows for output.
Java has to deserialize the text-stream, instantiate numerous objects; possibly unicodifying the data. Then whatever abstraction layers may be applied to the raw object-array result-set have to be applied.
If the data was locally available, then it could be stored in such a way that, for read-only access, it might be possible to avoid copying, and merely have it return a wrapper for the raw data. Zero latency, and practically zero extra work. While this is effectively the same as cached data, here we only store the data once on disk and once in memory.
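To make the copy-versus-wrapper distinction concrete, here is a minimal sketch (in Python for brevity, not Java). Everything in it is hypothetical: `RAW_ROWS` stands in for rows an in-process engine already holds in memory, and `PersonBean` and `RowView` are illustrative names, not part of hsqldb, Cloudscape, or any real API.

```python
from dataclasses import dataclass

# Hypothetical stand-in for rows an in-process engine already holds in memory.
COLUMNS = ("id", "name", "age")
RAW_ROWS = [(1, "alice", 30), (2, "bob", 25)]

@dataclass
class PersonBean:
    """Copying approach: every row is materialized as a separate object."""
    id: int
    name: str
    age: int

def load_beans(rows):
    # One new object per row; the bean no longer tracks the stored row.
    return [PersonBean(*row) for row in rows]

class RowView:
    """Wrapper approach: holds a reference to the storage plus a row index."""
    __slots__ = ("_rows", "_index")

    def __init__(self, rows, index):
        self._rows = rows
        self._index = index

    def __getattr__(self, name):
        # Only called for names that are not slots, i.e. column names.
        try:
            column = COLUMNS.index(name)
        except ValueError:
            raise AttributeError(name) from None
        return self._rows[self._index][column]

beans = load_beans(RAW_ROWS)
view = RowView(RAW_ROWS, 1)

# Simulate the engine updating the stored row.
RAW_ROWS[1] = (2, "robert", 25)

print(beans[1].name)  # the copy still holds the old value
print(view.name)      # the view reads the current stored value
```

The view costs one small object regardless of row width, while the bean approach allocates per row and then needs its own cache-invalidation story. A real engine would also have to deal with locking and transactions, which this sketch ignores.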
"hsqldb" is an example that pretty much does the above. You still get a SQL interface if you desire though. Only catch is that hsqldb isn't as feature-rich as many RDBMS systems yet. I'm sure the IBM java-database is merely a feature-rich sister of hsqldb.
And don't forget that many Java APIs still use raw C code to do intensive or tight-data-structure work.
-Michael
Another interesting, open source Java database is McKoi SQL Database, a GPL-licensed Java database with all kinds of nifty features.
Things are getting interesting for JBoss developers: JBoss ships with HSQL, supports McKoi nicely, and now we get Cloudscape thrown into the mix. Sweet.
MORTAR COMBAT!
We are a successful enterprise software ASP and used Cloudscape during our first 1.5 years before switching to a more robust database. Cloudscape powered our applications for 40+ customers globally. While it performed well given its small footprint, a huge problem for us was the fact that it does not release unused space -- the only way to do this is to run COMPRESS. With growing transaction volume we began running into random database ballooning problems where the size of Cloudscape grew from 100MB up to 30 or 40GB depending on how long it was running. The fix required a COMPRESS command which takes anywhere from 1 to 48 hours depending on the size of the database and the amount of physical memory available.
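The general pattern (deleted rows leave free pages in the file until an explicit compaction pass rewrites it) is not unique to Cloudscape. Here is a minimal sketch of it using Python's built-in sqlite3 module as a stand-in; SQLite's VACUUM is only an analogue of the COMPRESS command described above, not the same thing, and the table name and row sizes are made up for illustration.

```python
import os
import sqlite3
import tempfile

def page_stats(conn):
    """Return (total pages in the file, pages on the free list)."""
    pages = conn.execute("PRAGMA page_count").fetchone()[0]
    free = conn.execute("PRAGMA freelist_count").fetchone()[0]
    return pages, free

with tempfile.TemporaryDirectory() as tmp:
    conn = sqlite3.connect(os.path.join(tmp, "demo.db"))
    # Make the "no automatic reclaim" behavior explicit for this sketch.
    conn.execute("PRAGMA auto_vacuum = NONE")
    conn.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany(
        "INSERT INTO log (body) VALUES (?)",
        (("x" * 500,) for _ in range(5000)),
    )
    conn.commit()
    after_insert = page_stats(conn)

    # Delete most of the data; the emptied pages go onto the free list.
    conn.execute("DELETE FROM log WHERE id <= 4500")
    conn.commit()
    after_delete = page_stats(conn)

    # Compaction rewrites the file without the free pages.
    conn.execute("VACUUM")
    after_vacuum = page_stats(conn)
    conn.close()

print("after insert:", after_insert)
print("after delete:", after_delete)
print("after vacuum:", after_vacuum)
```

After the delete, the free-page count should jump while the file stays about the same size; only the compaction step shrinks it. On a multi-gigabyte database that rewrite is the part that can take hours.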
Switching to Oracle and SQL Server eliminated this problem entirely. In addition, performance has been increased literally ten-fold just from the switch (no changes were made to our code or schema when this performance increase was measured).
While I would recommend Cloudscape for smaller, non-critical applications, it is not ready for real-world enterprise apps. I look forward to the improvements the open source community might bring -- from a cost perspective I'd certainly like to see us switch back someday.
Regards,
Matt
The main thing I've felt has been holding StarOffice/OpenOffice back is the need for a database as easy as MS Access.
I know it's a different language, but work with me for a second.
Yes, Access sucks as a DB, but it's good for three things. First, it's a quick and dirty way to store data. Secretaries and analysts use it, dump their data in a little file, put it on a floppy, bring it home, work on stuff at home, and bring it back on a floppy the next day. That is the ultimate selling point of file-based databases. Even with Open Office's database tools, I have to know something about being a DBA - starting mysqld, db security, etc. Second, our DBAs love it because it's a graphical frontend to ODBC databases. It gives semi-clueful non-techs a way to see data. Finally, you can actually drop it onto a webserver and drive databases with it. Biases aside, it did gather them a following in the late 90's when everybody was a "developer" doing websites.
MS Office competitors of any sort have taken a while to solve these three needs elegantly. Looking at the IBM site, it looks like Cloudscape, with the embedded and network connectivity features, can be a foundation for something that can fill all three needs.
> MySQL is definitely ready for heavy loads
heavy transactional read loads for non-critical apps perhaps.
- Not heavy DSS/OLAP read loads though (where indexes don't work well and you want partitioning to bypass 95% of your rows). See Oracle, Informix & DB2 to see how this is done and the results it achieves.
- Haven't seen a proper benchmark, but anecdotal evidence points to problems that MySQL has scaling to meet much write traffic. PostgreSQL, Firebird, etc. on the inexpensive/free side appear to be better choices for these kinds of applications.
- Aren't online backups unavailable except through separately-licensed (and expensive) products?
- Then you've got the entire manageability issue - on larger projects you desperately want the kind of functionality that MySQL AB has claimed 95% of database applications don't need and which they've failed to support well, like database-enforced data quality constraints (referential, uniqueness, and check constraint declaratives). Add to that the lack of flexibility that comes from various missing features like views & stored procedures. Add to that the problems porting their non-standard SQL. Lastly, add to all of the above their massive list of exception-handling problems, in which operations fail silently instead of raising errors.
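For readers who haven't worked with them, here is a minimal sketch of the three kinds of declarative constraints mentioned in the last bullet (referential, uniqueness, and check), and of the database raising an error rather than quietly accepting bad data. It uses Python's built-in sqlite3 module as a stand-in engine; the `customers` and `orders` tables are hypothetical, and nothing here shows how any particular MySQL version behaves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
)
conn.execute(
    """CREATE TABLE orders (
           id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customers(id),
           quantity INTEGER NOT NULL CHECK (quantity > 0)
       )"""
)
conn.execute("INSERT INTO customers (id, email) VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO orders (customer_id, quantity) VALUES (1, 3)")

def rejected(sql):
    """Return True if the database refuses the statement with an integrity error."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

results = {
    # Uniqueness: the email is already taken.
    "duplicate email": rejected(
        "INSERT INTO customers (email) VALUES ('a@example.com')"
    ),
    # Referential: there is no customer 99.
    "unknown customer": rejected(
        "INSERT INTO orders (customer_id, quantity) VALUES (99, 1)"
    ),
    # Check: quantity must be positive.
    "zero quantity": rejected(
        "INSERT INTO orders (customer_id, quantity) VALUES (1, 0)"
    ),
    # A well-formed row goes through.
    "valid order": rejected(
        "INSERT INTO orders (customer_id, quantity) VALUES (1, 2)"
    ),
}
print(results)
```

The point of declaring these rules in the schema is that every application touching the database gets the same checks, instead of each one re-implementing them (or forgetting to).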
Nah, MySQL is a nice little database. But unless 'heavy loads' means non-critical, read-only, index-oriented loads - I think that there are about a dozen better options available.
Oh yeah, and no - Cloudscape isn't a competitor for MySQL in general. They each bring different strengths to the table.