Why Aren't You Using An OODMS?

Posted by ryuzaki0 on Friday May 4, 2001 @02:00AM from the because-you-have-another-mantra? dept.

Dare Obasanjo contributed this piece about a subject that probably only a very few people have ever taken the time to consider, or had to. Below he asks the musical question "Why aren't you using an Object Oriented Database Management System?"Update: 05/04 02:11 PM by H :This is also running on K5 - yes, that's on purpose, and yes, Dare, myself and Rusty all know. *grin*

Why Aren't You Using An Object Oriented Database Management System?

In today's world, Client-Server applications that rely on a database on the server as a data store while servicing requests from multiple clients are quite commonplace. Most of these applications use a Relational Database Management System (RDBMS) as their data store while using an object oriented programming language for development. This causes a certain inefficency as objects must be mapped to tuples in the database and vice versa instead of the data being stored in a way that is consistent with the programming model. The "impedance mismatch" caused by having to map objects to tables and vice versa has long been accepted as a necessary performance penalty. This paper is aimed at seeking out an alternative that avoids this penalty.

What follows is a condensed version of the following paper; An Exploration of Object Oriented Database Management Systems, which I wrote as part of my independent study project under Dr. Sham Navathe.

Introduction

The purpose of this paper is to provide answers to the following questions

What is an Object Oriented Database Management System (OODBMS)?
Is an OODBMS a viable alternative to an RDBMS?
What are the tradeoffs and benefits of using an OODBMS over an RDBMS?
What does code that interacts with an OODBMS look like?

Overview of Object Oriented Database Management Systems

An OODBMS is the result of combining object oriented programming principles with database management principles. Object oriented programming concepts such as encapsulation, polymorphism and inheritance are enforced as well as database management concepts such as the ACID properties (Atomicity, Consistency, Isolation and Durability) which lead to system integrity, support for an ad hoc query language and secondary storage management systems which allow for managing very large amounts of data. The Object Oriented Database Manifesto [Atk 89] specifically lists the following features as mandatory for a system to support before it can be called an OODBMS; Complex objects, Object identity, Encapsulation , Types and Classes ,Class or Type Hierarchies, Overriding,overloading and late binding, Computational completeness , Extensibility, Persistence , Secondary storage management, Concurrency, Recovery and an Ad Hoc Query Facility.

>From the aforementioned description, an OODBMS should be able to store objects that are nearly indistinguishable from the kind of objects supported by the target programming language with as little limitation as possible. Persistent objects should belong to a class and can have one or more atomic types or other objects as attributes. The normal rules of inheritance should apply with all their benefits including polymorphism, overridding inherited methods and dynamic binding. Each object has an object identifier (OID) which used as a way of uniquely identifying a particuler object. OIDs are permanent, system generated and not based on any of the member data within the object. OIDs make storing references to other objects in the database simpler but may cause referential intergrity problems if an object is deleted while other objects still have references to its OID. An OODBMS is thus a full scale object oriented development environment as well as a database management system. Features that are common in the RDBMS world such as transactions, the ability to handle large amounts of data, indexes, deadlock detection, backup and restoration features and data recovery mechanisms also exist in the OODBMS world.

A primary feature of an OODBMS is that accessing objects in the database is done in a transparent manner such that interaction with persistent objects is no different from interacting with in-memory objects. This is very different from using an RDBMSs in that there is no need to interact via a query sub-language like SQL nor is there a reason to use a Call Level Interface such as ODBC, ADO or JDBC. Database operations typically involve obtaining a database root from the the OODBMS which is usually a data structure like a graph, vector, hash table, or set and traversing it to obtain objects to create, update or delete from the database. When a client requests an object from the database, the object is transferred from the database into the application's cache where it can be used either as a transient value that is disconnected from its representation in the database (updates to the cached object do not affect the object in the database) or it can be used as a mirror of the version in the database in that updates to the object are reflected in the database and changes to object in the database require that the object is refetched from the OODBMS.

Comparisons of OODBMSs to RDBMSs

There are concepts in the relational database model that are similar to those in the object database model. A relation or table in a relational database can be considered to be analogous to a class in an object database. A tuple is similar to an instance of a class but is different in that it has attributes but no behaviors. A column in a tuple is similar to a class attribute except that a column can hold only primitive data types while a class attribute can hold data of any type. Finally classes have methods which are computationally complete (meaning that general purpose control and computational structures are provided [McF 99]) while relational databases typically do not have computationally complete programming capabilities although some stored procedure languages come close.

Below is a list of advantages and disadvantages of using an OODBMS over an RDBMS with an object oriented programming language.

Advantages

Composite Objects and Relationships: Objects in an OODBMS can store an arbitrary number of atomic types as well as other objects. It is thus possible to have a large class which holds many medium sized classes which themselves hold many smaller classes, ad infinitum. In a relational database this has to be done either by having one huge table with lots of null fields or via a number of smaller, normalized tables which are linked via foreign keys. Having lots of smaller tables is still a problem since a join has to be performed every time one wants to query data based on the "Has-a" relationship between the entities. Also an object is a better model of the real world entity than the relational tuples with regards to complex objects. The fact that an OODBMS is better suited to handling complex,interrelated data than an RDBMS means that an OODBMS can outperform an RDBMS by ten to a thousand times depending on the complexity of the data being handled.
Class Hierarchy: Data in the real world is usually has hierarchical characteristics. The ever popular Employee example used in most RDBMS texts is easier to describe in an OODBMS than in an RDBMS. An Employee can be a Manager or not, this is usually done in an RDBMS by having a type identifier field or creating another table which uses foreign keys to indicate the relationship between Managers and Employees. In an OODBMS, the Employee class is simply a parent class of the Manager class.
Circumventing the Need for a Query Language: A query language is not necessary for accessing data from an OODBMS unlike an RDBMS since interaction with the database is done by transparently accessing objects. It is still possible to use queries in an OODBMS however.
No Impedence Mismatch: In a typical application that uses an object oriented programming language and an RDBMS, a signifcant amount of time is usually spent mapping tables to objects and back. There are also various problems that can occur when the atomic types in the database do not map cleanly to the atomic types in the programming language and vice versa. This "impedance mismatch" is completely avoided when using an OODBMS.
No Primary Keys: The user of an RDBMS has to worry about uniquely identifying tuples by their values and making sure that no two tuples have the same primary key values to avoid error conditions. In an OODBMS, the unique identification of objects is done behind the scenes via OIDs and is completely invisible to the user. Thus there is no limitation on the values that can be stored in an object.
One Data Model: A data model typically should model entities and their relationships, constraints and operations that change the states of the data in the system. With an RDBMS it is not possible to model the dynamic operations or rules that change the state of the data in the system because this is beyond the scope of the database. Thus applications that use RDBMS systems usually have an Entity Relationship diagram to model the static parts of the system and a seperate model for the operations and behaviors of entities in the application. With an OODBMS there is no disconnect between the database model and the application model because the entities are just other objects in the system. An entire application can thus be comprehensively modelled in one UML diagram.

Disadvantages

Schema Changes: In an RDBMS modifying the database schema either by creating, updating or deleting tables is typically independent of the actual application. In an OODBMS based application modifying the schema by creating, updating or modifying a persistent class typically means that changes have to be made to the other classes in the application that interact with instances of that class. This typically means that all schema changes in an OODBMS will involve a system wide recompile. Also updating all the instance objects within the database can take an extended period of time depending on the size of the database.

Who is currently using an OODBMS to handle mission critical data

The following information was gleaned from the ODBMS Facts website.

The Chicago Stock Exchange manages stock trades via a Versant ODBMS.
Radio Computing Services is the world's largest radio software company. Its product, Selector, automates the needs of the entire radio station -- from the music library, to the newsroom, to the sales department. RCS uses the POET ODBMS because it enabled RCS to integrate and organize various elements, regardless of data types, in a single program environment.
The Objectivity/DB ODBMS is used as a data repository for system component naming, satellite mission planning data, and orbital management data deployed by Motorola in The Iridium System.
The ObjectStore ODBMS is used in SouthWest Airline's Home Gate to provide self-service to travelers through the Internet.
Ajou University Medical Center in South Korea uses InterSystems' Cachè ODBMS to support all hospital functions including mission-critical departments such as pathology, laboratory, blood bank, pharmacy, and X-ray.
The Large Hadron Collider at CERN in Switzerland uses an Objectivity DB. The database is currently being tested in the hundreds of terabytes at data rates up to 35 MB/second.
As of November, 2000, the Stanford Linear Accelerator Center (SLAC) stored 169 terabytes of production data using Objectivity/DB. The production data is distributed across several hundred processing nodes and over 30 on-line servers.

Interacting With An OODBMS

Below are Java code samples for accessing a relational database and accessing an object database. Compare the size of the code in both examples. The examples are for an instant messaging application.

Validating a user.

Java code accessing an ObjectStore(TM) database
import COM.odi.*; import COM.odi.util.query.*; import COM.odi.util.*; import java.util.*; try { //start database session Session session = Session.create(null, null); session.join(); //open database and start transaction Database db = Database.open("IMdatabase", ObjectStore.UPDATE); Transaction tr = Transaction.begin(ObjectStore.READONLY); //get hashtable of user objects from DB OSHashMap users = (OSHashMap) db.getRoot("IMusers"); //get password and username from user String username = getUserNameFromUser(); String passwd = getPasswordFromUser(); //get user object from database and see if it exists and whether password is correct UserObject user = (UserObject) users.get(username); if(user == null) System.out.println("Non-existent user"); else if(user.getPassword().equals(passwd)) System.out.println("Successful login"); else System.out.println("Invalid Password"); //end transaction, close database and retain terminate session tr.commit(); db.close(); session.termnate(); } //exception handling would go here ...
Java JDBC code accessing an IBM's DB2 Database(TM)

import java.sql.*; import sun.jdbc.odbc.JdbcOdbcDriver; import java.util.*; try { //Launch instance of database driver. Class.forName("COM.ibm.db2.jdbc.app.DB2Driver").newInstance(); //create database connection Connection conn = DriverManager.getConnection("jdbc:db2:IMdatabase"); //get password and username from user String username = getUserNameFromUser(); String passwd = getPasswordFromUser(); //perform SQL query Statement sqlQry = conn.createStatement(); ResultSet rset = sqlQry.executeQuery("SELECT password from user_table WHERE username='" + username +"'"); if(rset.next()){ if(rset.getString(1).equals(passwd)) System.out.println("Successful login"); else System.out.println("Invalid Password"); }else{ System.out.println("Non-existent user"); } //close database connection sqlQry.close(); conn.close(); } //exception handling would go here ...
There isn't much difference in the above examples although it does seem a lot clearer to perform operations on a UserObject instead of a ResultSet when validating the user.
Getting the user's contact list.

Java code accessing an ObjectStore(TM) database

import COM.odi.*; import COM.odi.util.query.*; import COM.odi.util.*; import java.util.*; try { /* start session and open DB, same as in section 1a */ //get hashmap of users from the DB OSHashMap users = (OSHashMap) db.getRoot("IMusers"); //get user object from database UserObject c4l = (UserObject) users.get("Carnage4Life"); UserObject[] contactList = c4l.getContactList(); System.out.println("This are the people on Carnage4Life's contact list"); for(int i=0; i <contactList.length; i++) System.out.println(contactList[i].toString()); //toString() prints fullname, username, online status and webpage URL /* close session and close DB, same as in section 1a */ }//exception handling code

Java JDBC code accessing an IBM's DB2 Database(TM)

import java.sql.*; import sun.jdbc.odbc.JdbcOdbcDriver; import java.util.*; try { /* open DB connection, same as in section 1b */ //perform SQL query Statement sqlQry = conn.createStatement(); ResultSet rset = sqlQry.executeQuery("SELECT fname, lname, user_name, online_status, webpage FROM contact_list, user_table" + "WHERE contact_list.owner_name='Carnage4Life' and contact_list.buddy_name=user_table.user_name"); System.out.println("This are the people on Carnage4Life's contact list"); while(rset.next()) System.out.println("Full Name:" + rset.getString(1) + " " + rset.getString(2) + " User Name:" + rset.getString(3) + " OnlineStatus:" + rset.getString(4) + " HomePage URL:" + rset.getString(5)); /* close DB connection, same as in section 1b*/ }//exception handling code

The benefits of using an OODBMS over an RDBMS in Java slowly becomes obvious. Consider also that if the data from the select needs to be returned to another method then all the data from the result set has to be mapped to another object (UserObject).
Get all the users that are online.

Java code accessing an ObjectStore(TM) database

import COM.odi.*; import COM.odi.util.query.*; import COM.odi.util.*; import java.util.*; try{ /* same as above */ //use a OODBMS query to locate all the users whose status is 'online' Query q = new Query (UserObject.class, "onlineStatus.equals(\"online\""); Collection users = db.getRoot("IMusers"); Set onlineUsers = q.select(users); Iterator iter = onlineUsers.iterator(); // iterate over the results while ( iter.hasNext() ) { UserObject user = (UserObject) iter.next(); // send each person some announcement sendAnnouncement(user); } /* same as above */ }//exception handling goes here

Java JDBC code accessing an IBM's DB2 Database(TM)
import java.sql.*; import sun.jdbc.odbc.JdbcOdbcDriver; import java.util.*; try{ /* same as above */ //perform SQL query Statement sqlQry = conn.createStatement (); ResultSet rset = sqlQry.executeQuery ("SELECT fname, lname, user_name, online_status, webpage FROM user_table WHERE online_status='online'"); while(rset.next()){ UserObject user = new UserObject (rset.getString(1),rset.getString (2),rset.getString(3),rset.getString (4),rset.getString(5)); sendAnnouncement(user); } /* same as above */ }//exception handling goes here

List of Object Oriented Database Management Systems
Proprietary

Conclusion

The gains from using an OODBMS while developing an application using an OO programming language are many. The savings in development time by not having to worry about separate data models as well as the fact that there is less code to write due to the lack of impedance mismatch is very attractive. In my opinion, there is little reason to pick an RDBMS over an OODBMS system for newapplication development unless there are legacy issues that have to be dealt with.

3 of 213 comments (clear)

Min score:

Reason:

Sort:

Why OODBMSs did not take over the world by geophile · 2001-05-03 23:03 · Score: 5

I am one of the founders of Object Design (now Excelon Corp.) We had the slickest OODBMS -- persistence was implemented by taking over memory mapping, (no "overloading the arrow operator"). It was the least obtrusive OODBMS. Other systems of the day required you to use different string libraries or forego C/C++ standard arrays (for example). Other systems arguably had better scalability or concurrency models.
As someone else has pointed out, OODBMSs require a very different skill set. The problem isn't that your typical SQL developer didn't have these skills. The problem is that the things were ever referred to as database systems.
If you walk into a potential customer selling a "database system", then the database guys come and hear what you have to say. They ask about SQL support and point-and-click development tools. They are going to be looking for very high levels of concurrency, at isolation levels below serializable.
Selling a "database system" meant that once we got past the early adopters, we were selling against Oracle and we hit a wall. What we should have done from day one was to sell persistence for C++. We did start out like this, e.g. trying to convince ECAD vendors to build their products on top of ObjectStore. That had some limited success because the customers knew that they needed persistence, but they were C/C++ hackers at heart, and an RDBMS was a poor compromise. A "database for C++ with no impedance mismatch" sounds great to someone writing a 3d modeler. We then went on to apply the same logic selling to satisfied RDBMS users without changing our strategy, and that's when things stalled.
That strategy was necessary in some ways, because we were venture-funded, and the VCs weren't going to be happy with a small niche. They wanted something that would get into every insurance company and bank. However, by aiming high and failing (by VC standards), we abandoned our natural market too soon and avoided becoming a small success in that market.
OODBMS Not According to Best Database Theory by leandrod · 2001-05-04 00:59 · Score: 5

The issue is that OODBMSs do not conform neither to current database best practices, nor to theory.

Relating to best practices, you should know already from other, better-rated comments in this thread: you should design your data before your application, OODBMSs make it hard; you should strive for independence of the logical and physical layers of your database, keeping data independence and shifting the performance optimization issues to the DBMS' optimizer; OIDs hinder the designation of candidate keys, of which the primary key is a special case, and thus hinder a lot of data integrity checkings that should be done by referential integrity. And we could go on and on.

As for the practical implications of not conforming to these best practices, any schema change do not only need an application recompilation, but also that you rethink the data access path (also known as a query's access plan); you won't be able to keep several logical schemas to different users, and the identity of the user's view with the physical layout will force you to optimize only for the most common case, instead of leaving it up to the DBMS to create the best access plans.

All this is much better explained in Database Debunkings, a site co-maintained by Chris J Date, author of the best database books I've ever read; you can find a list of his available books also there.

As for theory, there is no real substitute for the relational database model theory. As Linus Torvalds thinks that microkernels were a good idea but misguided, wielding no practical nor theoretical improvements, so OODBMSs sounds nice but offer no real improvements over RDBMSs. This is not to say that everything you will ever need will be handled properly by your SQL DBMS. The point is exactly that people have went for OODBMSs because they thought that SQL was relational, and found it wanting. The problem is that SQL never was truly relational, just an approximation of it. Date has a whole book on it, called The Third Manifesto.

Summing up, what I am really trying to find is some proper implementation of the relational database model ideals, which should give us practically all of the advantages of OODBMSs without their cons. I have just been informed of Suneido, but have not investigated it fully... it's a pity it is Win32, not POSIX.

--
Leandro Guimarães Faria Corsetti Dutra
DBA, SysAdmin

--
Leandro GuimarÃ£es Faria Corcete DUTRA
DA, DBA, SysAdmin, Data Modeller
GNU Project, Debian GNU/Lin
Re:You didn't read the article, did you? by Mr.+McGibby · 2001-05-04 00:09 · Score: 5

Aren't you supposed to design an application before implemnting
it in any way including putting data in a DB? I've worked at two companies and
had a ton of projects in school and none involved implemnting the database
before the application was designed.

I'd like to know what world you are living in. In the real world, most databases
are legacy databases and FULL of data. I've had to design applications around
databases for years now. In my field (Programming for Engineers) the data is king
and people need to access it in multiple ways. True, if you are designing a
system from the ground up, then you will be able to design the DB and
make it nice and pretty. This is seldom the case in any case but web development.

This is simply hogwash. RDBMSs are by their nature
non-generic espoecially when one adds foreign keys and constaints to a system
which are necessary for any decent sized application. On the other hand the
entire point of object oriented programming is creating generic reusable
components. With the ability to use inheritance and polymorphism in an ODBMS I
see no reason why you believe an RDBMS is more generic.

'Generic' may be the wrong word here. A better one would be 'simpler'. A lot of applications
just don't need all the OO stuff. The reason that RDBMSs are so pervasive is
because most data can be represented well and in an easy to understand way with
just tables and keys.

Learning
OODBMS techniques is mainly learning how to use another API in your bject
Oriented programming language of choice (well C++, Java or Smalltalk) versus
learning SQL and relational database theory. If people could learn SQL which is
completely unrelated to any other aspect of their programming experience then
adding OODBMS techniques as a skillset would be trivial. Of course, if people
don't realize that alternatives to RDBMSs exist then they won't learn these
techniques. That is more important because management can't find people who have
these skills if developers don't go out and learn these techniques.

Developers aren't the only ones who have to query the database. In my shop,
we have 10-20 people querying the same database. Many of whom have spent a lot
of time learning SQL. Most of the people who need to look at the data are
not able to pick up a new query language quickly enough. SQL is simple
enough to learn. RDBMSs are simple and easy to understand. With an OODBMS,
these people have to be trained on what the heck OO is. This is not an easy
concept for a non-progammer. On the other hand, tell someone that the database
is a collection of tables, and they can easily understand.

Now I just realized you didn't read the
article. People have measured gains in the range of ten to a thousandfold
increase in performance, these are not incremental. Secondly the primary benefit
is that it means you have to write less code and don't have to worry
about multiple paradigms at once when implementing an application.

Sure. I'll believe it when I see it. This sounds like marketing hype to me.
Sounds like someone who didn't know how to program for an RDBMS wrote some
crappy code. Correctly written code for an RDBMS would not experience these
kinds of gains when converted to an OODBMS. The overhead for the conversion
process could be this large, but only if the original code is crap.

--
Mad Software: Rantings on Developing So