Prevayler Quietly Reaches 2.0 Alpha, Bye RDBMS?

← Back to Stories (view on slashdot.org)

Prevayler Quietly Reaches 2.0 Alpha, Bye RDBMS?

Posted by Cliff on Tuesday September 23, 2003 @11:52AM from the lightweight-louie-vs-the-data-behemoth dept.

ninejaguar asks: "Slashdot did an article on an Open Source product called Prevayler, which could theoretically resolve all the problems associated with OO's rough courtship with Relational databases. Slashdot covered Prevayler when it was still 1.x. Despite fear, doubt, and memory concerns, it has reached 2.0 alpha. Is anyone currently using this non-database solution in production? If so, has it sped development because of the lack of OO-to-RDBMS complexity? Was there a significant learning curve to speak of? The LGPL'd product could be incorporated into proprietary commercial software, and few might know about it. Is anyone considering using it in a transactional environment where speed is the paramount need? And, are there any objections to using Prevayler that haven't been answered at the Prevayler wiki? Would those who use MySQL find Prevayler to be a better solution because it's tiny (less than 100kb), 3000 times faster and is inherently ACID compliant?" Update: 09/24 19:25 GMT by C :Quite a few broken links, now fixed.

"We've used relational databases for years despite incompatibilities in SQL implementation. Accessing them from an OOP paradigm has been so tedious, that Object-Relational mapping technologies have sprouted all over the Open Source landscape. Some competing examples and models are Hibernate, OJB, TJDO, XORM, and Castor; which in turn have supporting frameworks such as Spring and SQLExecutor. Because SQL is the dominant form of interfacing with the data in an RDBMS, there's now a specification to offer it a friendlier OO face.

Most of the above, including the SQL-variants, arguably appear to add yet another layer of complexity (even if only at the integration level) where they should be taking complexity away. These solutions are put together by some very smart people, but it's inescapable to get that feeling someone is missing the forest (simple answer) because all the trees (incompatible models) are in the way. If there are so many after-the-fact solutions attempting to simplify relational database access and manipulation from OO, isn't it reasonable to think that there is something generally wrong with trying to cobble-together two disparate concepts with what are essentially high-caliber hacks? Is Prevayler a better way?"

14 of 444 comments (clear)

Min score:

Reason:

Sort:

3000 times faster than Mysql? by Anonymous Coward · 2003-09-23 11:53 · Score: 4, Insightful

Is that really possible? How do you even benchmark that?
Project Promotion by Bingo+Foo · 2003-09-23 11:55 · Score: 5, Insightful

I'm all for trying to use Slashdot to promote your pet project, but don't couch your story in questions about people's use of your admittedly relatively unknown software.

--
taken! (by Davidleeroth) Thanks Bingo Foo!
1. Re:Project Promotion by baka_boy · 2003-09-23 17:12 · Score: 4, Insightful
  
  First off, SQL is an idealized, functional language for querying large datasets -- in theory, you can run any imaginable query, but in reality, you can't touch an un-indexed field on most production databases unless you've got *lots* of horsepower to burn, and very patient (read: non-existent) users. Personally, I find that there are a pretty large number of queries that are also pretty difficult to write in SQL.
  
  Really, it's not that different from the procedural-vs-functional or static-vs-dynamic typing issues in language design: a holy war that no one's going to win, except for the developers who learn to adopt the best pieces from each side without making dire claims about the imminent death of either side.
Persistance does not make a DB by Godeke · 2003-09-23 11:58 · Score: 5, Insightful

This would be great for projects where interoperability isn't an issue or only occurs via edge connections like SOAP. However, I generally would be wary of a "database" which is only accessible in Java, via unique interface. What do you do with your Crystal Reports users? How do I get this into data cubes for analysis?

Frankly, this is simply a persistance layer with some nice properties. It *isn't* a database. A database stands at the center of your applications and makes itself available to as wide of an audiance as possible. It shouldn't limit your choice of tools in such an absurd manner.

--
Sig under construction since 1998.
Here's why by Anonymous Coward · 2003-09-23 12:01 · Score: 5, Insightful

A direct quote from your Wike:

Prevalence requires us to have enough RAM on our servers to contain all our business objects.

What do you do if you have a nice big data set that won't fit in memory? Businesses my company works with have millions of customers. Do they have to have all those gigabytes of data in memory to do anything?

Don't come at me with yet another memory resident thing that's supposed to be the greatest ever, when it doesn't come close to addressing the real needs of a database user.
3000 times faster is somewhat misleading by Xeger · 2003-09-23 12:05 · Score: 5, Insightful

If you could cache an entire MySQL database in RAM, I'm sure your MySQL performance would improve dramatically.

If you could then optimize MySQL's search routines for working on memory instead of disk blocks, your MySQL performance would improve even more dramatically. As it is now, MySQL must go through all sorts of contortions, probably using B-Tree-like structures for indexing, and other fanciful datatypes I can't even conceive of without a PhD. The reason for all of this silliness is the fact that MySQL's backing store is disk, not memory.

In a prevalence system like Prevayler, one of the fundamental tenents of the system is that ALL of your objects are ALWAYS in memory, and are only serialized to disk when they change, for persistence.

So...yes: as always, a memory-based system will be three orders of magnitude (or more!) faster than a disk-based system. Prevayler vs. DBMS is no exception to this rule.

But when your website has grown popular, your prevalance database has swelled to 30 gigs and you find yourself hosting it on massive systems with 12 gigs of core memory and another 30 gigs of swap space -- when your memory access times are starting to look like disk access times because of swapping -- well, don't come crying to mwe.

Prevalence is a brilliant solution, for small projects. But they only scale to the size of your physical memory, or slightly (50-100%) larger. You can't expect them to scale beyond that.
Please, enough of the hyperbole bullshit by tstoneman · 2003-09-23 12:11 · Score: 4, Insightful

Please don't go around saying, "Could this be the end of RDBMSes?" That is just a crock of shit, and it really bugs the hell out of me. How stupid do you think we are to have a tagline like that?

Please watch "Bowling for Columbine" and it says that this type of exaggeration by the media, and most notably US media, is driving the Americans crazy paranoid with fear. It seems like the editors have fallen for the same thing, and it should stop now!

Let's have a modicum of dignity and avoid all the hyperbole, please. We are all somewhat tech-savvy, please treat us with the respect that we deserve!

I'm sure the technology is good, but for crying out loud, mainframes are still around 40 years later. RDBMSes are going nowhere for the next 20+years until true AI comes around.
1. Re:Please, enough of the hyperbole bullshit by scubabear · 2003-09-23 12:13 · Score: 4, Insightful
  
  Actually, the technology isn't even that good. What you have is a bunch of objects in memory, the ability to rollback "transactions", and in-memory changes are checkpointed to disk every once in awhile. This has "amateur" written all over it.
2. Re:Please, enough of the hyperbole bullshit by molarmass192 · 2003-09-23 12:54 · Score: 5, Insightful
  
  There's no rollback because there's no concept of a transaction and they even go so far as to try to justify not having transactions. What if a transaction depends on 3 objects getting serialized or none at all, guess you're out of luck. Label this project for what it is, a small non-transactional in-memory db like hsqldb but with less features. The very premise that this could replace a relational database is just a cheap laugh for anybody who has worked with or programmed for an enterprise class database.
  
  --
  
  Good people do not need laws to tell them to act responsibly, while bad people will find a way around the laws-Plato
benefits of an RDBMS by jadavis · 2003-09-23 12:34 · Score: 4, Insightful

I'm not about to give up PostgreSQL.

An RDBMS:
* Allows a wide variety of applications to operate on the data, in a wide variety of languages, at different times or simultaneously.
* Allows you to manipulate the data inside the database before you get it.
* Allows for a lot more storage, which is sometimes important when you need the memory for some other task.

What I like about an RDBMS (like postgres) is that my requirements constantly change. I try something for a while, then we have better ideas, but we need to work with our existing data. An RDBMS allows me a huge amount of flexibility with my data (also the reason I don't use MySQL...) and I've been able to drastically change the way my application works while still making use of the data that I have.

Maybe this is an OK database for some applications where you have the entire thing laid out in a perfect spec with no chance of a change. However, when I need to get at my data with another language, or it takes up more space than I thought, or I figure out that my application needs to change the way it works, I have no clue where to begin with a huge collection of "objects" that now happen to be obsolete. If I have relations, I have a solid representation of my data that an RDBMS can manipulate efficiently according to a fairly mature mathematical model. I'll take that over a collection of persistant objects any day.

That said, I think this has application in areas where you just need some persistent objects, and they need to be fast, and they don't take up much room. I don't encounter that very often, and when I do I usually just use postgres because it seems fast enough when the tables are cached. I suppose if the objects are really intricate this would be a nice system, because you wouldn't have to spend so much code on a mapping. It just seems so much more narrow as far as usefulness.

--
Social scientists are inspired by theories; scientists are humbled by facts.
Re:What a crock of shit by Delirium+Tremens · 2003-09-23 14:01 · Score: 4, Insightful

I am probably going to be considered pedantic with this, but usually people refer to begin and commit. Sometimes, they use rollback too. It's not enter() and exit(). And it's not please() and letsGo() either.
That being said, the other gentlemen in this thread are talking about Two-Phase Commit. Yes, I do understand that Prevayler doesn't even have one-phase-commit to provide things such as Isolation, but that's beside the topic discussed in this thread. So please, let's both stop referring to it. Even if one-phase-commit can be implemented over RMI. But if you want to really simplify this issue to the extreme, you need to tell us how Prevayler handles rollbacks. After that, you could maybe explain to us (and to the RMI engineers at Sun) how to implement the Two-Phase-Commit protocol over RMI with a data store like Prevaylers that is not XA-compliant.
Thank you for playing(*).
(*) I told you I was going to be pedantic.
These folks don't know what a database IS by hey! · 2003-09-23 15:18 · Score: 4, Insightful

I don't mean to say anything against their product, which as far as I know is the greatest object persistence scheme ever hatched.

But they clearly don't know what a database actually is; they're confusing the issue with services that an RDBMS happens to perform as part of its job. It has always been possible to write procedural code that is faster than database queries because underneath a query is turned into a sequene of operations. When building a system to answer a single question, the system will always be faster without the database layer. Building a hash table or B-Tree to do a simple lookup simply can't be beat.

People have lost sight of history. Years ago, we used to keep our data in indexed files, and guess what? They were faster than databases of the day at doing the tasks they were designed to do.

However while databases are slower, in many cases much slower, than procedural code, they have an important property: they can be used to answer unanticipated questions acceptably quickly. How quickly is acceptably quickly? Well, if the database can come back with an answer faster than it takes a skilled programmer to come up with a special purpose program to answer the question, it has done its job. Compare this to their answer to the querying problem: write a java program. And that's a fine answer if having to answer an unanticipated question is a relatively rare event, which no doubt it is for many applications.

This gets to what a database is: it is a collection of information that that is organized to make reuse simple and efficient. This is different from business object re-use, which is about re-using logic; this is about re-using facts. Relational databases are unequaled at this task because they are based on sets and mathematical operations that are closed on these sets. This allows both the user, but more importantly the system's optimizer, to create sequences of operations that meet the user's requests.

These people may have a wonderful system for a lot of purposes, but they're really talking about a particular set of applications, for which their system might be better than storing object data in a database. Probably is, as far as I know. But really. No need for rollback because "transactions are instantaneous"? Well, they would be right if transactions really were instantaneous. HOwever, while their test case may be so fast they appear instantaneous, that's a long way from actually being instantaneous. In the real world bound by the laws of physics, transactions take finite time. Given enough objects to update, or high enough system load, or both, you will have to either (1) accept a possibility, albeit small, of inconsistent transactions being processed or (2) lock all the objects that might affected by a transaction, with attendent possibilties of deadlocking.

In short, I wish them luck; it sounds like they are producing some interesting and useful stuff. However, it isn't a database or a replacement for a database.

--
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
Silly Project by cartman · 2003-09-23 17:16 · Score: 5, Insightful

The "Prevayler Team" has written a persistent HashMap with a redo log, using the command pattern. This is exceptionally trivial and is in no way comparable to a database. A database has things like: 4GL query language, referential integrity constraints, data integrity, queryable metadata, separation of logical and physical layers, data independence, declarative rather than imperative querying, dynamically assembled queries, and gazillions of other things. These are the real features that we mean when we say "database." These features are absolutely necessary. Prevayler includes none of them. It is an extremely trivial persistent HashMap, that's all.

Thus, when the prevayler team says "throw away your database," I must assume one of two things. 1) They're trolling for publicity by saying outrageous and purposefully stupid things. Or 2) They are shockingly, mind-numbingly naive, and they don't know what a database is or what it does.

The author of Prevayler wrote this about himself: "Carlos Eduardo Villela is a 19-year old Brazilian graduate in Information Systems... almost 8 years experience has made him a Java and Python enthusiast."

Thus, I have to assume that the authors are mind-numbingly naive. Don't get me wrong, I'm sure the authors are very bright, and I know that some good insights that went into the implementation of prevayler. But let's not throw away our databases quite yet.
Found a serious bug already by cartman · 2003-09-23 17:31 · Score: 4, Insightful

The somewhat naive authors of prevayler confidently announce the following on their website:

"No one has yet found a bug in Prevayler in a Production release. Who will be the first? [bold in original text]."

I already found a serious bug in the current production release. From the prevayler source:

ObjectOutputStream oos = logStream();
try {
oos.writeObject(command);
oos.reset();
oos.flush();
} catch (IOException iox) {

ObjectOutputStream does not guarantee atomicity. If your command object is larger than the page size of your disk, the "transaction" will take at least two page writes. A software failure between those page writes will lead to "half a transaction" being committed and a subsequent corruption of data. Once data integrity is lost, it is often difficult or impossible to recover. Prevayler has nothing to handle this case. Thus, prevayler does not correctly implement ACID, because it doesn't guarantee atomicity (half a transaction can be committed), consistency (referential integrity would be destroyed in such a case), isolation (this failure wouldn't be isolated to a single transaction) or durability (the problem would only show up upon reloading).

Finding this bug took very little searching. I am apparently the first person ever to find a bug in prevayler. Do I get a prize?