Beyond Relational Databases
CowboyRobot writes "Relational databases were developed in the 1970s as a way of improving the efficiency of complex systems.
But modern warehousing of data results in terabytes of information that needs to be organized, and the growing prevalence of mobile devices points to the increasing need for intelligent caching on the local hardware.
According to the ACM, the future of database architecture must include more modularity and configuration.
Although no concrete solutions are included, the article is a good overview of the problems with modern data systems."
Some of the biggest problems that "new" database designs have:
1) Overly complex
2) Don't scale
3) Tied to a single platform/implementation
4) Poor performance
It's typical to see all four in a single try!
SQL, on the other hand:
1) Reasonably simple API
2) Scales to very large databsaes
3) Cross-platform/architecture
4) Performs very well.
Given the insane amount of inertia SQL has, it will extend into an object model, rather than be replaced by one. (EG: C/C++)
I have no problem with your religion until you decide it's reason to deprive others of the truth.
Doesn't make it obsolete. "Databases are old and kludgey. Teh suXX0rs for R0xxng H4XX0rs liek me.
Just because people are too stupid to take the time to read and understand the theory and learn the application doesn't mean the technology is no longer relevant.
Of course no solutions are proposed. There are none because relational theory is correct, and appropriate for real database driven applications. Little crap bulletin boards can use MySQL.
Netcraft confirms relational databases are dead!
People,
Have been crying for the need to replace relational databases since the early nineties at least.
We can all see where that got them.
---- Go ahead, mod me down, I'll just post it again and you lose your mod points.
Funny how they never are, eh?
KFG
I didn't RTFA but for my needs
Or the summary
mySQL suits me quite well.
That's nice. It won't handle a multi-terabyte database, though. That's the domain of Terabase, Oracle, and (blech) DB2. It's also what the article is about.
The power of PHP and mySQL is all I need.
And a moped is all you need to get to work. If you want to haul 300 metric tons of rock from point A to point B, you need a dump truck. Again, that's what this article is about.
Back on topic, this entire article is mostly speculative for the moment. A lot of excellent work has been done in OODB and XMLDB designs, but no singular design has yet emerged to solve all our woes. For example, I love the Prevayler concept. It solves a lot of problems, lowers data access times, and provides for complete data security. It also isn't usable or scalable without a lot more design work.
The future will hold some very interesting things, but for now we'll have to keep inventing until we come up with a consolidated solution.
Javascript + Nintendo DSi = DSiCade
That's what Perl is for...
I don't do this for karma, I do it for cash. It's much better.
Quite true. MySQL does very well into the gigabytes. I haven't seen any good evidence of its abilities in handling terabytes of data. Don't get me wrong, I'm a huge fan of the MySQL, but I'm a bigger fan of using the right tool for the job. For your web message board, MySQL works fine. For holding product, sales, distribution, etc. information for, say Levis, it would not.
I don't do this for karma, I do it for cash. It's much better.
In the future, databases will be more "good."
While we can't yet go into detail as to how this will all work, suffice it to say that we have a pretty solid idea what the future holds.
"With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea...."
RFC 1925
Designed in the 1970s, the RDBMS has nevertheless proven to be the cornerstone of Web development three decades later. Thanks to systems like MySQL deployments are surely at record levels.
Essayist Clay Shirky has gone to far as to suggest that MySQL is at the center of a whole new software movement.
In my experience with Web applicaions the chief problem with the RDBMS seems to be that it does not do text indexing and search very well, so I have to keep a second store of data in something like Lucene.
The other major problem is the level of skill required to tune the database to achieve high-performance SQL queries, so hopefully the RDBMS will evolve with more self-configuration capability.
The article, which I only skimmed, actually addresses these two concerns but seems to pooh-pooh the notion of simply refining the existing RDBMS systems. Instead it says " Old-style database systems solve old-style problems; we need new-style databases to solve new-style problems. "
The paper seems awfully squishy on what this means. The clearest I found was a call to "produce a storage engine that is more configurable so that it can be tuned to the requirements of individual applications."
But this call for new highly modular/configurable storage "engines" seems to me to require at least as much fussy care and feeding as a traditional RDBMS. You're just replacing one DBA with another. And throwing out decades of refinement in the process.
The raison d'etre of the RDBMS is to allow the programmer to treat storage as a black box while gaining nifty ACID features. Extending this to text indexing seems logical.
I agree with the sentiments of the posters that SQL is not going anywhere, but I had a question.
As I am designing more and more complex web apps, I am constantly having to think of new, innovative ways to design the tables and databases and am currently making it up as I go. Does anyone have a reccomendation for books/sites that talk about good design proactices, that is not "How to use SQL" and relatively agnostic on the specific brand on DB?
Sorry for the OT post, its just something that has been bugging me for a while
by MARGO SELTZER, SLEEPYCAT
Sleepycat? The guys who make a brain-dead key/value database with no data manipulation or integrity capabilities? Who are they to educate others on the topic of relational databases? (Sleepycat's products are useful tools, but they are not true databases).
while data management has become almost synonymous with RDBMS, however, there are an increasing number of applications for which lighter-weight alternatives are more appropriate.
Ahh, so the proper title of this paper should be: "Beneath Relational Databases" or "Below Relational Databases". Because the relational model is a *complete* model for data storage and manipulation, so if you have a subset of this functionality, you are not "beyond" it.
As argued by Stonebraker, the relational vendors have been providing the illusion that an RDBMS is the answer to any data management need. For example, as data warehousing and decision support have emerged as important application domains, the vendors have adapted products to address the specialized needs that arise in these new domains. They do this by hiding fairly different data management implementations behind the familiar SQL front end. This model breaks down, however, as one begins to examine emerging data needs in more depth.
Well, the mention of Stonebraker's name as an authority on databases is generally an indicater of a content-free paper, but let's be sure we're talking about the same thing: the relational *model* is a *complete* model. There is no other more effective model, in fact as far as I know, there are no other complete models!
So if you want to use the relational model as a foundation to build new database products, go right ahead. If you're talking the same old vendor BS about "post relational" or "XML" (hierarchical) or "object" (network and/or hierarchical), then please shut up!!
My feeling is when he says "in depth", he means "less depth".
As more documents are created, transmitted, and operated in XML, these translations become unnecessary, inefficient, and tedious. Surely there must be a better way. Native XML data stores with XQuery and XPath access patterns represent the next wave of storage evolution. While new items are constantly added to and removed from an XML repository, the documents themselves are largely read-only.
Uh, yes there is a better way: create an XML data type in a relational database with a full set of XML operators. The relational model doesn't care about data types.
I have no interest in giving up the general relational model for a hierarchic model (rejected decades ago as not being general enough) based on a TEXT FILE FORMAT.
Stream processing is a bit of an outcast in this laundry list of data-intensive applications.
I smell Stonebraker.. yes, it's an outcast because stream processing has nothing to do with data storage!!!
Some argue that database architecture is in need of a revolution akin to the RISC revolution in computer hardware
Yes, all these people need to study and understand the relational model which was developed 30 years ago and is still the only complete data model. The relational model can be described in half a page, and consists of a small number of core operations from which any possible data storage and manipulation need can be developed. Stop thinking about implementations, think about the *model* and then use that develop new implementations!!
Old-style database systems solve old-style problems; we need new-style databases to solve new-style problems.
What does this mean exactly? I need to store and manipulate data without limitations. The relational model offers this. What is "old" or "new" here? I'm not going to switch to an ad-hoc subset of the relational model because it's "new".
This "paper" (wasn't there one a couple weeks from some Microsoft dude, which was equally useless?) commits the same old sins: 1) look at existin
We've already got RDBMS tech - why reinvent an inadequate version of it?
Because current RDBMS designs are unsuitable for filesystems. Relational theory still holds (just as it does for OODBs), but the physical design should be quite different if it's going to be effecient.
As I said, this has been beaten to death in the research communities. BeOS even included a DBFS design, but it went largely unused. NTFS also has all the necessary stuff in it, but Microsoft constantly removes it in final releases. ReiserFS has DBFS features, but these also go largely unused.
I think the problem is that making effective use of a DBFS requires a very different set of applications. i.e. If the applications are aware of the functionality, then they can assist the user and provide useful support. But without this form of OS and application support, the user will find that the metadata is nothing but added confusion.
Javascript + Nintendo DSi = DSiCade
AS/400's are hopelessly complex even to seasoned IT professionals such as myself, and they're only around is not because people like them but because the work SO GOD DAMNED WELL :)
point being, tractors work well to but you dont drive one day to day do you? :)
Religion is a gateway psychosis. -- Dave Foley
Here is the problem with your idea. Unlike the relational model, XML does not link facts. XML documents can be joined in any way, either valid or invalid, without you knowing one way or another. The relationships between documents are weak. There is no referential integrity. Within a proper relational model you are stating facts and factual relationships. Joins of those facts generate derived facts that are as true and accurate as your original model. Why add the overhead and complexity of xml? Why not just use a proper relational model?
mp3's are only for those with bad memories
1) a table is *not* a class or an object.
Good advice. A table is simply a snapshot of a relational value. True relational values have no top/bottom or left/right ordering so you can't really show them on paper or on the screen. It doesn't make sense to map classes or objects to tables.
2) Learn how to normalize. A badly (or flat out not) normalized database threatens data integrity by violating the once-and-only once rule.
Normalization has nothing to do with integrity per se. You can add constraints to a denormalized database that make it logically equivalent to a normalized one. I'm not aware of any "once and only once" rule in relational theory.
Normalization is about dependencies: your data shouldn't have certain types of dependencies in it.
As a rule of thumb if the table has more than 20 fields in it you should review your data model and make sure it is properly normalized.
Wrong, the number of fields has nothing to do with normalization. And you should review your data model even if it has 1 field.
Ditch Raid 5. 0+1 will give better perfomance in most cases. Manager like Raid 5 because it is cheap, you get what you pay for.
Yes, 0+1 is faster, but lots of RAM is even faster. However this is a physical detail and not anything to do with the relational model.
6) Learn a little theory, it won't hurt you. In fact it can save a large amount of time and trouble.
This should be point #1 with a double underline. The lack of understanding and even hostility toward theory is frightening in the IT industry. If the author of the paper above had any foundation knowledge her paper would be much different.
8) Avoid XML. Too much bloat.
Just remember that XML is FILE FORMAT like Jpeg, CSV, and you'll do fine.
9) Learn how to use indices on tables.
Christ, I hope people know this one! They should also realize that an index is a PHYSICAL manifestion of a logical idea: keys.
If full 3rd normal form is too slow, you need better hardware.
If a few joins (assuming you've got indexes, etc setup) make it too slow you are too close to the edge.
Saving on hardware but spending more in dealing with data problems is false economy.
Just because it CAN be done, doesn't mean it should!
JDBC (probably ODBC too, tho haven't used it in eight years) helps to standardize key generation (in JDBC 3.0 FINALLY), and Date processing (christ, date functions are so annoying). Most other operations can usually be done by the platform language that is processing the data, so you can avoid the tie-in that results from various SQL dialects' built-in functions. XML databases were a total flop, and so were object databases. I agree, SQL engines are so mature now that I don't see any database tech replacing them for another ten years. By the way, can we get PostGres (and now Oracle's) support of regular expression LIKEs standardized? And can we please get JDBC to support something like: INSERT $price INTO pricetable where productid = $productid rather than using ?'s and counting which ? to set values to? Hibernate's query language does this, and I really like it.
Hey, I'm just your average shit and piss factory.
An XML doc is a tree, thus it is hierarchically organized data. There have been hacks to try to extend around this limitation, but relational data still has superior flexibility.
that's why XML databases flopped
Hey, I'm just your average shit and piss factory.
Let me ask you this: How often do you see an OSS product (EG: phpwiki) that doesn't offer support for numerous databases?
Quite a bit actually. In my experience most only support MySQL. Why? They don't understand the value of data integrity and the features MySQL lacks/lacked. As a result they have had to poorly implement things that the DB itself should do, and when they try to "port" this to another DB they run into all sorts of issues.
And SQL is a language, not an API.
So they give up.
My Suburban burns less gasoline than your Prius.