Is the Relational Database Doomed?
DB Guy writes "There's an article over on Read Write Web about what the future of relational databases looks like when faced with new challenges to its dominance from key/value stores, such as SimpleDB, CouchDB, Project Voldemort and BigTable. The conclusion suggests that relational databases and key value stores aren't really mutually exclusive and instead are different tools for different requirements."
It has been suggested before that the life of the relational DB is coming to an end. I must say that while I agree with this statement: -
Relational databases scale well, but usually only when that scaling happens on a single server node. When the capacity of that single node is reached, you need to scale out and distribute that load across multiple server nodes. This is when the complexity of relational databases starts to rub against their potential to scale.
I disagree with the following statement: -
Try scaling to hundreds or thousands of nodes, rather than a few, and the complexities become overwhelming, and the characteristics that make RDBMS so appealing drastically reduce their viability as platforms for large distributed systems.
I submit that the complexity can be managed and that's why we have jobs.
I am an IT consultant at a major bank and we keep all kinds of data. Data that many find useless and is spread across 27 [major] nodes. Total records in our biggest table number about 57 million with 49 rows. I can tell you that data querying and integrity maintaining are a breeze if the schematic design is correct in the first place.
We are always designing and testing different scenarios. In cases where we have had to change the schema, it has been simple if one knows what to do.
I must say that Open Source DBs have worked for us though we rely on products from IBM and Oracle.
Our philosophy is: If it works in PostgreSQL, it will even do wonders on DB2 or Oracle. I do not see how we can do away with the relational DB. Whoever designed it in the beginning did a marvelous job.
The relational database is not going anywhere and nothing in that article is based on any firm understanding of managing data.
Is the notion of a "join" obsolete? No, but it is typically impractical in a high volume system. You would probably use denormalization as a strategy.
Scaling many nodes? OK, you still gotta put your data "in" something.
key/value indexing? yawn. select val from keyvalue_tab where key = foo;
The value can be basically anything, and most "relational" databases have good object support as well as XML, JSON, etc.
So we can establish that a SQL relational database can do *everything* a simpler system can do. Now, think about ALL the things you can do with your data in a real database.
What is the point of using a limited and less functional system? A good system, like Oracle, DB2, PostgreSQL, etc (!mysql of course) will do what you need AND allow you do do more should you be successful.
The problem with data is two fold: Managing read/write/deletes and finding what you are looking for. These problems have been solved. A good database will do this for you. Want to store object? XML, JSON, binary objects, or a specialized database extension works perfectly.
Does that example of a relational DB have a serious error, or is that just me? Why have make key in two tables?
He lost cred right then.
No comprende? Let me type that a little slower for you...
Actually i read TFA, and I just couldnt make sense of the benefits offered by the key value thing. You basically should be able to get the same benefits with a relational database system with a query that does a lookup on a single column index. This would involve searching the b-tree for that column, which would yield a row data address of some sort, to either a linked list of cells or a list of addresses of those cells. Once the single b-tree is done it is then very fast to find the other column values in that row. The b-tree or other index lookup also has to be done with the key value pair, the relational is just a collection of multiple key value indexes.
There is the issue of having a variable number of pieces of data linked to a certain key. But you can do this in relational too. Just create a table with an id column, value type column and value column. A well designed relational, if you do a query on the id column, the b-tree will lead to data which has all of the row data addresses in the database that match the id. EAch of those rows will contain a different data type/data payload for the id. This is again pretty much as fast as a simple single index database.
Wow, um where to being really....
So you realize that the structure you are suggesting can be easily built in a traditional RDB, using a star-schema or cluster design right?
Next you suggest doing the sorting on the client, and then say that if there is more data then a client can handle the server can be asked to send chunks according to the clients sort order. That means the server has to have all the sort logic the client has and probably in all but the most trival applications do all the sorting anyway... Seems to me a star schema and indexing the fact table on the attributes that are most comonly going to be used for sorting makes much more sense; because as I said the serve is going to be sorting anyway.
Now there are data sets that non relational structers do make some more sense, but we have hierarchy , and navigational designes for those, yours is not one of them.
Repeal the 17th Amendment TODAY! Also Please Read http://www.gnu.org/philosophy/right-to-read.html
Over 15 years ago Paradox's query-by-example was light-years ahead of today's soul-killing SQL crap.
QBE grids are nothing more then a UI abstraction of the underlying SQL SELECT statement. In fact, in MS-Access (which has a QBE grid), you can flip between looking at the QBE and looking at the raw SQL SELECT statement.
Sometimes it's faster to do it in raw SQL, sometimes it's faster to setup the query in a QBE grid.
Wolde you bothe eate your cake, and have your cake?
Suggesting that you could replace a MS-SQL server with SQLite basically forces anybody in the know to ignore every other point you make.
MySQL is good, unless you need a highly performent query analyzer.
Postgres is good, unless you need actual replication features.
SQLite is good, if your datastore is less than 1GB.
Oracle is no-doubt a valid replacement and improvement upon SQL Server. And I use MySQL more than any other DB. But you need to hire Percona to get the same performance out of MySQL that you get from SQL Server out of the box.
Suggesting that you could replace a MS-SQL server with SQLite basically forces anybody in the know to ignore every other point you make.
You're assuming that the person using MS-SQL Server knows what they're doing. How do you know it's more than just a glorified Access database?
MySQL is good, unless you need a highly performent query analyzer.
In other words, the query analyzer is slow? Because the queries work well enough.
Postgres is good, unless you need actual replication features.
Like these?
SQLite is good, if your datastore is less than 1GB.
Another quick Google, and we find these limits -- by default, the maximum database size is just under 32 terabytes.
Not that I'm suggesting it's a good choice at that point, especially with multiple processes. But it does make it kind of hard to take you seriously with that kind of imagined limit, unless you're suggesting there's a practical, performance wall after 1 gig.
Don't thank God, thank a doctor!