Is the Relational Database Doomed?
DB Guy writes "There's an article over on Read Write Web about what the future of relational databases looks like when faced with new challenges to its dominance from key/value stores, such as SimpleDB, CouchDB, Project Voldemort and BigTable. The conclusion suggests that relational databases and key value stores aren't really mutually exclusive and instead are different tools for different requirements."
that's efficient -a summary that refutes the inflammatory headline
I'm just sayin'
Someone forgot to put a where clause on that delete.
Reading slashdot one-liner: (irm http://rss.slashdot.org/Slashdot/slashdot).rdf.item | fl title,desc*
Someone type this up and submit it to Digg.
This same basic story keeps getting submitted from the same group of people who are generally trying to sell non-relational-DB stuff. This is an ad. Move along.
Especially when the claim is as ridiculous as this one.
There's a reason relational databases took over the world of databases: They provide a good combination of flexibility and structure to efficiently represent data. Which is what databases are supposed to do.
I am officially gone from
Leave us RDBMS dinosaurs alone. String Name/Value pairs, that is a great innovation. In other news, Sun will be dropping all types from the Java object system and rely on the VOID type. Idiots.
The name might be cool; but the length of some of the commands will really get to you. How many times do you want to type AVADA_KEDAVRA TABLE?
Map db = new HashMap();
beginTransaction(); // Synchronize on the map // Just serialize the fucker to a file. The idiots using this won't know the difference.
db.add("key", "value");
commitTransaction();
I won't believe it until Netcraft confirms it.
I have something in common with Stephen Hawking...
It has been suggested before that the life of the relational DB is coming to an end. I must say that while I agree with this statement: -
Relational databases scale well, but usually only when that scaling happens on a single server node. When the capacity of that single node is reached, you need to scale out and distribute that load across multiple server nodes. This is when the complexity of relational databases starts to rub against their potential to scale.
I disagree with the following statement: -
Try scaling to hundreds or thousands of nodes, rather than a few, and the complexities become overwhelming, and the characteristics that make RDBMS so appealing drastically reduce their viability as platforms for large distributed systems.
I submit that the complexity can be managed and that's why we have jobs.
I am an IT consultant at a major bank and we keep all kinds of data. Data that many find useless and is spread across 27 [major] nodes. Total records in our biggest table number about 57 million with 49 rows. I can tell you that data querying and integrity maintaining are a breeze if the schematic design is correct in the first place.
We are always designing and testing different scenarios. In cases where we have had to change the schema, it has been simple if one knows what to do.
I must say that Open Source DBs have worked for us though we rely on products from IBM and Oracle.
Our philosophy is: If it works in PostgreSQL, it will even do wonders on DB2 or Oracle. I do not see how we can do away with the relational DB. Whoever designed it in the beginning did a marvelous job.
In headlines, "?" implies that something is a serious question, whose answer is likely to be yes. One that makes it worth spending the time to read the article.
Imagine the headline said "Does Obama Smoke Crack?" and the article had a bunch of stuff about the president, with a last paragraph saying: "There is absolutely no reason to thing that President Obama has ever smoked crack."
-- Support a free market in the field of government
The name might be cool; but the length of some of the commands will really get to you. How many times do you want to type AVADA_KEDAVRA TABLE?
Better than PokemonDB. Then you have to jump on top of your desk and shout "Customer Table, I select you!" every time you run a damn query.
Kwisatz Haderach
Sell the spice to CHOAM
This Mahdi took Shaddam's Throne
You're right, that is a bit cumbersome. Hopefully, they'll release a friendly GUI wizard to make working with it more efficient.
There's a reason relational databases took over the world of databases: They provide a good combination of flexibility and structure to efficiently represent data.
Especially since so many databases really are inherently relational. The textbook example of 1-customer:n-invoices, 1-invoice:n-items plays out quite a bit in the workplace.
Dewey, what part of this looks like authorities should be involved?
In theory, I agree the most costly actions in a database are joins. It seems like the key/value model is a great solution to this, on the surface. However, what the key/value model does is push the cost to the application layer. Instead of ensuring relational integrity and conformity in the database, suddenly all app code has to do this on the frontend. Also, instead of managing this process in a single place, suddenly this process is distributed among multiple methods. Sure, the DB is more scaleable, but suddenly the app is a mess.
Actually i read TFA, and I just couldnt make sense of the benefits offered by the key value thing. You basically should be able to get the same benefits with a relational database system with a query that does a lookup on a single column index. This would involve searching the b-tree for that column, which would yield a row data address of some sort, to either a linked list of cells or a list of addresses of those cells. Once the single b-tree is done it is then very fast to find the other column values in that row. The b-tree or other index lookup also has to be done with the key value pair, the relational is just a collection of multiple key value indexes.
There is the issue of having a variable number of pieces of data linked to a certain key. But you can do this in relational too. Just create a table with an id column, value type column and value column. A well designed relational, if you do a query on the id column, the b-tree will lead to data which has all of the row data addresses in the database that match the id. EAch of those rows will contain a different data type/data payload for the id. This is again pretty much as fast as a simple single index database.
no, the relational database is not going anywhere, you are correct. but, that does not mean that there aren't instances where a non-relational database, with the addition of map/reduce, aren't extremely useful.
non-relational databases have been around for decades, and are in use for quite a number of applications involving rapid development and storage of very large records. couple this with map/reduce, and you have the ability to scale quickly with very large datasets.
scaling quickly is a very difficult problem to solve with an RDBMS - you either need to continue to throw more hardware at the problem, to the point of diminishing returns, or re-architect your data at the cost of possible significant downtime, while still attempting to serve up the data in a timely manner. i've been deep in the bowels of oracle RAC, fighting to get just 5% more speed out of a query over a billion rows and realizing that i have to start over with a new schema, just to squeeze more data out. compare that to simply adding another machine and letting the map functionality run across one more cpu before returning it for the reduce.
once again, correct, but having to denormalize to a snowflake or a star isn't always the best solution. you're taking the best parts of the relational database model, and throwing them out - normalization, referential integrity, just to squeeze more out of something that may not be the best tool for the job.
do you hammer with a wrench? i have before, and i managed to hurt my thumb.
Yes, these newer simple key/value databases like BigTable and CouchDB are effectively a subset of RDBMS functionality, so of course the same thing can be implemented relationally by just not using features.
The reason these projects have taken off is that the relational features being skipped comprise most of the complexity of an RDBMS. Without them, it's relatively trivial to write new database engines from scratch instead of re-using MySQL, PostgreSQL, and so-on. These new feature-poor rewrites can take on many challenges that are harder for the big relational guys, like stellar performance on huge datasets, and being truly distributed in nature.
11*43+456^2