Researchers Create Database-Hadoop Hybrid
ericatcw writes "'NoSQL' alternatives such as Hadoop and MapReduce may be uber-cheap and scalable, but they remain slower and clumsier to use than relational databases, say some. Now, researchers at Yale University have created a database-Hadoop hybrid that they say offers the best of both worlds: fast performance and the ability to scale out near-indefinitely. HadoopDB was built using PostGreSQL, though MySQL has also successfully been swapped in, according to Yale computer science professor Daniel Abadi, whose students built this prototype."
Uber-cheap is not a word, and it doesn't even make sense because you're saying it's "above cheap". Stop making up stupid shit.
It's PostgreSQL... but I sympathize with the mixed case confusion and refer you to this Postgres vs PostgreSQL permathread.
If both the performance and the scalability are as good as described, I can safely say that this is the most important thing of the decade, and not only for DBMS.
Handling large portions of data would get cheaper by an order of magnitude at least and scaling out would be way cheaper than now as well. I do hope it's true.
I thought Essbase was supposed to be one of the best databases for managing too much information. Is this supposed to be an alternative, or to act as something in between Essbase and a MySQL server?
It won't deliver. In the meantime, for those of us living and working in the real world, hard drives will get bigger and faster, file systems will get better, and SSDs will start to shit all over spinning platters.
The grad students do all the work, and the professor takes all the credit. Anyone can come up with ideas; the real work is in actually getting things done. This is the reason I stopped grad school with my MS even though I LOVE computer science, more than anyone I've ever met.
Scalability is one thing, but what we appreciate in SQL-free databases is also that they don't require SQL.
When what we want is just to retrieve a record, calling get(id) is way easier and more secure than building an SQL statement, and way cheaper than using an ORM.
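To make the comparison concrete, here is a minimal sketch of both styles of lookup. It makes several assumptions: a plain Python dict stands in for a key-value store such as Tokyo Cabinet (this is not its real API), the `users` table and the helper names `kv_get` and `sql_get` are made up for illustration, and the SQL side uses an in-memory SQLite database with a parameterized query rather than a hand-built statement.

```python
import sqlite3

# Key-value style: a plain dict stands in for a real key-value store
# (hypothetical stand-in, not the Tokyo Cabinet API).
kv_store = {"42": "alice"}

def kv_get(record_id):
    """Fetch a record by key; returns None when the key is absent."""
    return kv_store.get(record_id)

# SQL style: an in-memory SQLite table holding the same single record.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("42", "alice"))

def sql_get(record_id):
    """Fetch the same record with a parameterized query.

    The id is passed as a bound parameter, so it is never spliced
    into the SQL text.
    """
    row = conn.execute(
        "SELECT name FROM users WHERE id = ?", (record_id,)
    ).fetchone()
    return row[0] if row else None
```

Both helpers return the same record for the same id. The SQL version costs a few more lines, and with bound parameters the id is treated purely as data.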
The Tokyo Cabinet API is absolutely excellent in this regard. And there's no need to learn yet another domain-specific language like SQL, just use the language you use for the rest of the app.
Now, SQL-zealots would troll "but how would you do with ?".
And yes, for complex requests as in data mining, SQL and XPath make sense. For people who aren't developers, SQL makes sense as well. For interoperability with 3rd-party apps, SQL is also useful, just as FAT is still useful today in order to share filesystems between operating systems.
But for the rest of us, SQL is cumbersome. Databases like MongoDB let you achieve similar results in a more natural way instead of forcing you to learn SQL and to rethink everything in a tabular way.
If it delivers, it will change a lot. Not for your average blogger with $10 hosting, WordPress, and all his 100 readers, but for all the folks whose sites are successful enough to outgrow what a single DB server can deliver. Right now you have to work really, really hard to make it all work with replication, as pretty much no free CMS offers data sharding. With this you won't have to. Just get a DB cluster (as a service) that works out of the box with little or no modification to the software you are using. The wall that sites currently hit, at the point where they have to invest loads of money to continue growing, will be gone.
No offense to the creators (well, maybe some offense) but why the heck would you want to put MySQL in where PostgreSQL already was? That's like taking out your star quarterback and putting in, well, me!
"!"
I can't say I'm looking forward to bigger, faster, shit-covered platters...but hey. Who am I to stand in the way of progress?
THL phish sticks
(In my best Special Ed impersonation)
Yaaaaaay, now we can scale out Hadoop! Yaaaaay! Yaaaay Hadoop! Yaaaaay!
We might create the software intending it to do and be used in one way, but how it will actually be used is determined by the users. Postgre and MySQL don't carry any intrinsic values, only the values which their users discover and, well, use. Without users they have no good or bad features.
So why is it that people feel the need to rally around or defend them? After all, only the developers who have done the work are capable of understanding the snips and criticism leveled against them, and these are the people who have given their work away, to you and me.
MySQL excels at some things. Postgre also excels at some things. If users feel there is too much overlap then they can work to reproduce these features in a single tool, such as Postgre should they feel it has more utility. But to discount a tool many people find useful shows a core misunderstanding of what it is that determines the software's value.
Postgre cannot be better than MySQL; it can only provide varying degrees of value. And that value is determined by the user.
Quack, quack.
There are also two Hadoop subprojects that either support SQL or will shortly. They both translate SQL queries into map/reduce programs. They are:
http://hadoop.apache.org/pig/
http://hadoop.apache.org/hive/
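As a rough illustration of what "translating a query into map/reduce" means, here is a toy, single-process sketch of how a query like `SELECT dept, COUNT(*) FROM employees GROUP BY dept` could be expressed as a map step, a shuffle, and a reduce step. It is not how Pig or Hive actually compile queries; the table, the column name `dept`, and all function names are assumptions made up for the example.

```python
from collections import defaultdict

def map_phase(rows):
    """Map: emit a (key, 1) pair per row, keyed by the GROUP BY column."""
    for row in rows:
        yield row["dept"], 1

def shuffle(pairs):
    """Shuffle: collect all values emitted for the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the values per key, which gives COUNT(*) per group."""
    return {key: sum(values) for key, values in groups.items()}

def count_by_dept(rows):
    """Run the three phases in sequence on an in-memory list of rows."""
    return reduce_phase(shuffle(map_phase(rows)))
```

In a real Hadoop job the map and reduce steps would run on many machines and the framework would handle the shuffle; the sketch only shows the shape of the computation.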
Uber and Super both mean "above", knucklehead. Same proto-indo-european root, in fact.
Today may just be the day that you learn that a word may have more than one definition. In fact, the word you use, "root", refers not just to a word's origin; it can also refer to a very important part of plants. Do not squander this opportunity. It will open an entire new world of linguistics. I have nothing but hope for the grand future that awaits you and your once-tunneled view of the English language.
I am the richest astronaut ever to win the superbowl.
And how much more than "Free" does it cost?
Well.. maybe. Or Maybe not. But Definitely not sort of.
Not sure what you were talking about, but hadoop and postgres are open source. Unless they're stupid, they wouldn't make the resulting product closed source.
I'm not going to make the whole free software pitch here, but let's just say I believe in the superiority of the development process and the end product, based on my experience developing and using software.
I have no confidence in Intersystems Cache's long term survival.
Well.. maybe. Or Maybe not. But Definitely not sort of.
"Cheap 2.0".
But for the rest of us....
Sorry, but I could not help thinking of this line from "Life of Brian":
But apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, the fresh-water system, and public health, what have the Romans ever done for us?
More seriously, if the main example of the trouble with SQL is that you want to be able to find a record by id with fewer keystrokes, I do not see how this can be so much of a problem.
Why can't