MapReduce Goes Commercial, Integrated With SQL
CurtMonash writes "MapReduce sits at the heart of Google's data processing — and Yahoo's, Facebook's and LinkedIn's as well. But it's been highly controversial, due to an apparent conflict with standard data warehousing common sense. Now two data warehouse DBMS vendors, Greenplum and Aster Data, have announced the integration of MapReduce into their SQL database managers. I think MapReduce could give a major boost to high-end analytics, specifically to applications in three areas: 1) Text tokenization, indexing, and search; 2) Creation of other kinds of data structures (e.g., graphs); and 3) Data mining and machine learning. (Data transformation may belong on that list as well.) All these areas could yield better results if there were better performance, and MapReduce offers the possibility of major processing speed-ups."
Data warehousing (here I mean databases stored in column order for faster queries, etc.) may get a lift from using map reduce over server clusters. This would get away from using relational databases for massive data stores for problems where you need to sweep through a lot of data, collecting specific results.
I think that it is interesting, useful, and cool that Yahoo is supporting the open source Nutch system, that implements map reduce APIs for a few languages - makes it easier to experiment with map reduce on a budget.
http://www.databasecolumn.com/2008/01/mapreduce-a-major-step-back.html
then they embrace it
Intron: the portion of DNA which expresses nothing useful.
In functional programming map and reduce is very very old knowledge (and, yup, functional programming has its use and, yes, there are some very good and very successful programs written using functional languages).
What's next? A product called DepthFirstSearch (notice the uber broken camel case for a product name) that has nothing to do with the depth-first search algorithm?
Google? Allo?
I don't think you can credit Bjarne with "compiled code is faster than interpreted code" (or the 21st century version: "compilers can perform better optimizations that JIT translators").
C++ happens to be the most popular fully compiled language, having edged Fortran out of that position some time near the end of the last century.
Back in the early '80s, when he was coming up with C++, the big Fortran savants were saying stuff like "Fortran is bigger than ever. There are more than X million Fortran programmers. Everywhere I look there has been an uprising... a lot of teaching was going to Pascal, but more are teaching Fortran again. There has been a backlash."
----
And that's not the only thing C++ has in common with Fortran, either.
If Java ( or Pyhton etc. for that matter ) were fast enough why did Google choose C++ to build their insanely fast search engine.
Because their developers knew it better? Because it had better 64-bit support when they started it? Because full GC's weren't compatible with their use case and IBM's parallel GC VM hadn't been released yet? Because they could get and modify all the source to all the libraries?
I don't know the answer, but there are a lot of possibilities besides speed. You're jumping to an awfully big conclusion there, Mr. Coward.
E pluribus unum