Slashdot Mirror


MapReduce Goes Commercial, Integrated With SQL

CurtMonash writes "MapReduce sits at the heart of Google's data processing — and Yahoo's, Facebook's and LinkedIn's as well. But it's been highly controversial, due to an apparent conflict with standard data warehousing common sense. Now two data warehouse DBMS vendors, Greenplum and Aster Data, have announced the integration of MapReduce into their SQL database managers. I think MapReduce could give a major boost to high-end analytics, specifically to applications in three areas: 1) Text tokenization, indexing, and search; 2) Creation of other kinds of data structures (e.g., graphs); and 3) Data mining and machine learning. (Data transformation may belong on that list as well.) All these areas could yield better results if there were better performance, and MapReduce offers the possibility of major processing speed-ups."

1 of 99 comments

  1. Re:Um, first question: WTF is MapReduce? by moderatorrater · Score: 0, Troll

    MapReduce: a framework for breaking a large job into smaller pieces that run in parallel on many servers. As I understand it, Map is the function each server runs over its own slice of the data to produce intermediate results, and Reduce is the function that combines those intermediate results into the final answer. Take Google search as a rough analogy: when you enter a query, it gets sent to several different servers. Each server searches its subset of the web for the term and ranks its matches; that is the Map step. A central server then merges those partial results and returns them to the user; that is the Reduce step. The framework itself handles splitting the data across servers, so that no two servers process the same piece.
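
    For anyone who wants to see the shape of it, here is a minimal single-process sketch of the map / shuffle / reduce pattern, building a tiny inverted index (word -> documents containing it). The names map_fn, reduce_fn and map_reduce and the sample documents are made up for illustration; this is not the API of Google's system or of any vendor product, and a real framework would spread the map and reduce calls across many machines.

```python
from collections import defaultdict


def map_fn(doc_id, text):
    """Map: emit one (word, doc_id) pair per token in a document."""
    for word in text.lower().split():
        yield word, doc_id


def reduce_fn(word, doc_ids):
    """Reduce: collapse all values seen for one key into a sorted posting list."""
    return word, sorted(set(doc_ids))


def map_reduce(inputs, mapper, reducer):
    """Hypothetical driver: run map, group by key (shuffle), then reduce."""
    groups = defaultdict(list)
    for key, value in inputs:
        # Map phase: each input record is processed independently,
        # which is what makes this step easy to parallelize.
        for out_key, out_value in mapper(key, value):
            # Shuffle: gather every value emitted under the same key.
            groups[out_key].append(out_value)
    # Reduce phase: one call per distinct key.
    return dict(reducer(k, vs) for k, vs in groups.items())


# Illustrative input: (doc_id, text) pairs.
docs = [
    (1, "map reduce at scale"),
    (2, "reduce the data"),
    (3, "map the data"),
]

index = map_reduce(docs, map_fn, reduce_fn)
print(index["map"])     # documents containing "map"
print(index["reduce"])  # documents containing "reduce"
```

    The point of the split is that map_fn never needs to see more than one record and reduce_fn never needs to see more than one key, so both phases can be farmed out to as many servers as you have.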