Slashdot Mirror


Collaborative Map-Reduce In the Browser

igrigorik writes "The generality and simplicity of Google's Map-Reduce is what makes it such a powerful tool. However, what if instead of using proprietary protocols we could crowd-source the CPU power of millions of users online every day? Javascript is the most widely deployed language — every browser can run it — and we could use it to push the job to the client. Then, all we would need is a browser and an HTTP server to power our self-assembling supercomputer (proof of concept + code). Imagine if all it took to join a compute job was to open a URL."

2 of 188 comments (clear)

  1. Random Thoughts by AKAImBatman · · Score: 5, Interesting

    Two comments:

    1. He places the map/emit/reduce functions in the page itself. This is unnecessary. Since Javascript can easily be passed around in text form, the packet that initializes the job can pass a map/emit/reduce function to run. e.g.:

    var myfunc = eval("(function() { /*do stuff*/ })");

    In fact, the entire architecture would work more smoothly using AJAX with either JSON or XML rather than passing the data around as HTML content. As a bonus, new types of jobs can be injected into the compute cluster at any time.

    2. Both Gears and HTML5 have background threads for this sort of thing. Since abusing the primary thread tends to lock the browser, it's much better to make use of one of these facilities whenever possible. Especially since multithreading appears to be well supported by the next batch of browser releases.

    (As an aside, I realize this is just a proof of concept. I'm merely adding my 2 cents worth on a realistic implementation. ;-))

  2. Pay Me by Doc+Ruby · · Score: 4, Interesting

    If there were a couple-few or more orgs competing to use my extra cycles, outbidding each other with money in my account buying my cycles, I might trust them to control those extra cycles. If they sold time on their distributed supercomputer, they'd have money to pay me.

    As a variation, I wouldn't be surprised to see Google distribute its own computing load onto the browsers creating that load.

    Though both models raise the question of how to protect that distributed computing from being attacked by someone hostile, poisoning the results to damage the central controller's value from it (or its core business).

    --

    --
    make install -not war