Slashdot Mirror


Swarm — a New Approach To Distributed Computation

An anonymous reader writes "Ian Clarke, creator of Freenet, has been working on a new open source project called Swarm. The concept is to allow a computer program to be distributed across multiple computers in a manner almost completely transparent to the programmer. The system observes the program executing and figures out how the workload should be distributed for maximum efficiency. Swarm is implemented in Scala. It's at an early-prototype stage, and Ian has created a good 36-minute video explaining the concept and the current implementation."

25 of 80 comments

  1. Earlier by WetCat · · Score: 4, Interesting

    .. was Mosix http://www.mosix.org/
    It allowed Mosix-running Linux computers to distribute their load across other connected Mosix-running Linux computers.
    Processes migrate to other nodes transparently. No programming changes were needed.

    1. Re:Earlier by K. S. Kyosuke · · Score: 3, Informative

      And this one works at the application level, across various OSes. No computer repurposing and reinstalling is needed.

      --
      Ezekiel 23:20
    2. Re:Earlier by mrmeval · · Score: 3, Informative

      It's not free software so you can't use it except for personal or educational use. Open Mosix died
      http://openmosix.sourceforge.net/

      --
      I'd go on a Vegan diet but the delivery time from Vega is too long. --brownkitty
  2. Name... Neat idea though by Anubis350 · · Score: 2, Interesting

    At first I thought they were talking about Swarm, an "attempt to gather up many different kinds of models that go under the heading of "agent-based modeling" and create a common language and programming approach." that I've worked with before. I'm surprised they went with the name of an established toolkit in another aspect of programming. Still, looks like a cool tool, another layer of abstraction to make distributed computing easier might make it more attractive to those that don't use it much at the moment.

    --
    "goodbye and hello, as always" ~Prince Corwin, from Zelazny's Amber series
    1. Re:Name... Neat idea though by hazem · · Score: 2, Insightful

      I'm just getting into Agent Based Modeling myself and I had exactly the same thought... why would they use the name of an established tool; especially when there are similarities in the concepts. This seems like a recipe for confusion.

      A good first check when starting an open project is to check proposedprojectname.org and see if there's anything active there. Or even just Google it - if another project shows up near the top with the same name, it's probably a good idea to pick another name.

      I'm sure there are plenty of synonyms for "swarm" that capture the idea, if not an alternate spelling.

      But like you said, it does sound like an interesting project.

    2. Re:Name... Neat idea though by Anonymous Coward · · Score: 2, Informative

      From the FAQ:

      Did you know that there are other projects called "Swarm"?

      Yes we did. We do not believe that 100% uniqueness is a prerequisite for a project name. Remember that the word "swarm" has been in use for over a thousand years, it wasn't invented by any software project!

      Our opinion (born of painful past experience) is that it is better to have a good non-unique name than a bad unique name. Of course, if someone can suggest a good unique name, we'll give it serious consideration.

  3. Re:This'll be great for botnets by K. S. Kyosuke · · Score: 2, Funny

    I'm sure you would notice an apparently suspicious huge JVM process eating your CPU time. :]

    --
    Ezekiel 23:20
  4. Re:This'll be great for botnets by Darkness404 · · Score: 4, Insightful

    You know though, most people don't ever check that. They think that over time Windows just "gets slow" because hardware "goes obsolete". So when that happens they think they have to buy a new computer.

    --
    Taxation is legalized theft, no more, no less.
  5. Obligatory by arcsimm · · Score: 4, Funny

    Imagine a Beowulf cluster of... err. Oh.

  6. looks intriguing by Trepidity · · Score: 5, Insightful

    The thing that's always killed this idea (along with automatic parallelization even on the same machine) is that the overhead of figuring out what's worth distributing, and the additional overhead from mistakes (accidentally distributing trivial computations), often swamps the gains from the multiple processors banging away on it simultaneously. Determining statically what's worth distributing is very hard, since solving it properly is undecidable (basically equivalent to the halting problem), and even solving it in a significant enough subset of cases to be useful has proved difficult. It looks like this project is monitoring dynamically to determine what to distribute, which seems likely to be more fruitful, although historically that approach has suffered from the overhead of the monitoring (like always running your code with debugging instrumentation turned on).
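
    The trade-off described above can be made concrete with a toy cost model. This is only an illustrative sketch, not how Swarm decides anything: the `Task` class, the `worth_distributing` function, and every number (bandwidth, latency, decision overhead) are hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of one unit of work."""
    compute_seconds: float  # estimated time to run the task locally
    payload_bytes: int      # data that must be shipped to a remote node


def worth_distributing(task, bandwidth_bytes_per_s=10_000_000,
                       latency_s=0.005, decision_overhead_s=0.0005):
    """Return True if the local CPU time freed by offloading the task
    exceeds the overhead spent on offloading it.

    Overhead = cost of deciding + one network round trip + payload transfer.
    All default values are made-up assumptions for illustration.
    """
    transfer_s = task.payload_bytes / bandwidth_bytes_per_s
    overhead_s = decision_overhead_s + 2 * latency_s + transfer_s
    return task.compute_seconds > overhead_s


trivial = Task(compute_seconds=0.0001, payload_bytes=1_000)
heavy = Task(compute_seconds=2.0, payload_bytes=1_000_000)

print(worth_distributing(trivial))  # False: overhead dwarfs the work
print(worth_distributing(heavy))    # True: the work dwarfs the overhead
```

    Even this toy version shows the problem: the estimate of `compute_seconds` has to come from somewhere, either static analysis or runtime monitoring, and both have a cost of their own.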

    I certainly hope he has a breakthrough vs. past approaches, or it could just be that advances in a lot of areas of technology have given him a better substrate on which to build things that naturally mitigates lots of the problems these things used to have (automatic parallelization research started probably ahead of its time, back in the 1970s, so that most academic stuff was killed off by the 1990s after no really knock-down results emerged). It's not entirely clear to me what the killer advance is, though. The particular variety of portable continuations? A good way of easily monitoring computations? Something that makes the data-dependency analysis particularly easy?

    1. Re:looks intriguing by djupedal · · Score: 2, Interesting

      > The thing that's always killed this idea (along with automatic parallelization even on the same machine) is that the overhead of figuring out what's worth distributing

      That kind of thinking is so 90's. Brute force data mining, as an example, means harvest it all and let target groups sort out what they want. It is a waste of time to 'decide'. That's like stopping to inspect every shovelful of ore as it comes out of the ground. All or nothing has been the default for some time now, and this is just another example.

    2. Re:looks intriguing by FlyingBishop · · Score: 3, Insightful

      Depending on how many cores you have access to, distributing trivial computations may not matter. If we ever start seeing 32 core desktop machines, for example, you start to get to the point where forking could create a realtime speedup even though in absolute terms you've wasted 5 times as many cycles.
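
      As a rough worked example of the figures above: if distributing inflates the total work fivefold but that work is spread over 32 cores, the wall-clock time can still drop. The sketch below assumes perfectly even load balancing and no serial portion, which is an idealization; `wall_clock_speedup` is a hypothetical helper name.

```python
def wall_clock_speedup(cores, work_inflation):
    """Speedup over a serial run when the total work grows by a factor of
    `work_inflation` but is spread perfectly evenly over `cores`.

    Assumes ideal load balancing and no serial portion.
    """
    return cores / work_inflation


speedup = wall_clock_speedup(cores=32, work_inflation=5)
print(speedup)  # 6.4
```

      Under those assumptions the program finishes 6.4 times sooner despite burning five times as many cycles in total.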

    3. Re:looks intriguing by david.given · · Score: 4, Interesting

      A friend of mine did a system like this about ten years ago --- hi, Iain! --- called Flit. It had a number of the same features, although using a custom language; it had some rather interesting concepts, such as asynchronous function calls that would return immediately, spawning a new thread, but return a future: a value whose value was not known yet. Accessing the value would cause the thread to be waited upon.
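
      The future concept described here exists in many mainstream languages today. As a minimal sketch (this is Python's standard `concurrent.futures` module, not Flit or Swarm; `slow_square` is a made-up stand-in for an expensive computation):

```python
from concurrent.futures import ThreadPoolExecutor
import time


def slow_square(n):
    """Stand-in for an expensive computation."""
    time.sleep(0.1)
    return n * n


executor = ThreadPoolExecutor(max_workers=2)

# submit() returns immediately with a Future; the work runs on another thread.
future = executor.submit(slow_square, 7)

# The caller is free to do other work here while the result is pending.

# result() blocks until the worker thread has produced the value.
value = future.result()
print(value)  # 49

executor.shutdown()
```

      The call site never manages the thread directly; it only waits at the point where the value is actually needed.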

      Unfortunately the killer problem that sunk Flit was that of distributed garbage collection. Collecting data over multiple machines is really, really hard, and he never found a usable approach to make it work. I was very disappointed to see that Swarm's garbage collection is still on the to-do list --- he doesn't appear to have started to think about it yet.

      I hope he can make Swarm work --- it's something that we could all definitely use. But there are fundamental theoretical problems that have to be solved first...

  7. Sounds good by cwire4 · · Score: 2, Interesting

    It sounds like a good idea, but I don't think the project is far enough along in this video to warrant a posting. Maybe he was using too trivial an example to be appreciated in the video, but his explicitly offloading the task to another computer doesn't appear to be very far beyond standard client-server models. If it were already automatically transporting processing between different nodes, it'd be much cooler, but that is not a trivial problem to solve. Deciding what should and what shouldn't be distributed at the application level will be extremely hard I imagine. If the project were farther along in its maturity I'd be much more interested.

  8. We have just witnessed the birth of a new meme... by Linker3000 · · Score: 3, Funny

    In Ian Clarke's Swarm, World "Hellos" you!

    --
    AT&ROFLMAO
  9. Re:This'll be great for botnets by NoYob · · Score: 2, Interesting

    Yeah. Visit a website with an applet and you get the JVM startup and it stays up and running even after you leave the website and visit websites that don't have applets. In other words, I probably wouldn't notice at first either and I'd be chugging along until I restarted my machine and saw the JVM pop-up again for no apparent reason. Other folks who have no idea wtf a JVM is would never notice.

    --
    It's NOT me! It's the meds! I'm on 1000mg of Fukitol.
  10. Re:This'll be great for botnets by Linker3000 · · Score: 2, Funny

    Nope, I already use OpenOffice!

    --
    AT&ROFLMAO
  11. It doesn't do much yet by svick · · Score: 4, Informative

    If I understand what he says correctly, it is something like this: Distributing computation is hard, really hard. It's so hard that nobody ever did it properly. But Swarm will change this! How? Well, we don't know yet, there are so many interesting problems we have to solve first. And you can help!

    1. Re:It doesn't do much yet by Anonymous Coward · · Score: 4, Insightful

      Mod parent up. This is exactly what Ian did with Freenet.

      He cobbled together an overly-simplistic prototype to address a set of very difficult unsolved problems in anonymous communication and then farmed out the actual real-world legwork on those problems to interested open source developers while Ian himself effectively abandoned Freenet for other (paying) gigs. To this day he is credited, somewhat ironically, as "the creator of Freenet," and a decade later the Freenet project still hasn't solved the problems it set out to solve, even after changing the fundamental network architecture several times.

      Great career strategy though. Get credit for the shiny things and pass the shame of failure off on others. He's CEO material all the way.

    2. Re:It doesn't do much yet by Hobbex · · Score: 2, Interesting

      I don't think this characterization is fair, and I think you would have a hard time finding somebody who actually worked on Freenet to agree with you. Ian's original technical ideas for Freenet - as well as his vision - are very much still a big part of the architecture, and he could never be said to have abandoned it. In fact, time has vindicated many of his ideas to a far greater extent than I expected when we started working with them. You are right that the project has not yet solved the problems it set out to solve - but since it has wildly high ambitions, that should hardly be surprising. I think it has made a positive contribution all the same, if only to our understanding of many of the issues involved.

      It is true that the press has had a tendency to paint Ian as the lone father of the project, but that is just the way the press works, and I have never seen Ian taking credit for other people's work. And, to be honest, after you have done it a few times, you start realizing that dealing with the press isn't nearly as fun as it is cracked up to be, and that Ian has a knack for communication that most nerds, myself included, do not. I think Freenet has been very well served by Ian's ability to effectively communicate its goals and gain attention -- among other things it has allowed several coders, of whom I was the first but not the last, to work full time for the project for certain periods. That said, I was a bit disappointed when the NYTimes ran a cover story on a presentation Ian and I held at Defcon and forgot to mention me at all, but I got over it :-).

  12. Re:This'll be great for botnets by jeisner · · Score: 2, Informative

    That's true to some degree. But computers do slow down as they age. Components damaged by the constant heating cause more errors and therefore require retransmission or error correction, slowing things down.

    My Dell desktop from 1999 has been running like the wind again since last week, when I reverted it to its 2002 state from backup tape. It goes superfast now that it's virus-free, off the network, and running old apps on Windows 98.

    I was only trying to recover some old files before junking an unusable machine, but I may keep it around now as a non-networked machine for the kids.

  13. Alpha code... by Linker3000 · · Score: 3, Funny

    Computer 1: MOV AL...what? No more? MOV AL what? I need a value! WTF am I supposed to do with that!?

    Computer 2: 09? Nine? Who gave me nine on its own. That doesn't make any sense! Jeez! Hey, anyone out there missing some data?

    Computer 3: Not me, I'm pushing the registers onto the stack

    Computer 4: Nope, I've got an INT

    Computer 5: Oh, hey, it could be me - does NOP have a value? No? Sorry, my bad!

    Computer 1: Nine - yeah, nine - Well, I could stick that in AL if no-one else wants it!?

    Computer 3: Oh, heck, give it to 1. I've just got a POP instruction so I am going to obliterate it anyway.....

    --
    AT&ROFLMAO
  14. Re:This'll be great for botnets by BikeHelmet · · Score: 3, Insightful

    In my experience, Java is not the reason people buy new computers.

    Their computers slow down from viruses, or virus-like Antivirus, and then they think they need to upgrade.

    Lately commercially made programs (AIM? Windows Live stuff? Most printer software? Most shareware?) seem to consume as much memory as a whole JVM, despite being written in C. This has led me to conclude that companies really don't give a shit how much memory their software uses. This is quite ironically pushing Java closer and closer to C in actual memory and CPU usage.

    Disclaimer: I know C is amazing when used properly - but it seems like only small FOSS projects and apps destined for phones have any sort of optimization work done. I've seen daemons use 200KB on a tiny linux handheld, but multiple megabytes is the norm on any desktop.

  15. Re:This'll be great for botnets by ScrewMaster · · Score: 2, Interesting

    > That's true to some degree. But computers do slow down as they age. Components damaged by the constant heating cause more errors and therefore require retransmission or error correction, slowing things down.

    No, not really. PCs are nowhere near that sophisticated. A high-speed CPU bus is not like a DSL connection. Pretty much it has to work near-perfectly, or it's blue-screen city.

    For example, I have a couple of Athlon 1.4 GHz machines that are running just as fast as the day I built them, and they've never been turned off. Also have an old Thinkpad R41 ... still as fast as it ever was (faster, actually ... I have it running a stripped-down version of XP.) If you have a motherboard or PC that is getting errors due to heating what you're going to see are crashes and lockups, not slowdowns. Personal computers are not mainframes or minicomputers: even with ECC memory they are not fault tolerant to any significant degree, and frankly I think it's a wonder they work as well as they do (Windows issues aside.) When a component starts generating errors your average PC just breaks ... if you're lucky it's just the faulty subsystem, but if you're not the machine is toast.

    People's machines slow down because a. they never defrag their hard drives and b. they get infected. It just takes a single badly written piece of malware to turn an otherwise decent machine into a 386, yet users frequently blame the hardware for being too old, as if that somehow explains poor performance. Many people are completely amazed when I clean up their system for them and pack the hard disk. "Wow, it's like a whole new computer!" No, dimbulb, it's the same computer you've always had, you were just too lazy to give it even minimum maintenance. I'm glad I'm not in IT: it's a lot like being a doctor. You have to deal with people who have no ability to think rationally about their problems, and even when you give them good advice they never follow it anyway.

    --
    The higher the technology, the sharper that two-edged sword.
  16. Another Earlier - ERLANG! by mcrbids · · Score: 4, Informative

    Erlang apparently gets it right. Scales smoothly from single core to multi-core to multi-server in a near-linear fashion. Astonishingly reliable, having achieved nine nines of uptime - much less than a second of downtime - in a year. Purposely designed to mitigate shared memory problems. Built for hot-switchover - you can upgrade Erlang programs without closing them first!
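
    For scale, the "nine nines" figure can be converted into seconds of downtime. This is plain arithmetic, not a measurement of any Erlang system; `downtime_seconds_per_year` is a hypothetical helper, and the year is taken as 365 days.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000, ignoring leap years


def downtime_seconds_per_year(nines):
    """Allowed downtime per year for an availability of `nines` nines,
    e.g. nines=3 means 99.9% availability."""
    unavailability = 10 ** -nines
    return SECONDS_PER_YEAR * unavailability


print(downtime_seconds_per_year(9))  # nine nines: 99.9999999% availability
```

    Nine nines works out to roughly 0.03 seconds per year, which is consistent with "much less than a second".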

    In just about every conceivable way, Erlang is the right choice for high-end multi-core multi-system clustered application development. I have a large-stack, clustered application written in PHP. While it works well, there are limits to what we can do within a single process - a problem that's likely to become worse over time as needs continue to scale up. If I were to do it all over again, I'd take a good, hard, look at Erlang.

    --
    I have no problem with your religion until you decide it's reason to deprive others of the truth.