Slashdot Mirror


Hosting Problems For distributed.net

Yoda2 writes "I've always found the distributed.net client to be a scientific, practical use for my spare CPU cycles. Unfortunately, it looks like they lost their hosting and need some help. The complete story is available on their main page but I've included a snippet with their needs below: 'Our typical bandwidth usage is 3Mb/s, and reliable uptime is of course essential. Please e-mail dbaker@distributed.net if you think you may be able to help us in this area.' As they are already having hosting problems, I hate to /. them, but their site is copyrighted so I didn't copy the entire story. Please help if you can." Before there was SETI@Home, Distributed.net was around - hopefully you can still join the team.

23 of 210 comments

  1. Distributed hosting? by gnovos · · Score: 5, Interesting

    Maybe they should go in for distributed hosting, like say one machine that just houses the IP address and a few thousand mirrors that the requests can be directed to as they come in. Not only is it a project that is just ASKING to be performed by distributed.net, but if they make some catchy point and click (i.e. EASY to use) clients that anyone with a large following can use, we might see the end of such things as Slashdot subscriptions and a resurgence of the "community" feel of the web.

    --
    "Your superior intellect is no match for our puny weapons!"
    1. Re:Distributed hosting? by BovineOne · · Score: 3, Informative

      Our network already uses a somewhat distributed model to spread out bandwidth demand as best as we can. You can see a bit of it if you look at our Proxy Status page at http://n0cgi.distributed.net/rc5-proxyinfo.html

      The servers listed are in different DNS rotations, grouped roughly into geographically named groups (which try to take general network topology/connectivity into account). The servers listed there (known as "fullservers") handle all of the data communication needs requested by clients, and the fullservers in turn keep in contact with the "keymaster". The keymaster is the server responsible for the coordination of unique work between all of the fullservers and assigning out large regions of keyspace to the fullservers (which in turn split up the regions and redistribute to clients).

      The hardware that we had hosted at Insync/TexasNet was actually 3 machines which together served several roles: our keymaster, one of our dns secondaries, our irc network hub, one of our three web content mirrors, and our ftp software distribution mirror (for actual client downloads).

      It's unfortunate that the change in management at Insync/TexasNet caused them to want to re-evaluate all of the free-loading machines that were receiving donated services (there were apparently several others besides us) and cut off anyone who wasn't paying. Regardless, it's a tough economy and companies that want to survive have to look at where their costs are going and do their best to cut spending.

      --
      Don't waste those cycles! Put them to use! http://www.distributed.net/
  2. "I hate to /. them but..."? by flipflapflopflup · · Score: 5, Funny

    You've now got 10,000 readers hovering over the link, "Ooh, should I, shouldn't I?", then thinking f**k it and clicking anyway.

    A slow, painful, prolonged, /.'ing ;o)

  3. Stopping three quarters of the way by Soft · · Score: 3, Informative

    The RC5-64 challenge is currently at 73%, moving fast. Can you imagine the project shutting down just now?

  4. Re:Suggestion by hkhanna · · Score: 5, Informative

    No, because the distributed.net client needs to communicate on its own port in whatever internal protocol it uses. That's what causes the bandwidth usage, not the downloading of the client, if that's what you think.

    You can't put your own server software on sourceforge's servers, at least not to my knowledge, so all sourceforge would be good for is hosting the client downloads...which it might actually already do. Hope that answers your question.
    Hargun

    --

    Think nothing is impossible? Try slamming a revolving door.
  5. Multiple problems by Loki_1929 · · Score: 5, Insightful

    There are numerous things you just couldn't "distribute." The keys have to be served from somewhere, they must be tracked in real-time from somewhere, and they must be accepted/processed somewhere. Stats must be compiled and then put into a single database. To distribute this to multiple computers would cause the amount of bandwidth used to rise to an extreme level, far beyond what it is now. (ie. send out the info, let each node process it, receive the data from each node, hope to Christ it's right)

    Next, the integrity of the project gets called into question the moment you begin allowing clients to check processed blocks. The number of false positives could easily shoot through the roof. Also, a computer with bad memory or simply running a faulty OS (such as Win9x/ME) could overlook a true positive, thereby virtually obliterating the project (ie. "we're at 100% completion with no result, guess we start over?")

    As stated above, stats would be impossible to do in this manner, and the same applies for key distribution. One could argue that the total keys be distributed among thousands of nodes and handed out from there, but you create more problems than you solve. You still need a centralized management location to keep track of keys that have or have not been tested. Imagine a node going offline permanently or simply losing the keys it was handed. Suddenly, a large block of keys is missing. As it stands now, the keymaster simply re-issues the keys to someone else after a couple of weeks of no response from the client it sent the original blocks to. Under a distributed format, the keymaster would have to keep track of which keys went to which key distributor, which of those came back, which of those need to be redistributed, where they... (you get the message.)
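
    The re-issue behaviour described above could be sketched roughly as follows. This is only an illustration, not distributed.net's actual keymaster code: the `BlockTracker` class, its method names, and the exact 14-day timeout are all assumptions made for the example.

```python
import time
from typing import Dict, Iterable, List, Optional, Set, Tuple

# Assumed value for "a couple of weeks"; the real timeout is not stated in the thread.
REISSUE_AFTER_SECONDS = 14 * 24 * 3600


class BlockTracker:
    """Hypothetical central bookkeeping for which key blocks are out with which client."""

    def __init__(self, blocks: Iterable[int]) -> None:
        self.pending: List[int] = list(blocks)           # not yet issued, or reclaimed
        self.issued: Dict[int, Tuple[str, float]] = {}   # block -> (client, time issued)
        self.done: Set[int] = set()                      # blocks reported as checked

    def issue(self, client: str, now: Optional[float] = None) -> Optional[int]:
        """Hand the next available block to `client`, reclaiming stale blocks first."""
        now = time.time() if now is None else now
        self._reclaim_expired(now)
        if not self.pending:
            return None
        block = self.pending.pop(0)
        self.issued[block] = (client, now)
        return block

    def complete(self, block: int) -> None:
        """Record a returned block, even if it had already been reclaimed for re-issue."""
        self.issued.pop(block, None)
        if block in self.pending:
            self.pending.remove(block)
        self.done.add(block)

    def _reclaim_expired(self, now: float) -> None:
        """Move blocks whose client has been silent too long back onto the pending list."""
        expired = [b for b, (_, t) in self.issued.items()
                   if now - t >= REISSUE_AFTER_SECONDS]
        for block in expired:
            del self.issued[block]
            self.pending.append(block)


tracker = BlockTracker(range(3))
a = tracker.issue("alice", now=0)    # alice takes block 0 and then vanishes
b = tracker.issue("bob", now=0)
tracker.complete(b)
c = tracker.issue("carol", now=1)
d = tracker.issue("dave", now=REISSUE_AFTER_SECONDS)  # alice's block is handed out again
```

    With one central tracker this is a single table. Adding intermediate key distributors means each of them needs the same bookkeeping, plus the central server tracking the distributors themselves.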

    Next you run into another problem of integrity. What's to stop each distributed keymaster from claiming its own client is the one that completed all blocks submitted to it? Consider this example: the central keymaster sends out 200,000 blocks of keys to keymaster node 101. Keymaster node 101 distributes these keys to a bunch of clients which process the blocks, then send them back to keymaster node 101. Keymaster node 101, which has been modded slightly, then modifies each data block, changing the user id to that of the keymaster's owner, thereby making it appear that any block coming back from keymaster 101 was processed by keymaster 101. It might be easy to spot, but then how do you find out who to give credit to?

    The webpage doesn't attract the majority of the bandwidth; the projects do. Distributing the projects would be disastrous, as many have already tried taking advantage of the current system to increase their block yields through modded clients. Luckily, this is easy to spot for now. Under a distributed system, this would be next to impossible. All this, and I've yet to make mention of the fact that the code would have to be completely re-written to work alongside a custom P2P application, which would add months of development to a project that probably only has weeks or months left in it.

    In short, someone host the damn thing, k? :)

    --
    -- "Government is the great fiction through which everybody endeavors to live at the expense of everybody else."
  6. Aliens, crypto or cancer - what's your choice? by crudeboy · · Score: 3, Interesting
    I think the use of spare cpu cycles is an excellent way to support science, but...
    For some time the only one around was seti@home which analyzes noise from space, I think, in search of alien lifeforms, then there's distributed.net doing crypto and math stuff, (correct me if I'm wrong). And then there's people like Intel running medical research in areas like cancer and Alzheimer's.

    I don't know about you, but to me medical research feels somewhat more beneficial to humanity than the search for aliens. Don't get me wrong, I'm not saying that the work done by seti and distributed isn't important or shouldn't be done, just that there's other research that might be more worthwhile supporting.

    That's just my opinion, but if you feel the same way, check out this site.

    1. Re:Aliens, crypto or cancer - what's your choice? by Sircus · · Score: 4, Informative

      You're wrong, so I'll correct you :-)

      d.net was around a long time before SETI@home - I've personally been running the client since 1997. SETI@home launched on May 13, 1999 (though they were fundraising and doing development for a couple of years before that).

      I'm personally strongly interested in cryptography for various reasons, so d.net gets my processor time. I seem to recall various people have concerns about how exactly the cancer project will use the eventual data it collects - i.e. whether the products produced as a result of the project will be commercially exploited - they don't want companies just using this large distributed network to make a fast buck.

      --
      PenguiNet: the (shareware) Windows SSH client
    2. Re:Aliens, crypto or cancer - what's your choice? by Sircus · · Score: 5, Interesting

      I sell a commercial SSH client and dabble in cryptography as a hobby - so I guess I fall in to the first category. There are plenty of reasons to be interested in cryptography aside from the Ashcroft/FBI-mandated ones, though. My issue with the cancer stuff is that if these companies are going to make billions off some cure (and if they come up with a cure, they sure are), I'm of the opinion that *they* should be the ones putting the billions into the research, not costing my cycles/power. I wouldn't give my facilities away to any other commercial venture for free, why should the situation change because they want to make money off cancer patients?

      If the distributed cancer network weren't there, and if it's really performing a genuinely useful job for the companies, you can be sure they'd be investing the $x million required to just buy a supercomputer or three to do it for them. So the only difference I see the cancer project making is that it's saving huge pharmaceutical firms a few million dollars. The world's cryptographers, most of whom are academics (ignoring the NSA-employed ones for a minute) don't have the millions of dollars to throw around if d.net wasn't there - neither do the mathematicians interested in the results of d.net's other project, Optimal Golomb Rulers. As a result, I see d.net as making more of a difference than the cancer stuff.

      All that said, those are my reasons for running d.net - you've got your own reasons, and it's your own choice.

      --
      PenguiNet: the (shareware) Windows SSH client
    3. Re:Aliens, crypto or cancer - what's your choice? by mosch · · Score: 4, Insightful
      If you hold an interest in cryptography, then you should realize that d.net is an incredibly boring application. It does the cryptographic equivalent of proving that it's possible to count to a million, by ones. It's absolutely useless.

      If d.net did something interesting, like attempt to find an improved factoring algorithm, or to find a way to perform interesting analysis on ciphertext, then it would be useful. Right now though, it's a 100% useless application.

      Think for a moment about what d.net truly does, and tell me with a straight face that it's interesting to either a cryptologist or a cryptanalyst.

      If you want to help somebody with your spare cycles, you can help cure diseases or if you're so inclined, you can perform FFTs on random noise. Don't try to tell me that d.net helps anything though; you're kidding yourself if you think so.

  7. that's a lot of bandwidth by Trepidity · · Score: 4, Interesting

    A continuous three Megabits per second works out to somewhere just under a Terabyte a month. Not going to be cheap.
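
    The figure is easy to check. The sketch below assumes a 30-day month, decimal units (1 TB = 10**12 bytes), and a link that runs at the full rate the whole time; the function name is just for illustration.

```python
# Back-of-the-envelope check of the monthly transfer at a constant 3 Mbit/s.
BITS_PER_SECOND = 3_000_000
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30  # assumption: a 30-day month


def monthly_bytes(bits_per_second: int, days: int = DAYS_PER_MONTH) -> int:
    """Total bytes transferred at a constant bit rate over `days` days."""
    return bits_per_second * SECONDS_PER_DAY * days // 8


total = monthly_bytes(BITS_PER_SECOND)
print(f"{total / 10**12:.3f} TB per month")  # prints "0.972 TB per month"
```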

  8. Re:Dnet, is it useful ? by Graspee_Leemoor · · Score: 4, Informative

    "Cancer research? I've yet to see a viable distributed project for cancer research. By that, I mean an organized effort with real data, a complete and concise goal, and a clean method for reaching that goal. "

    http://members.ud.com/home.htm

    This is real research, worked on by United Devices, helped by the University of Oxford, Intel and the National Foundation for Cancer Research.

    It meets all your criteria- this is from their site:

    "The research centers on proteins that have been determined to be a possible target for cancer therapy. Through a process called "virtual screening", special analysis software will identify molecules that interact with these proteins, and will determine which of the molecular candidates has a high likelihood of being developed into a drug. The process is similar to finding the right key to open a special lock--by looking at millions upon millions of molecular keys."

    graspee

  9. Re:Dnet, is it useful ? by BovineOne · · Score: 4, Informative

    Because distributed.net is a purely volunteer project, many of its staff also have their paid day-time jobs working for United Devices (who are responsible for the THINK Cancer project). That includes myself, Nugget, Decibel, Moose, Moonwick

    --
    Don't waste those cycles! Put them to use! http://www.distributed.net/
  10. Re:Suggestion by BovineOne · · Score: 4, Informative
    Finding new hosting for our central "keymaster" is what the issue is. We have enough "fullservers" for serving the computational data to clients (See http://n0cgi.distributed.net/rc5-proxyinfo.html).

    FWIW, Our clients actually can speak a pure HTTP protocol for requesting data, allowing a simple /cgi-bin/rc5.cgi script handle direct serving, but the default communications mode is a more compact raw binary mode.

    --
    Don't waste those cycles! Put them to use! http://www.distributed.net/
  11. Re:Distributed viruses? by BovineOne · · Score: 3, Informative

    Client downloads are PGP signed http://http.distributed.net/pub/dcti/v2.8015/ and are served by machines that mirror it (via rsync over ssh) from a tightly controlled host, which is not one of the servers that actually publicly serve the files. Although the binaries are pre-compiled, the original source code is open for review at http://www.distributed.net/source/

    --
    Don't waste those cycles! Put them to use! http://www.distributed.net/
  12. Re:So 3Mb/s huh? by BovineOne · · Score: 4, Informative

    That figure is actually closer to the current average peak. We in fact currently have an ipfw bandwidth limit on the machine to limit it to 3Mbit/sec and it mostly stays under it. We just over-quoted that figure a little bit in our announcement so that there would be fewer concerns over some marginal potential growth and try to factor in some of the bandwidth peaks.

    --
    Don't waste those cycles! Put them to use! http://www.distributed.net/
  13. Re:keyservers? by BovineOne · · Score: 3, Informative
    Running our personal proxy for large teams (particularly if they are all at a single corporation or a single school) can indeed help, because it reduces some of the overhead of communications with each individual client. There is also some optimization done by the personal proxy to allow it to request larger blocks of work and partition it into smaller portions when it finally distributes to the actual clients.

    However, this doesn't reduce the bandwidth at the keymaster any further, since this sort of splitting is already also being done at a larger scale between the keymaster and fullservers (and the bandwidth issue is with the keymaster, not the fullservers).
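
    The repeated splitting at each level can be pictured with a small sketch. The `split_range` helper and the block sizes below are made up for illustration; they are not the sizes or the code the real keymaster, fullservers, or personal proxies use.

```python
from typing import Iterator, Tuple


def split_range(start: int, end: int, chunk: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) sub-ranges covering [start, end], at most `chunk` keys each."""
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    while start <= end:
        stop = min(start + chunk - 1, end)
        yield start, stop
        start = stop + 1


# Illustrative sizes only: one large region handed down by the keymaster...
region = (0, 2**20 - 1)
# ...split by a fullserver into blocks for personal proxies...
proxy_blocks = list(split_range(*region, chunk=2**16))
# ...and one of those split again into portions for individual clients.
client_blocks = list(split_range(*proxy_blocks[0], chunk=2**12))

print(len(proxy_blocks), len(client_blocks))  # prints "16 16"
```

    Each level only ever talks to the level directly above it, which is why adding a personal proxy saves traffic at the fullservers but not at the keymaster.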

    --
    Don't waste those cycles! Put them to use! http://www.distributed.net/
  14. Re:Issues Resolved? by BovineOne · · Score: 5, Informative

    Although United Devices is currently graciously hosting some of the displaced distributed.net hardware temporarily, they've indicated that they are not willing to do this long term (which is quite a reasonable decision, since it is a lot of bandwidth).

    Note that several of the distributed.net volunteer staff (including myself) do indeed work for United Devices during the day, and that our employment there began awhile ago (more than 15 months ago), so that partnership announcement is not really related.

    --
    Don't waste those cycles! Put them to use! http://www.distributed.net/
  15. Someone please explain... by karlm · · Score: 3, Insightful
    Finally I've got a good excuse for not carefully reading the article :-)

    Their site is popular enough that it would seem to be a good time to experiment with moving the http stuff to freenet, since it's only updated once per day. The people willing to download the dnet client would seem to be some of the most willing people to download the freenet client. Freenet is designed so that the slashdot effect actually increases reliability and speed of access for the commonly requested data. Distributed.net would seem to have reached a critical mass of readership in order to have reasonable reliability for its freenet page. You could have the client get your team and individual scores sent to it as part of the block submission confirmation.

    It would seem to me that they could arbitrarily reduce their bandwidth requirements by increasing the minimum size of keyspace portions they're handing out. It would seem that their project traffic would be (or could be made) the same for each work unit, regardless of the size of the work units. Bigger work units are really only a problem for clients that are turned off and on regularly. The client still only needs to keep track of current state (current key in the case of RC5), the final state of the work unit (last key to check for RC5) and the current checksum for the work unit. None of these change in memory requirements as you increase work unit sizes. 99% of the people don't know the work unit size anyway, so changing the work unit size won't cause many people to complain, particularly if it's necessary to keep dnet hosted.

    Unless I'm mistaken, the server really only needs to send the client a brief prefix identifying the message as a work unit, followed by "start" and "stop" points for the computation. For RC5, this would mean a 64-bit starting key and a 64-bit ending key. I haven't sat down and worked out the canonicalization scheme for GRs, but it seems that they are countable (in the combinatorics sense, not the kindergarten sense) and could be represented fairly compactly. The current minimum ruler length need not be sent, since you'd probably always want the client to send back the minimum ruler length in its work unit anyway. The client would need to send back a work unit identifier (this could be left out, but it's not strictly safe) and an MD5 sum of all of the computational results or some other way to compare results when they duplicate work units. (A certain percentage of the work units are actually sent to multiple clients in order to check that everyone is playing fairly.)
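
    A message like the one proposed above might look something like this. It is a sketch of the idea only, not the real distributed.net wire protocol: the 4-byte tag, the field order, the big-endian layout, and all function names are assumptions for the example.

```python
import hashlib
import struct
from typing import Tuple

# Hypothetical layout: 4-byte tag, 64-bit start key, 64-bit end key, big-endian, no padding.
WORK_UNIT_TAG = b"WU01"
WORK_UNIT_FORMAT = ">4sQQ"


def pack_work_unit(start_key: int, end_key: int) -> bytes:
    """Encode an inclusive RC5-style key range as a fixed-size message."""
    if not 0 <= start_key <= end_key < 2**64:
        raise ValueError("keys must satisfy 0 <= start <= end < 2**64")
    return struct.pack(WORK_UNIT_FORMAT, WORK_UNIT_TAG, start_key, end_key)


def unpack_work_unit(message: bytes) -> Tuple[int, int]:
    """Decode a message produced by pack_work_unit."""
    tag, start_key, end_key = struct.unpack(WORK_UNIT_FORMAT, message)
    if tag != WORK_UNIT_TAG:
        raise ValueError("not a work unit message")
    return start_key, end_key


def result_digest(start_key: int, end_key: int, results: bytes) -> str:
    """MD5 over the work unit bounds plus the raw results, for comparing duplicated units."""
    digest = hashlib.md5()
    digest.update(struct.pack(">QQ", start_key, end_key))
    digest.update(results)
    return digest.hexdigest()


message = pack_work_unit(0x1000, 0x1FFF)
print(len(message))  # prints 20: the size does not depend on how many keys the range covers
```

    Two clients given the same work unit should return the same digest, so the server only has to compare two short strings to spot a disagreement.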

    --
    Copyright Violation:"theft, piracy"::Anti-Trust Violation:"thermonuclear price terrorism"<-Overly dramatic language.
  16. Re:Dnet, is it useful ? by Sc00ter · · Score: 3, Insightful
    dnet cracks keys by brute force.. Here's 10 keys, try them, oh? they don't work, here, have 10 more? They don't work either? Damn, have some more.

    It does that with a ton of people until it finds the right key. It will eventually crack every crypto they throw at it, because it's only a matter of time.
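
    In code, the "here's a block of keys, try them" loop is about this simple. The cipher below is a toy stand-in (XOR against a SHA-256-derived keystream), not RC5, and the 16-bit keyspace, block size, and names are chosen only so the example finishes instantly.

```python
import hashlib
from typing import Optional


def toy_encrypt(key: int, plaintext: bytes) -> bytes:
    """Toy stand-in cipher (NOT RC5): XOR the plaintext with a keystream derived from the key."""
    stream = hashlib.sha256(key.to_bytes(8, "big")).digest()
    return bytes(p ^ s for p, s in zip(plaintext, stream))


def search_block(start: int, end: int, known_plaintext: bytes, ciphertext: bytes) -> Optional[int]:
    """Try every key in the inclusive range; return the matching key, or None."""
    for key in range(start, end + 1):
        if toy_encrypt(key, known_plaintext) == ciphertext:
            return key
    return None


SECRET_KEY = 0x2A7F            # hypothetical 16-bit key the "clients" are hunting for
KNOWN = b"known plaintext"     # brute force needs some plaintext to test each key against
CIPHERTEXT = toy_encrypt(SECRET_KEY, KNOWN)

BLOCK_SIZE = 4096
found = None
for block_start in range(0, 2**16, BLOCK_SIZE):   # each block could go to a different client
    found = search_block(block_start, block_start + BLOCK_SIZE - 1, KNOWN, CIPHERTEXT)
    if found is not None:
        break

print(hex(found))
```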


    Seti@home is searching for something that they don't even know if it's out there, and can you imagine the impact if they do find PROOF that there's life somewhere else? That's far more important than stupid crypto keys and such


    the UD cancer treatment, while iffy because it's probably set up to benefit a company, still has a HUGE impact on EVERYONE'S life.. I don't know anybody that either hasn't had cancer or a family member that has had cancer, and to find a cure!

  17. distributed.net was useful in the past by athmanb · · Score: 5, Interesting

    By proving that RC5-56 can be broken by simple home PCs (with an algorithm as simple as what you call "counting to a million by ones"), they IMHO did a large part to educate lawmakers that the age old U.S. export restrictions have to be overturned.
    And they succeeded in this.

    What I however don't understand is why they kept doing their cryptography projects afterwards. Proving that RC5-64 is breakable while you can buy 256 bit encryption freely is indeed just a stupid waste of CPU cycles and bandwidth.

    I'd like to see them discontinue RC5-64, and concentrate their work on OGR and maybe on other, new projects.

  18. Who cares? by athmanb · · Score: 3, Interesting

    Honestly.

    We all know that eventually, the key is going to be found, and some stupid message will be deciphered ("Congratulations on solving the 64 bit challenge. blablabla")

    Why waste trillions of CPU cycles and thousands of $ in bandwidth to find something out that we already know is true?

  19. Sorry, my CPU time is taken. by Fourier · · Score: 3, Funny

    You know, I would help out with all this distributed computing stuff, but my spare CPU cycles are all taken up running multiple instances of Progress Quest.