The Truth About SETI@Home
zealot writes "According to this article, the SETI@Home project is not using the most optimized clients available "just to brake the unit turn around" so that they can continue to receive various contributions. The authors are also demanding access to the client source (and asking to GPL it if possible), so the greatest performance may be obtained." It's an interesting point: They didn't figure on getting the response they did, and will sooner rather than later run out of blocks to be crunched. Yep: What happens if you hold a war and /everyone/ comes? Or a distributed program, I guess.
All of the evidence points to exactly the problem that the author describes. Seti@home needs to write a second client to handle the division of work and "graduate" some of the reliable, high performing volunteers into producing the units of work for others.
Their fundamental problem is that they only distributed the analysis portion. Now that the overall load has become unbalanced, they need to distribute one more piece of the workload.
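To make the "graduate the reliable volunteers" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the Volunteer record, the pick_splitters name, and the thresholds are hypothetical and have nothing to do with the real SETI@home server code.

```python
from dataclasses import dataclass


@dataclass
class Volunteer:
    # Hypothetical per-volunteer statistics kept by the project server.
    name: str
    units_returned: int
    units_rejected: int
    avg_hours_per_unit: float


def pick_splitters(volunteers, min_units=100, max_reject_rate=0.01, max_hours=12.0):
    """Return the names of volunteers eligible to produce work units for others."""
    chosen = []
    for v in volunteers:
        # Require a track record first (this also avoids dividing by zero).
        if v.units_returned < max(min_units, 1):
            continue
        reject_rate = v.units_rejected / v.units_returned
        if reject_rate <= max_reject_rate and v.avg_hours_per_unit <= max_hours:
            chosen.append(v.name)
    return chosen
```

The thresholds are made up; the point is only that promotion is decided from a volunteer's history rather than from what the client claims about itself.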
I'm not saying that this is unbelievable, just that it would be nice to have some evidence to back these claims up... or else state them as conjecture, not fact.
--
Let's not get too wound up because SETI@HOME is getting overwhelmed. It means that the idea is successful. Why do people participate in these kinds of distributed processing projects? With the exception of those who want to show off their machines, most of us do it because we feel that instead of our computer downloading Warez or porn all night, we can do something useful.
If SETI@HOME is having some troubles, helpful advice, not scathing criticism, is what is needed.
I would much rather see more of this kind of thing, even if it were occasionally bungled, than see other groups scared off by how hostile the online community is.
Maybe I'll find ET...
-------------------------------END--COMMUNICATION
From the Yahoo page:
Looks like anyone interested can find out the real scoop from the horse's mouth.
The article seemed to be flame bait to me. They never said that Seti@home said anything other than detailing the performance-critical routine in the seti@home software. Then, the way I read it, seti@home did not want to give up their source. The article said:
Is this what they said or, more likely, an interpretation of what they said?
Let's check the facts before slamming Seti@Home.
Oh damn, the Seti@HOME people are "making" me run my computer all night (at "full throttle", no less!).
My opinion of his opinion: he should "get over it". The only thing driving that dude is competition with others, not the altruistic donation of *spare* computing power towards an (arguably) good cause.
If I ever write an article like that, remind me to switch to decaffeinated coke.
-adam a
Look at the awful job most of the search engines are doing keeping up with the web. Why not a distributed spidering project? Hand out a base of URLs to spider, then let remotes spider from there. As the ever-pessimistic Rob has already pointed out to me, the load on the host end would be huge, but I still think that if it were done right, the whole net could be cataloged in a few months, then kept updated.
It seems like a distributed spidering project with a search engine front end like Google on different hardware/net could make a search engine useful again. There are probably a few interesting things that sites could do to streamline the workload--I haven't thought of them, but my spidie senses are tingling.
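One way the "hand out a base of URLs" step could look, as a rough sketch only. It assumes the coordinator splits URLs by hostname so that each site is hit by exactly one remote; assign_worker and partition are made-up names, and hashing by host is just one possible choice.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlsplit


def assign_worker(url, num_workers):
    """Map a URL to a worker index using a stable hash of its hostname."""
    host = urlsplit(url).hostname or ""
    digest = hashlib.sha1(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_workers


def partition(urls, num_workers):
    """Group URLs into per-worker batches; all URLs of one host share a batch."""
    batches = defaultdict(list)
    for url in urls:
        batches[assign_worker(url, num_workers)].append(url)
    return dict(batches)
```

Keeping a whole host on one remote means politeness delays can be enforced locally, and the coordinator only has to track batches instead of individual pages.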
slashdot broke my sig
Seems to me the simple answer is to process all blocks twice and compare the results. This would then solve/detect the problem where a hacked client was sending back "untrue" results. Anything that comes back with "different" results gets sent to a third "validated" respondent, and the differing one of the initial pair gets demoted from validated. This also solves the workload bottleneck for the time being.
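A minimal sketch of that compare-and-tiebreak step, assuming each result can be compared for equality and that the "validated" respondent is represented by a callable. The reconcile function and its arguments are hypothetical, not part of any real SETI@home protocol.

```python
def reconcile(first, second, tiebreak):
    """Compare two results for the same block.

    first, second: (client_id, result) tuples from two independent clients.
    tiebreak: zero-argument callable returning the result from a trusted,
              "validated" respondent; it is called only on disagreement.
    Returns (accepted_result, list_of_client_ids_to_demote).
    """
    (_, result_a), (_, result_b) = first, second
    if result_a == result_b:
        return result_a, []
    trusted = tiebreak()
    demoted = [client for client, result in (first, second) if result != trusted]
    return trusted, demoted
```

If both of the original clients disagree with the trusted result, both are demoted; whether that is the right policy is a judgment call the post does not address.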
Well, if they've got so many more volunteers than are strictly necessary, why not hand out blocks multiple times and check that all the clients give the same results? If you detect any differences, run that block on a trusted machine at SETI@Home HQ and ban the clients that returned the bogus blocks.
Granted, you aren't going to be able to detect hacked clients returning unchecked blocks very easily this way, because you won't have too many positive blocks to compare the results with. But you could seed the raw data with some known positive blocks to catch clients that are returning incorrect (unchecked) negative results. And if a hacked client is sophisticated enough to return a positive result for a positive block and a negative result for all other blocks, isn't that the same behavior as an unmodified client?
Yes, this extra redundancy would slow the project down, but it sounds like there is more than enough computing power available. If SETI@Home explained that redundant processing was necessary to ensure valid results, I'm sure most users wouldn't have a problem with it. If you're interested in SETI@Home in the first place, you already know that good science and/or good data analysis isn't done overnight and requires a lot of procedural safeguards to get the right results.
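The "seed the raw data with known positive blocks" idea could be sketched roughly like this. The function names, the block IDs, and the report format are all assumptions for illustration.

```python
import random


def seed_batch(real_blocks, known_positive_blocks, seed_count, rng=None):
    """Mix a few known-positive blocks into a batch of real blocks."""
    rng = rng or random.Random()
    seeds = rng.sample(known_positive_blocks, seed_count)
    batch = list(real_blocks) + seeds
    rng.shuffle(batch)  # so clients cannot spot the seeds by position
    return batch


def clients_missing_seeds(reports, known_positive_ids):
    """Return clients that reported a negative result for a seeded positive block.

    reports: iterable of (client_id, block_id, is_positive) tuples.
    """
    return sorted({
        client
        for client, block_id, is_positive in reports
        if block_id in known_positive_ids and not is_positive
    })
```

This only catches clients that skip the work and always answer "negative"; as noted above, a client that gets the seeded blocks right is indistinguishable from an honest one.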
Your right to not believe: Americans United for Separation of Church and
It's just like Linux vs. BSD. Each side has something it excels at and something it lags behind at. Just use whichever one makes you happy.
This is a situation where a hierarchical workload distribution would probably work. Unlike the SETI project with its huge, monolithic data chunks, a spidering project would be dealing with small (comparatively speaking) chunks. There could be several levels of capability depending on host speed, storage, bandwidth, etc. A company with Suns, a few Gig to spare, and a T3 could handle more volume and complexity--maybe spending their cycles figuring out relevance by context and links, etc., rather than spidering. A lot of the grunt work could be done before the cataloged data are returned to the destination hosts. This would also be a little nicer to bandwidth, I guess.
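For what it's worth, the capability levels could be as simple as a lookup like the one below. The role names and thresholds are invented for the sketch; the only outside fact used is that a T3 line is roughly 45 Mbit/s.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical self-reported capabilities of a volunteer host.
    name: str
    free_disk_gb: float
    bandwidth_mbps: float


# (role, minimum free disk in GB, minimum bandwidth in Mbit/s), best role first.
TIERS = [
    ("relevance", 2.0, 40.0),  # e.g. a few gigs to spare and a T3
    ("indexing", 0.5, 1.0),
]


def assign_tier(host):
    """Pick the most demanding role a host qualifies for; default to spidering."""
    for role, min_disk_gb, min_mbps in TIERS:
        if host.free_disk_gb >= min_disk_gb and host.bandwidth_mbps >= min_mbps:
            return role
    return "spidering"
```

A real scheme would presumably also weigh CPU speed and measure the numbers instead of trusting what the host reports.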
slashdot broke my sig