Slashdot Mirror


SETI's Anti-Cheating Strategy

mtDNA writes: "There's an article in the New York Times about the strategies SETI is using to avoid fraudulent reports. One trick they're using is multiple analyses of the same data. Another strategy is the use of "ringer" data, where they send you fake data for which they know the results." One of the researchers has several postscript papers on his home page - Incentives for Sharing in Peer-to-Peer Networks, Uncheatable Distributed Computations, Distributed Computing with Payout. In related news, ProcessTree apparently sent out an email to participants indicating it is closing up shop, so although SETI seems to be chugging along, the idea of distributed computing as a business model is perhaps a bit premature.

48 of 108 comments

  1. Re:Active punishment? by jd · · Score: 2
    I suggest coding up some of William Gibson's "Black Ice", using confirmed cheaters as test subjects.

    Solve the cheating problem =AND= the population crisis at the same time.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  2. To prevent cheating... by jd · · Score: 4
    ...You simply remove any incentive to cheat.

    William Gibson's "Black Ice" should do nicely. Failing that, slice or dice the data in multiple directions and compare results.

    (The "different slices" is important, to ensure that you aren't trying to validate one modified client against another.)

    Let's say that you have a grid of data, N x M x B (where N, M is the data, and B is the number of bits per word for that data.)

    The probability that one modified client is doing the rounds, and will be encountered again by chance, is non-zero. It's not high, but it's high enough that nobody is releasing their client code in a hurry.

    On the other hand, you've three simple slices you can do (along each axis), and any number of more complicated ones. That means that you have to hit the correctly-modified client for the slice you've picked, for each slice in each axis, for the data to be marked "valid". Any failure by any one client to return a result that confirms the other 16 clients that would overlap with it, would signal a bogus client.

    With that much redundancy, you could also simply have "client voting". The results that are returned identically by the most clients (in excess of some threshold), regardless of the direction of slice, could be regarded as "true", with a reasonable degree of certainty. (Sure, it's not 100%, but that's the price you pay for having a society that rewards the greedy and the ethically sick.)

    Of course, if you want to go one stage further, there's nothing to stop you "dicing" the data. Instead of taking a single slice through the data, you take random, small chunks from all sections, and feed them in a random order to the client. Again, the server re-constitutes the "valid" results, by merging together the results from multiple clients, taking the generally-accepted results as "correct".

    This would mean that, instead of needing 20+ clients, all with suitable code for cheating "correctly" along each slice, you now need !(N x M x B)/(Size of chunks) such clients. The values don't have to be large to make this a virtual impossibility.

    If you then only credit "confirmed" units (whether "slices" or "chunks"), since cheating becomes impractical, short of a global Internet conspiracy which also included the researchers, nobody is going to bother modifying the clients in any way which produced inaccurate results.

    They =MIGHT= modify them to produce faster, accurate results. But, in that case, who bloody cares? I'm not going to object to someone handing round an honest, genuine client that can plow through 10 times as many blocks in a second, and still deliver the true results back to the central system. And, if the scientists were being honest to themselves, I doubt they would, either. PROVIDED the results could be guaranteed.

    And that gets back to why independent result reviews, using slicing, dicing, or some other method of producing non-duplicate data sets, are very important.
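
    A minimal sketch of the "client voting" idea, in Python. The function names, the threshold of 3, and the assumption that results can be compared for exact equality are illustrative only; nothing here is taken from the real SETI@home server.

```python
from collections import Counter


def accept_by_vote(results, threshold=3):
    """Return the agreed result for one work unit, or None if unconfirmed.

    A result is accepted only if at least `threshold` clients returned it
    AND it is a strict majority of everything returned for that unit.
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    if count >= threshold and count * 2 > len(results):
        return value
    return None


def credit_clients(submissions, threshold=3):
    """submissions maps a (hypothetical) client id to its result for one unit.

    Returns (accepted_result, clients_to_credit). Only clients whose result
    matches the confirmed one get credit, so a wrong answer earns nothing.
    """
    accepted = accept_by_vote(list(submissions.values()), threshold)
    if accepted is None:
        return None, []
    credited = sorted(c for c, r in submissions.items() if r == accepted)
    return accepted, credited
```

    Crediting only confirmed units is the part that removes the incentive: a modified client gains nothing unless enough other clients return the same wrong answer for the same slice or chunk.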

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  3. They believe so by Pseudonymus+Bosch · · Score: 2

    I can see it now, some geek going up to a girl to impress her with his falsified SETI numbers.

    Somebody who believes in extraterrestrial intelligences can believe in SETI-impressionable girls.

    --
    __
    Men with no respect for life must never be allowed to control the ultimate instruments of death.
    GW Bu
  4. Processtree closing down. Where is your user info? by simpleguy · · Score: 4
    Distributed Science Newsletter
    May 2001

    Dear ProcessTree Network suppliers,
    It is with sadness that I have to announce that this will be the last newsletter you receive from Distributed Science, Inc.

    etc etc etc...

    We will diligently negotiate the sale of the supplier database, with emphasis on the privacy policy under which you signed up. As soon as we come to a result, the new owners will be informing you about any changes they might plan, including an opt-out for those concerned about their privacy under new management.

    EEP!

  5. An algorithmic solution by XNormal · · Score: 2

    Having a closed source client is not the solution. Cryptography is the solution. Here's how:

    1. For each quantum of the distributed calculation in the range you have been assigned, store one or more bits of evidence for the result.

    2. Calculate a Merkle hash tree of this evidence vector.

    3. Use cryptographic hashes of the tree root to "randomly" select 64 leaves of the tree

    4. Transmit the branches leading to these leaves as proof that you have performed the full calculations

    To verify, the server verifies the hash chains of the branches, the randomly selected challenges and verifies the evidence for the selected leaves by repeating the calculation for a very small subset (64) of the assigned range.
    You cannot create this evidence without performing virtually all of the calculation assigned to you.

    You can still cheat by finding the solution and not reporting it, but there is no incentive to do this.
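
    The four steps and the verification can be sketched roughly as follows, in Python. This is an illustration under assumptions: SHA-256 as the hash, the "leaf:"/"node:" prefixes, the way challenges are derived from the root, and the `recompute` callback (standing in for the server redoing one quantum of work) are all choices made for the example, not a description of any deployed system.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return all tree levels; level 0 is the hashed leaves, the last is [root]."""
    level = [h(b"leaf:" + x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                 # odd level: pair the last node with itself
            level = level + [level[-1]]
        level = [h(b"node:" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def branch(levels, index):
    """Sibling hashes from leaf `index` up to (not including) the root."""
    path = []
    for level in levels[:-1]:
        sib = index ^ 1
        path.append(level[sib] if sib < len(level) else level[index])
        index //= 2
    return path


def root_from_branch(leaf, index, path):
    node = h(b"leaf:" + leaf)
    for sib in path:
        node = h(b"node:" + (node + sib if index % 2 == 0 else sib + node))
        index //= 2
    return node


def challenges(root, n_leaves, count):
    """Derive distinct leaf indices from the root hash (step 3)."""
    picks, counter = [], 0
    while len(picks) < min(count, n_leaves):
        digest = h(root + counter.to_bytes(4, "big"))
        idx = int.from_bytes(digest, "big") % n_leaves
        if idx not in picks:
            picks.append(idx)
        counter += 1
    return picks


def prove(evidence, count=64):
    """Client side: commit to all evidence, then open the challenged leaves."""
    levels = build_tree(evidence)
    root = levels[-1][0]
    picks = challenges(root, len(evidence), count)
    return root, [(i, evidence[i], branch(levels, i)) for i in picks]


def verify(root, proof, n_leaves, recompute, count=64):
    """Server side: check the challenges, the hash chains, and the evidence."""
    if [i for i, _, _ in proof] != challenges(root, n_leaves, count):
        return False
    return all(recompute(i) == leaf and root_from_branch(leaf, i, path) == root
               for i, leaf, path in proof)
```

    Because the challenges depend on the root, the client has to commit to the whole evidence vector before learning which leaves will be checked. Under these assumptions, a client that fakes a fraction f of its leaves slips past 64 checks with probability of roughly (1 - f)^64, which is where "virtually all" comes from.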

    -

    --
    Stop worrying about the risks of nuclear power and start worrying about the risks of not using nuclear power.
  6. Re:Processtree closing down. Where is your user in by bughunter · · Score: 2
    I received ProcessTree's email yesterday, and when I opened it, it was nothing but an html attachment.

    So I trashed it.

    If someone doesn't have the courtesy to put at least a "please read the attached letter for a very important announcement" in the plaintext portion of an email, I don't read it. Assuming we all use either a Microsoft or a Netscape client for our email betrays some kind of ignorance or arrogance, or both.

    And those qualities are probably also the reason they're failing.

    --
    I can see the fnords!
  7. Re:Already a business model by Lumpy · · Score: 3

    Umm no you are actually quite wrong.

    Render programs are free. (povray for example, many many Excellent CG films have come out of povray. Just check the Internet Ray Tracing Competition pages)

    Yes some render programs cost exorbitant and insane prices, but places like Pixar have programmers that write the software, and most good animation houses have their own programmers, so your cost per copy goes from $30,000.00 for the development of the first one to $0.00 for every copy thereafter. (don't give me any crap that there is a cost associated with the copies afterwards, that is pure bullcocky)

    Do you think that Lucasfilm goes to "CG-R_US" and buys a new effect? nooo, they create it, and then they can use it on 94,999 computers for free.

    CG is cheap, and distributed processing (possible in POVRAY for a really long time now) is also cheap.

    --
    Do not look at laser with remaining good eye.
  8. publicize "cheaters list" by peter303 · · Score: 2
    If people are cheating to gain fame, then the fear of being ridiculed in public after they are caught cheating would probably be effective.

    I wouldn't recommend doing this. In practice, negativity and bad will, even when justified, often backfire, injuring the issuer.

  9. hmmm, just like I've been saying all along by ethereal · · Score: 4

    Their argument against open-sourcing the client has always been that this would allow cheaters and that people would use modified clients that didn't crunch the numbers right. To which I have always responded that with any distributed computational task running on untrusted clients, you would have to do this sort of redundant analysis on each data block anyway. Even a closed-source client can be hacked fairly easily if you really wanted to, so not releasing the source doesn't magically guarantee the validity of any client-side processing. It's nice to see SETI@Home finally acknowledge what some of us have known all along.

    So, when will we be seeing the client source code available for download? I'm all ready to start working on an Xscreensaver module for it.

    Caution: contents may be quarrelsome and meticulous!

    --

    Your right to not believe: Americans United for Separation of Church and

    1. Re:hmmm, just like I've been saying all along by imipak · · Score: 2
      The closed-source SETI@home client *was* hacked, several times, by people trying to get faster performance (not outright cheating)... the SETI folks went pretty ballistic when they found out; sadly, they had to explain in very short words that science is about repeatable experiments, and that requires that all the data is processed in *exactly* the same way. Otherwise the parameter-space searched by the programme would be subtly skewed - for example, a faster algorithm might mean that signals at the far end of the Gaussian spectrum they're looking at would be missed or included for only the hacked clients.

      However there are tons of unofficial add-ons that *are* allowed: see here at the SETI@home site.

      This and much more info in the unofficial SETI FAQ... infuriatingly, I've got a copy saved at home but can't find a link to it anywhere. (Think this was the Usenet FAQ.) Anyone?
      --
      "I'm not downloaded, I'm just loaded and down"

    2. Re:hmmm, just like I've been saying all along by imipak · · Score: 2

      Got it: the other SETI FAQ.
      --
      "I'm not downloaded, I'm just loaded and down"

  10. More Distributed Projects by rinkjustice · · Score: 2
    If you've got CPU cycles to burn, why not use them on a worthwhile project like Genome@home, which strives to improve understanding of the evolution of natural genomes and how they operate. They even have a proven track record. There is also Popular Power, which continues working towards a more effective influenza vaccine even though they're out of business.

    A listing of notable distributed computing projects is here - (http://www.hardcorelinux.com/distributed-computing.htm for all you goatse.cx traumatized).

    come off crisp and play up to the cynic
    clean and schooled right down to the minute

  11. Re:Why? - other cheating alternatives by Brento · · Score: 2

    My question is why anyone would want to cheat SETI?

    Yeah, especially when there's the new shared IBM mainframe coming out, where anybody can install programs. That's going to be the biggest use of it - a bunch of l33t h4x0rs installing various Distributed.net clients on it, all trying to add more power to their results. Whoop-dee-doo.

    --
    What's your damage, Heather?
  12. Re:Double Resources by MrNixon · · Score: 2
    Two points:

    1. SETI can't afford to buy some massive 'big iron' to get the performance that they get (essentially for free) from SETI@home.

    2. The way that SETI@home has been ripping through the data packets, they were going to run out of data to send to the clients very soon (like sometime next year). Any way that they can slow down the process (while increasing thoroughness and reliability) is welcome.

    Oh - and SETI@home only uses 1 telescope (not even a satellite) to do its work: the Radio Telescope at the Arecibo Radio Observatory in Puerto Rico (the big dish built into the mountain that was in the James Bond movie) - the largest single dish in the world.

  13. What about public ridicule? by revscat · · Score: 3

    I have an idea for how to at least reduce the amount of cheating going on with SETI: ridicule. Because let's face it if you cheat at SETI you deserve ridicule. You're a worthless mess of a human being who probably hasn't been laid in, I dunno, EVER and has to inflate their self-esteem by turning a quest for Contact into a bigger dick contest. No one respects you. Kill yourself and leave your computer running. Your computer is worth more to society than you are.

    Grr. I'm way too high strung today. Where's the bong? But godDAMN people are so freaking simple minded sometimes! What do you gain by cheating at SETI? Higher rankings? So fucking what! Great, now instead of being ranked 39623 you're at 32532. RaH. You're my hero. The world is a better place because you cheated. You've fed the hungry and increased our collective wisdom. L0s3r.

    Dump core. And pass the bong.

    - Rev.
  14. speaking of distributed.net... by CoughDropAddict · · Score: 2

    ...if you're not running the client, do. If you are running the client and you're not affiliated with any other team, please join team Slashdot.org, if for no other reason than to spite these twits, who are ahead in daily counts these days (from their team page: "The best people. The best effort. The best platform. RC5 will fall again....")

    --

  15. distributed.net does the same by mattvd · · Score: 4

    As far as I know this is nothing new, distributed.net has always done this on their projects (RC5, DES) to make sure people are actually checking the blocks.

    1. Re:distributed.net does the same by carleton · · Score: 2

      SETI has, at least as far as I understand it, one advantage with respect to cheaters versus RC5, DES, et al. With SETI, one would hope that there would be more than just one signal of intelligence being captured, so that if someone cheats, claims to have searched a region containing an intelligent signal, but does not, there will still be other intelligent signals to be found elsewhere. In contrast, with DES or RC5, there is only one needle in each haystack, and if a cheater happens to claim the section where the needle actually is, no one will ever find the needle (well, at least until they check the rest of the pile, and then start rechecking sections.)

  16. I'm a processtree participant ... by Dwonis · · Score: 2

    ... and I received no such email.
    ------

  17. Re:Already a business model by rogerbo · · Score: 2

    Actually you're half right.

    Pixar doesn't have to pay per license because
    they wrote Renderman so they get it for free.
    But POVRAY? Please, this hasn't been used
    on any films that I know of (and yes I work
    in the film visual effects industry).

    There is a free version of Renderman called
    BMRT (www.bmrt.org) but many many
    visual effects companies do pay $10,000
    US per copy for Renderman render licenses
    or slightly less than that for Maya or
    Mental Ray render licenses.

  18. Attention Team Slashdot ! Let's Climb SETI Ranks ! by cybrpnk · · Score: 2

    For those of you who don't know, there is a SETI@home team composed of Slashdot netizens here. There are currently almost 2200 members in Team Slashdot that have contributed 700,000+ work units to the SETI@home project, for a team rank of 17th. Teams from HP, IBM, Microsoft, Intel, Compaq and Sun are ahead of us! Personally, I'd like to see Team Slashdot show these slackers a thing or two about what nerds can do. A little effort by an individual goes a long way in Team Slashdot. I've got SETI@Home running full time on a crappy little Pentium computer that has churned out only 35+ units and has taken almost a year to do it, and I've still contributed more units than half the Team Slashdot members. I'm gonna upgrade my input to SETI@home. Join me! Let's get a discussion / confessional / pep rally going here about what we can do to upgrade the Team Slashdot effort for what we all agree is a worthy cause!

  19. Re:The reason people are cheating. by Crixus · · Score: 2
    You can't say though, that most people aren't signing up for the novelty of being in the race. If it weren't for the stats, I wouldn't be participating at all, and neither would 90% of their userbase. I'm contributing for the good of the Halo Seti Marines, and damned proud of it. Get rid of scorekeeping, you get rid of the major motivation.

    What you really can't say is that 90% of the people are in it because of the stats. That wouldn't be allowed in a court of law, and won't be allowed here.

    I still say get rid of them. Competition brings out the WORST in people, not the best, as evidenced by the cheaters who hacked their clients to download work units, and immediately (after NO analysis) send back a blank results file. These people were "crunching" thousands of units per day and really stinking things up.

    If SETI loses any people from having no stats, I can assure you they won't be missed.

    Rich...

    --
    Ignore Alien Orders
  20. The reason people are cheating. by Crixus · · Score: 3
    The reason people are cheating is because they decided to make a contest out of who processed more workunits.

    I for one wish they would get rid of the scorekeeping entirely. I crunch SETI units because I enjoy the idea of helping them with their science.

    Any users they lose because they were to get rid of scorekeeping would be no great loss. They were probably the losers who were compromising the datapool anyway. (talk about having no self esteem, I can see it now, some geek going up to a girl to impress her with his falsified SETI numbers).

    I was one of the first 10,000 people to sign up, and I'll help them with their science as long as they need me to, scorekeeping or no.

    Rich...

    --
    Ignore Alien Orders
  21. Men In Black? by Dr_Cheeks · · Score: 2
    How do we know that the 1% of data that's not been processed didn't contain proof of ET? We don't. And you know why? It's a government conspiracy, that's why! And they're out to get anyone who lets the secret out!

    [sniff, sniff]Hey, what's that funny smell? Urrrggh, eyelids....heavy....soooo sleeepyyyy.....

    --

  22. Don't forget Juno by ruebarb · · Score: 2

    The "you use our email, so we're stealing your spare processing power and renting it out and you agreed to this with a click-thru agreement so screw you" company...distributed computing seems to be quite a bargain for them..

    --

    ----------
    ah honey, we're all resplendent - Bill Mallonee
  23. Why bother? by Animats · · Score: 2

    For something like distributed cryptanalysis or "SETI at home", it would work almost as well if, when a client asked for some work, it was given a random chunk, with no checking for whether it had been done before. This statistically doubles the computational load, but eliminates coordination problems. Yes, you might miss something, but it's unlikely.
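
    A rough way to put a number on "you might miss something", sketched in Python under the assumption that chunks are handed out uniformly at random with no bookkeeping at all. The function names and the seed are made up for the example.

```python
import math
import random


def expected_missed_fraction(n_chunks, n_draws):
    """Chance that one particular chunk is never handed out in n_draws tries."""
    return (1 - 1 / n_chunks) ** n_draws


def simulate_missed_fraction(n_chunks, n_draws, seed=0):
    """Hand out chunks at random and report the fraction never assigned."""
    rng = random.Random(seed)
    seen = {rng.randrange(n_chunks) for _ in range(n_draws)}
    return 1 - len(seen) / n_chunks


if __name__ == "__main__":
    n = 10_000
    for load in (1, 2, 5):
        print(load,
              round(expected_missed_fraction(n, load * n), 4),
              round(math.exp(-load), 4),
              round(simulate_missed_fraction(n, load * n), 4))
```

    For large N the missed fraction after k times N handouts is close to e^-k, so doubling the load (k = 2) still leaves about 13.5% of chunks untouched on average. Whether that counts as "unlikely" depends on how much a single missed chunk matters, which is carleton's needle-in-a-haystack point about RC5 and DES.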

    1. Re:Why bother? by TeknoHog · · Score: 3
      If someone reports a hit, can't they just re-check that data?

      They do. What the client programs do is something of a preliminary analysis, filtering the most interesting packets of data from the usual junk. In the further analysis it often turns out that lots of interesting signals originated on Earth, while many others are inconclusive.

      --
      I hit the karma cap, now do I gain enlightenment?

      --
      Escher was the first MC and Giger invented the HR department.
    2. Re:Why bother? by KarmaPolice · · Score: 3
      If someone reports a hit, can't they just re-check that data?
      You're missing the point with SETI. There is no such thing as "a hit" when analysing these massive amounts of data. Your computer will never give a message like: "Analysis detected a HOW ARE YOU GENTLEMEN, ALL YOUR BASE ARE BELONG TO US from outer space". What your computer does is just an analysis and then the SETI-folks will do the real exciting stuff with the resulting data from your computer's work.

      The problem before SETI@Home was that the data wasn't analysed in full detail because these analyses take a shitload of time, so they just did a rough analysis, trying to find extreme peaks but not checking for patterns over longer periods of time.

  24. Known signals in the SETI system by yerricde · · Score: 2

    More than once I've got a clear signal that was obviously extraterrestrial in nature. The distribution was so far away from random noise that it had to be artificial. I run the data through the Seti program, and what does it come out with? Nothing.

    SETI@home beams known signals to the radio telescope as a check to make sure the whole system is still working properly and to call out clients that give false negatives. There are a few on constant frequencies; there are probably others on frequencies that change daily.
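
    A small Python sketch of how a server might use known-answer ("ringer") units to spot clients that return false negatives. The data layout (dicts keyed by work unit id) and the function names are assumptions for the example, not the project's actual interface.

```python
def failed_ringers(client_results, ringer_answers):
    """List the ringer units this client answered incorrectly.

    client_results: unit id -> result the client reported
    ringer_answers: unit id -> result the server already knows is correct
    Ringers the client was never sent are ignored.
    """
    return sorted(unit for unit, truth in ringer_answers.items()
                  if unit in client_results and client_results[unit] != truth)


def client_is_trusted(client_results, ringer_answers, max_failures=0):
    """Trust a client only if it misses no more than max_failures ringers."""
    return len(failed_ringers(client_results, ringer_answers)) <= max_failures
```

    The client can't tell a ringer from a real unit, so skipping the analysis and reporting "nothing found" gets caught as soon as one of the planted signals comes back empty.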

    --
    Will I retire or break 10K?
  25. Re:Active punishment? by stilwebm · · Score: 2

    SETI is less worried about punishing cheaters, and more worried about getting accurate results back from the clients. Without control over the data, the entire dataset would be skewed and would have a huge impact on its scientific value.

  26. About time by Stott · · Score: 2

    I always wondered how people were getting an average of 1 hour per work unit. Then I realized there are some good hackers and programmers out there!

    Anyone know where I can buy a seti card?

  27. no no no no by nomadic · · Score: 2

    the idea of distributed computing as a business model is perhaps a bit premature.

    Premature? Premature?! Of course it's not premature, it's about 30 years too late. Distributed computing used to be nice and profitable, but processors are just too cheap now for it to work. For large-scale, nonprofit efforts like SETI, sure, but if someone's actually going to pay to rent computer time, it would just be cheaper to buy the processors themselves. Or, if it was truly profitable to rent computer time, specialized computers with Intel/AMD clusters would pop up to provide it with less overhead.
    --

  28. Re:Active punishment? by Kingfox · · Score: 4

    God, I loved that old feature on Telegard/Renegade and the like. Though most people figured it out when no one responded to their flames, and then made a fake account/logged on as a guest, to find out the truth. But this would work with SETI, where there is no 'feedback'. Hell, they've even disabled 'see my last 10 packets' as of late, so as long as they kept on incrementing the person's records to their eyes, it wouldn't matter. As far as the problem that you present - a broken computer being innocent as compared to malicious data. That really isn't a problem. Not to sound like an arrogant fuckwad, but the end result is the same to SETI. Data that's just wrong as a result of a computer going tits-up or data that's wrong from a computer being messed with - it really doesn't matter. They're going to need to reject both.

  29. Comment removed by account_deleted · · Score: 2

    Comment removed based on user account deletion

  30. Comment removed by account_deleted · · Score: 2

    Comment removed based on user account deletion

  31. Seti are hiding the truth by 91degrees · · Score: 2
    More than once I've got a clear signal that was obviously extraterrestrial in nature. The distribution was so far away from random noise that it had to be artificial. I run the data through the Seti program, and what does it come out with? Nothing.

    I repeatedly try to get interest from the government over this, but they aren't interested. I mentioned it to the Roman Catholic church, and they were horrified. I think it must interfere with their religious dogma or something. I sent it to Carl Sagan. He mysteriously died.

    The truth is out there. They don't want you to hear it!

  32. I should have known by tenzig_112 · · Score: 4
    Somebody sent me something strange from his SETI at home setup. I don't know for sure, but it looked a little like a hoax to me.

    Here are some warning signs that you may have a SETI hoax on your hands:

    • A MIDI file of the Close Encounters tones.
    • A .gif of Leonard Nimoy as Spock with the caption "Live long and ... whatever."
    • The astral baby from 2001 rendered in ASCII graphics.
    • "Hello, people of earth" in a voice that sounds suspiciously like Homer Simpson.
    • Anything resembling "Goatse.cx"

    In other news: Bi Curious: The Senator Jim Jeffords Story

  33. AH! by HongPong · · Score: 5

    But the problem is not ordinary punks hacking the client to create false positives. No, the problem is those Beowulf clusters in underground NSA facilities making all the false negatives!

    --

  34. Why? by clinko · · Score: 5

    My question is why anyone would want to cheat SETI? I could just see the guy now:

    "LOOK! i'm high on the hours list with 31337 years of data done on my computer for SETI. I RULE! Oh god, I wish I were dead..."


  35. Re:Active punishment? by telstar · · Score: 4

    Instead of locking out a cheater, a better solution is to continue to feed data to that cheater, but ignore any results they submit. This will help prevent the cheater from simply creating a new account, as they will be unaware that their false results have been detected.
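
    Roughly what that could look like on the server side, as a Python sketch. The class, its method names, and the "ok" reply are hypothetical; the only point is that a flagged client sees exactly the same response and the same visible credit as an honest one.

```python
class WorkServer:
    """Toy result collector that silently ignores flagged clients."""

    def __init__(self):
        self.flagged = set()          # clients caught cheating
        self.accepted = {}            # unit id -> list of (client, result)
        self.displayed_credit = {}    # what each client sees on the stats page

    def flag(self, client_id):
        self.flagged.add(client_id)

    def submit(self, client_id, unit_id, result):
        # Credit shown to the client goes up either way, so nothing looks wrong.
        self.displayed_credit[client_id] = self.displayed_credit.get(client_id, 0) + 1
        if client_id not in self.flagged:
            self.accepted.setdefault(unit_id, []).append((client_id, result))
        # Same reply for everyone; flagged results were dropped above.
        return "ok"
```

    The displayed credit keeps climbing for a flagged client while nothing it sends reaches the science data, which is the same trick Kingfox describes from the BBS days.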

  36. Double Resources by tenman · · Score: 3

    I admit that I am not terribly familiar with SETI, but I know that they use huge amounts of collective CPU time by distributing processing to remote processors. My comment relates to who decides how much of a performance hit they want to take to ensure accuracy. Do you send sheets of data out twice, and reduce your net performance by half? I don't understand how sending rogue data sheets will "catch" the bad guys; wouldn't the ones that get caught just change their IP/User Name and start sending bad data again? I'm afraid that if SETI really wants security, they should hook a bunch of IBM Z series boxes straight to the satellites, and let little instances of Linux churn over the data.
    TEN

  37. Anonymity breeds cheating. by fmaxwell · · Score: 2

    What SETI needs to do is get verified e-mail addresses, real home phone numbers, real-world addresses, IP addresses, etc. and verify the data. If they catch a cheater, publish the information on the net, complain to the abuse department at the cheater's ISP, etc. They might even go so far as to launch a civil suit for damages, punitive and compensatory, against the cheater. Take those steps and cheating would be a fraction of what it is today.

    1. Re:Anonymity breeds cheating. by fmaxwell · · Score: 2
      There's nothing illegal about cheating SETI with fake results.

      So you think that vandalizing their data and experiment is legal?

      What are they gonna charge you with?

      There is a license that forbids connection to their servers with software other than the client they supply. That's a breach of contract. It's also a potential "trespass against chattels." There are simple charges like vandalism that can be made as well as charging the cheaters with violation of the Computer Fraud and Abuse Act.

      There are plenty of legal avenues that could be used.

    2. Re:Anonymity breeds cheating. by fmaxwell · · Score: 2
      If no money (or other consideration) changes hands, the contract is not valid. 2 outta 3 ain't bad, but it also ain't a contract.

      The opportunity to participate in the study, combined with the possibility of fame and fortune, is consideration. The compute time made available to the study is consideration in the other direction.

      So, there is offer, acceptance, and consideration.

  38. Just don't keep score. by Fizzlewhiff · · Score: 2

    I think as long as these sites keep stats and "score" individuals or teams you're going to get cheating. It's kind of sad when you think about what can come out of the SETI program and people are out there sending false data just to make them look good. This won't stop 100% of the cheating, but I believe not showing or ranking individual stats might cut it back some.

    --

    'Same speed C but faster'
  39. why kick bad users off? by cursion · · Score: 2
    Why bother kicking bad users off? Why bother to let them know they've been caught? Just start feeding them bogus or real data - and then ignore their results. Let them figure out they've been caught.

    John

    --
    remember when it was {of|for|by} the people?
  40. Shocking! by s20451 · · Score: 5

    So somebody's trying to manipulate the system in order to artificially inflate a meaningless number in a database! How shocking! (Score=5, Insightful)

    --
    Toronto-area transit rider? Rate your ride.
  41. This is why... by JanusZeal · · Score: 2

    ...we'll never find out if there is intelligent life out there or not.

    While SETI and NASA are jumping the gun and declaring a fake packet to be a sign of "intelligent life out there" and awarding some loser a lot of money for making the find, any real signals that are, for no apparent reason, aimed specifically at us won't be processed because the SETI@Home project will have achieved its goal.

    Now if only CounterStrike games could end so vividly.

    "Cheater detected, cheater wins, GAME OVER."

    Heh.