Slashdot Mirror


Reputation System Fights P2P Junk

yeejiun writes "Many of the files that are shared on P2P networks tend to be junk. Organizations such as the RIAA and music labels regularly pollute these networks with nonsense files masquerading as real music/video files. These junk files make it difficult for users to find what they want on such P2P networks. Some researchers at Cornell University have developed a reputation system called Credence that works on the Gnutella network, allowing users to tell the good files from the bad ones."

8 of 338 comments

  1. I'm a little lost in this whole thing by ReformedExCon · · Score: 5, Funny

    I thought the primary purpose of P2P filesharing was to share legally swappable media files as well as other files like documents and useful freeware applications. Is there some nefarious entity flooding the P2P networks with garbage disguised as those files above? Why would you need to know the quality of the file's reputation?

    --
    Jesus saved me from my past. He can save you as well.
  2. eDonkey by mnemonic_ · · Score: 5, Informative

    Doesn't the eDonkey2000 network already have a system like this? Users identify fakes and report them, then the phony file information propagates throughout the network and the fake file dies.

    1. Re:eDonkey by daikokatana · · Score: 5, Interesting
      Indeed - but there is a big problem with that system. eMule recognizes the file hashes and reports them as fakes, but it stops after that.

      For the past few weeks, I have been rewriting part of the eMule source to have the following changes:

      1. I offer a valid file with a valid hash (no fake)
      2. People try to download the file from me and move up fast in my queue
      3. Once they download a chunk from me, the data I send them is invalid (generated random)
      4. Since this part is invalid, they need to redownload it
      5. Since they move up faster in my queue than others, they redownload the part from me.
      6. etcetera...

      To be honest - I want to sell this tactic, that's why I do it. And so far it works! I get loads and loads of requests and rerequests for files, so this is a perfect tactic to kill the download of valid files - reputation system or no reputation system.

      Remember, the file is valid, but they'll get it much much slower and spend x times the bandwidth to get it. I have unlimited bandwidth (up/down) so I always win in the end.
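The "x times the bandwidth" figure can be estimated with a back-of-the-envelope model. This is only a sketch under a strong assumption that is not in the comment above: every attempt to fetch a chunk is independently served by a poisoning peer with probability `p`, and a bad chunk is always detected and fetched again. The function name and the parameter `p` are made up for illustration.

```python
def expected_bandwidth_multiplier(p):
    """Expected number of times each chunk is downloaded.

    Assumes (hypothetically) that each attempt hits a poisoning peer with
    independent probability p, so the number of attempts until the first
    good chunk is geometric with success probability (1 - p).
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in the range [0, 1)")
    return 1.0 / (1.0 - p)


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.9):
        print(p, expected_bandwidth_multiplier(p))
```

Under this model the cost grows without bound as `p` approaches 1, which is the situation the queue trick in steps 2 and 5 is trying to create.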

      If whatever organisation I sell it to employs this on a large scale, the network will be flooded.

      --
      http://jcsnippets.atspace.com/ - a collection of Java & C# snippets
  3. It's not all bad... by distantbody · · Score: 5, Funny

    The fact that I didn't get to play HL2 was compensated by the 2 hours of dwarf porn.

  4. rtfa, sucka. by knowles420 · · Score: 5, Informative

    7. Can a group of spammers game the Credence algorithm by voting thumbs-up for each other's spam?

    No. The trustworthiness computation is designed to preclude such attacks.

    8. What happens when a large number of spammers vote each other's spam up? Can they fool the reputation system?

    No. Credence's reputation computation is similar to Google's PageRank, but is more general - every node computes a different rank based on its own votes. Reputation flows from a given good node along trust edges towards other nodes. Spammers can create tight cliques in which everyone votes on each other's spam, but the entire clique will be deemed untrustworthy. And if anyone in the spammer clique does a search, they will see each other's spam ranked high.
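The "every node computes a different rank" idea can be sketched as a PageRank-style walk that always restarts at your own node. This is not Credence's actual computation, just a minimal illustration under assumptions: the function name, the `damping` value, and the peer names are all hypothetical, and trust edges are given as plain non-negative weights.

```python
def personalized_trust(edges, source, damping=0.85, iterations=50):
    """Flow trust from `source` along weighted trust edges.

    edges: dict mapping node -> {neighbor: non-negative weight}.
    Returns a dict node -> trust as seen from `source` (values sum to 1).
    """
    nodes = set(edges)
    for neighbors in edges.values():
        nodes.update(neighbors)
    nodes.add(source)

    rank = {n: 0.0 for n in nodes}
    rank[source] = 1.0
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        nxt[source] += 1.0 - damping          # restart at our own node
        for n in nodes:
            out = edges.get(n, {})
            total = sum(out.values())
            if total == 0:
                # No outgoing trust: hand the mass back to the source.
                nxt[source] += damping * rank[n]
                continue
            for m, w in out.items():
                nxt[m] += damping * rank[n] * w / total
        rank = nxt
    return rank


if __name__ == "__main__":
    # Hypothetical graph: an honest chain plus a two-node spammer clique.
    edges = {
        "me": {"alice": 1.0},
        "alice": {"bob": 1.0},
        "spam1": {"spam2": 1.0},
        "spam2": {"spam1": 1.0},
    }
    print(personalized_trust(edges, "me"))
    print(personalized_trust(edges, "spam1"))
```

Because no trust edge leads from the honest nodes into the clique, the clique gets zero trust from "me", while from "spam1" the clique looks trustworthy. That matches the FAQ's remark that spammers only see each other's spam ranked high.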

    or, just do whatever you want.
    --
    -knowles
    1. Re:rtfa, sucka. by PylonHead · · Score: 5, Informative

      No, the pot smoker is right. Your brain is too small to absorb their goodness.

      In their system there is no single "high reputation" metric. Everyone has a different reputation in the eyes of everyone else. Take three people, A, B and C: A may have a high reputation as far as B is concerned, while C thinks A has a low reputation.

      They do this by grouping people who vote the same way. So you trust the people that vote like you do.

      Assuming that you vote good files up and bad files down, you will be grouped with people who do the same. At some point, the spammers have to start voting differently than you do, voting their spam up. This will distance them from your trust network, and cause you to value their opinion less.
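Here is a minimal sketch of "trust the people that vote like you do". It assumes votes are +1 (good) or -1 (bad) and measures agreement with a plain Pearson correlation over the files both peers voted on. Whether Credence uses exactly this statistic is an assumption, and the peer and file names are made up.

```python
from math import sqrt


def vote_correlation(mine, theirs):
    """Pearson correlation of two peers' votes on the files both rated.

    mine, theirs: dict mapping file name -> vote (+1 good, -1 bad).
    Returns 0.0 when there is too little overlap or no variation to judge.
    """
    common = sorted(set(mine) & set(theirs))
    if len(common) < 2:
        return 0.0
    xs = [mine[f] for f in common]
    ys = [theirs[f] for f in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) * (x - mean_x) for x in xs)
    var_y = sum((y - mean_y) * (y - mean_y) for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    me = {"song.mp3": 1, "fake.mp3": -1, "album.zip": 1, "junk.avi": -1}
    honest = {"song.mp3": 1, "fake.mp3": -1, "album.zip": 1}
    spammer = {"song.mp3": -1, "fake.mp3": 1, "junk.avi": 1}
    print(vote_correlation(me, honest))
    print(vote_correlation(me, spammer))
```

A peer who votes the way you do scores close to +1, and a spammer who votes their junk up and real files down scores close to -1.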

      --
      # (/.);;
      - : float -> float -> float = <fun>
  5. Re:One problem with this Credence system: by Anonymous Coward · · Score: 5, Insightful

    I think the main insight and contribution of the system is that the reputation of a peer according to you is determined by whether he/she votes in a similar manner as you.

    So if the RIAA starts spamming Gnutella with lots of junk stuff, you will never vote in the same way as the RIAA dummy accounts, and you don't take their votes into account.

    In fact, it seems the system is even smarter than that - it can take votes from people that are strongly negatively correlated with you and use that as negative information. So anything these people vote as valid, you can treat as garbage, since their definition of good/bad files is completely opposite to yours. And assuming you trust your own judgement, that means those files must be bogus.
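A sketch of how negative correlation can be put to use when scoring a file. It assumes you already have a per-peer weight between -1 and +1 (for example a vote correlation), and it ignores peers whose weight is too close to zero to mean anything. The `threshold` value and all the names are hypothetical, not taken from Credence.

```python
def file_score(votes, weights, threshold=0.5):
    """Combine peers' votes on one file into a signed score.

    votes:   dict peer -> vote on this file (+1 good, -1 bad).
    weights: dict peer -> correlation with our own votes, in [-1, 1].
    Peers with |weight| below `threshold` are ignored. A negative weight
    flips the vote, so a thumbs-up from an anti-correlated peer counts
    against the file.
    """
    score = 0.0
    for peer, vote in votes.items():
        weight = weights.get(peer, 0.0)
        if abs(weight) < threshold:
            continue
        score += weight * vote
    return score


if __name__ == "__main__":
    weights = {"honest": 0.9, "riaa_bot": -0.8, "stranger": 0.1}
    # Only the bot and a stranger vouch for this file.
    print(file_score({"riaa_bot": 1, "stranger": 1}, weights))
    # An honest peer likes it and the bot votes it down.
    print(file_score({"honest": 1, "riaa_bot": -1}, weights))
```

A positive score suggests the file is probably good from your point of view, a negative one suggests junk.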

    Reminds me a lot of the Google PageRank system, but with explicit learning/training instead of using back-links for determining correlation.

  6. Huh by TCM · · Score: 5, Insightful

    Who actually searches for files in the P2P client? Normally you visit some site where the releaser himself posted a torrent or an ed2k link and you download that.

    I can't remember the last time I actually searched in eMule.

    --
    Of course it runs NetBSD. BTC: 1NT7QvbetmANwaMzhpVL6