Slashdot Mirror


Reputation System Fights P2P Junk

yeejiun writes "Many of the files that are shared on p2p networks tend to be junk. Organizations such as the RIAA and music labels regularly pollute these networks with nonsense files masquerading as real music/video files. These junk files make it difficult for users to find what they want on such p2p networks. Some researchers at Cornell University have developed a reputation system called Credence, that works on the Gnutella network, allowing users to tell the good files from the bad ones."

17 of 338 comments

  1. better answer by eight+and+a+quarter · · Score: 4, Insightful

    quit downloading crap off of kazaa/grokster/morpheus/etc. don't trust brittneyspearsporno.avi.mpeg.exe

    --
    lameness filter thwarted.
    1. Re:better answer by zaxios · · Score: 4, Funny

      brittneyspearsporno.avi.mpeg.exe
       
      Link please.

  2. I'm a little lost in this whole thing by ReformedExCon · · Score: 5, Funny

    I thought the primary purpose of P2P filesharing was to share legally swappable media files as well as other files like documents and useful freeware applications. Is there some nefarious entity flooding the P2P networks with garbage disguised as those files above? Why would you need to know the quality of the file's reputation?

    --
    Jesus saved me from my past. He can save you as well.
  3. eDonkey by mnemonic_ · · Score: 5, Informative

    Doesn't the eDonkey2000 network already have a system like this? Users identify fakes and report them, then the phony file information propagates throughout the network and the fake file dies.

    1. Re:eDonkey by mnemonic_ · · Score: 4, Informative

      Ah, found it: donkey-fakes. eMule automatically downloads the fakes list upon startup, and prevents the files from spreading.

    2. Re:eDonkey by daikokatana · · Score: 5, Interesting
      Indeed - but there is a big problem with that system. eMule recognizes the file hashes and reports them as fakes, but it stops there.

      For the past few weeks, I have been rewriting part of the eMule source to have the following changes:

      1. I offer a valid file with a valid hash (no fake).
      2. People try to download the file from me and move up fast in my queue.
      3. Once they download a chunk from me, the data I send them is invalid (randomly generated).
      4. Since this part is invalid, they need to redownload it.
      5. Since they move up faster in my queue than others, they redownload the part from me.
      6. Etcetera...

      To be honest - I want to sell this tactic, that's why I do it. And so far it works! I get loads and loads of requests and rerequests for files, so this is a perfect tactic to kill the download of valid files - reputation system or no reputation system.

      Remember, the file is valid, but they'll get it much much slower and spend x times the bandwidth to get it. I have unlimited bandwidth (up/down) so I always win in the end.
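
      The "x times the bandwidth" claim can be put into rough numbers. This is only a back-of-the-envelope sketch, assuming each attempt at a chunk independently lands on a poisoning peer with some fixed probability; the function name and the example fractions are made up for illustration, not measured from eMule.

```python
def expected_attempts(poison_fraction):
    """Expected downloads of one chunk until a valid copy arrives.

    Assumption: every attempt independently hits a poisoning peer with
    probability `poison_fraction`, so the attempt count is geometric.
    """
    if not 0.0 <= poison_fraction < 1.0:
        raise ValueError("poison_fraction must be in [0, 1)")
    return 1.0 / (1.0 - poison_fraction)


# Hypothetical fractions of chunk requests served by the poisoner.
for fraction in (0.0, 0.5, 0.75):
    print(fraction, expected_attempts(fraction))
```

      Under that assumption, the faster a downloader climbs the poisoner's queue, the higher the fraction and the larger the bandwidth multiplier.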

      If whatever organisation I sell it to employs this on a large scale, the network will be flooded.

      --
      http://jcsnippets.atspace.com/ - a collection of Java & C# snippets
  4. Here's a simpler idea... by lightspawn · · Score: 4, Insightful

    If a file appears to be RIAA-affiliated music, treat it as a junk file.

    Why bother with music the artist doesn't want you to have? Just forget about it altogether and discover new music, even new types of music that you'd never realize existed, much less that you could enjoy.

  5. It's not all bad... by distantbody · · Score: 5, Funny

    The fact that I didn't get to play HL2 was compensated by the 2 hours of dwarf porn.

  6. rtfa, sucka. by knowles420 · · Score: 5, Informative

    7. Can a group of spammers game the Credence algorithm by voting thumbs-up for each others' spam ?

    No. The trustworthiness computation is designed to preclude such attacks.

    8. What happens when a large number of spammers vote each others' spam up ? Can they fool the reputation system ?

    No. Credence's reputation computation is similar to Google's PageRank, but is more general - every node computes a different rank based on its own votes. Reputation flows from a given good node along trust edges towards other nodes. Spammers can create tight cliques in which everyone votes on each others' spam, but the entire clique will be deemed untrustworthy. And if anyone in the spammer clique does a search, they will see each others' spam ranked high.
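
    A rough sketch of "reputation flows from a given good node along trust edges", assuming a plain PageRank-style walk that always restarts at the node doing the ranking. This is not Credence's actual computation; the function name, the damping value and the toy graph are assumptions made for illustration.

```python
def personalized_trust(edges, source, damping=0.85, iterations=50):
    """Propagate trust outward from `source` along weighted trust edges.

    edges: dict mapping node -> {neighbor: non-negative weight}.
    Returns a dict node -> trust score as seen from `source`.
    """
    nodes = set(edges)
    for neighbors in edges.values():
        nodes.update(neighbors)

    trust = {node: 0.0 for node in nodes}
    trust[source] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in nodes}
        nxt[source] = 1.0 - damping  # restart mass always goes back to the source
        for node, neighbors in edges.items():
            total = sum(neighbors.values())
            if total == 0:
                continue
            for neighbor, weight in neighbors.items():
                nxt[neighbor] += damping * trust[node] * weight / total
        trust = nxt
    return trust


# Toy graph: "me" trusts alice, alice trusts bob, and two spammers
# only trust each other.
edges = {
    "me": {"alice": 1.0},
    "alice": {"bob": 1.0},
    "spam1": {"spam2": 1.0},
    "spam2": {"spam1": 1.0},
}
scores = personalized_trust(edges, "me")
print(scores)
```

    In this toy graph the spam clique ends up with zero trust from "me", because no trust edge leads into it from any node that "me" reaches, however enthusiastically the two spammers vouch for each other.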

    or, just do whatever you want.
    --
    -knowles
    1. Re:rtfa, sucka. by PylonHead · · Score: 5, Informative

      No, the pot smoker is right. Your brain is too small to absorb their goodness.

      In their system there is no single "high reputation" metric. Everyone has a different reputation with everyone else. Take three people, A, B and C: A may have a high reputation as far as B is concerned, while C thinks A has a low reputation.

      They do this by grouping people who vote the same way. So you trust the people that vote like you do.

      Assuming that you vote good files up and bad files down, you will be grouped with people who do the same. At some point, the spammers have to start voting differently than you do: voting their spam up. This will distance them from your trust network, and cause you to value their opinion less.
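
      The A, B and C example can be sketched like this, assuming votes are +1 (thumbs up) or -1 (thumbs down) and using a plain agreement score over the files both peers voted on. The statistic Credence really uses may differ; the function name and the file names are invented for illustration.

```python
def vote_correlation(mine, theirs):
    """Agreement between two peers over the files both voted on.

    mine, theirs: dict mapping file name -> +1 or -1.
    Returns a value in [-1, 1]; 0.0 when there is no overlap.
    """
    common = set(mine) & set(theirs)
    if not common:
        return 0.0
    return sum(mine[name] * theirs[name] for name in common) / len(common)


# Hypothetical votes: B votes like A, C votes the opposite way.
votes = {
    "A": {"song.mp3": 1, "fake.mp3": -1, "movie.avi": 1},
    "B": {"song.mp3": 1, "fake.mp3": -1},
    "C": {"song.mp3": -1, "fake.mp3": 1},
}
a_seen_by_b = vote_correlation(votes["B"], votes["A"])
a_seen_by_c = vote_correlation(votes["C"], votes["A"])
print(a_seen_by_b, a_seen_by_c)
```

      The same peer A gets a different score depending on who is asking, which is the "no single reputation" point.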

      --
      # (/.);;
      - : float -> float -> float = <fun>
  7. Re:One problem with this Credence system: by Anonymous Coward · · Score: 5, Insightful

    I think the main insight and contribution of the system is that the reputation of a peer according to you is determined by whether he/she votes in a similar manner as you.

    So if the RIAA starts spamming Gnutella with lots of junk stuff, you will never vote in the same way as the RIAA dummy accounts, and you don't take their votes into account.

    In fact, it seems the system is even smarter than that - it can take votes from people that are strongly uncorrelated with you and use that as negative information. So anything these people vote as valid files, you can treat as garbage as their definition of good/bad files is completely opposite to yours. And assuming you trust your own judgement, that means those files must be bogus.
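
    A small sketch of using strongly opposed voters as negative information, assuming each peer already has a weight in [-1, 1] describing how its past votes correlate with yours. The weights, peer names and the simple weighted sum are illustrative assumptions, not Credence's real formula.

```python
def estimate_file(weights, votes):
    """Weighted opinion about one file.

    weights: dict peer -> correlation with my own votes, in [-1, 1].
    votes:   dict peer -> +1 (says the file is good) or -1 (says it is bad).
    Peers without a known weight count as 0, i.e. they are ignored.
    A positive result suggests a good file, a negative one a bogus file.
    """
    return sum(weights.get(peer, 0.0) * vote for peer, vote in votes.items())


# Hypothetical weights learned from earlier votes.
weights = {"friend": 0.9, "riaa_bot1": -0.8, "riaa_bot2": -0.7}

# Three peers praise the decoy, one trusted peer flags it.
decoy_votes = {"friend": -1, "riaa_bot1": 1, "riaa_bot2": 1, "stranger": 1}
score = estimate_file(weights, decoy_votes)
print(score)
```

    Here the thumbs-up votes from the negatively correlated accounts push the score down rather than up, so their praise counts against the file.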

    Reminds me a lot of the google pagerank system, but with explicit learning/training instead of using back-links for determining correlation.

  8. Taking advantage of the hoarder mentality by hellfire · · Score: 4, Interesting

    Many hardcore file sharers and hosters, dare I say most that would call themselves hardcore, are not in it for getting free content on demand when they want it. They are into collecting absolutely anything and everything they can get their hands on. In some collections, people wouldn't possibly, in their lifetimes, be able to listen to all the music or watch all those movies. But just the thought of having it makes many hoarders happy. And it's not even necessarily reputation amongst others. It could be in many cases, but not always. They just have to have it.

    What's my point? Well, this is the greatest strength and weakness of peer to peer. Hoarders ensure a healthy flow of files, but they rarely actually check what they have. They don't check to see the software works, or if the music is a complete copy, or that the movie was cut down to a quarter of the original screen size.

    This is what companies take advantage of, both those who want to hurt swapping, and those who just want to seed files for the purpose of installing some evil spyware. It's nice to have a bunch of people trying to seed the masses, but c'mon, the point of file sharing is to pool our independent resources. For someone who doesn't have all day to search for files and test quality and whatnot, it is sometimes less painful to just go buy the CD than it is to actually try to download it amongst the mess of files that are out there.

    --

    "All great wisdom is contained in .signature files"

  9. Huh by TCM · · Score: 5, Insightful

    Who actually searches for files in the P2P client? Normally you visit some site where the releaser himself posted a torrent or an ed2k link and you download that.

    I can't remember the last time I actually searched in eMule.

    --
    Of course it runs NetBSD. BTC: 1NT7QvbetmANwaMzhpVL6
  10. Litigation index by xixax · · Score: 4, Interesting

    Can this also be used as a metric for the RIAA and MPAA to decide which people to take legal action against? Go for the most trusted, most highly rated individuals and take out the most influential (central? critical?) nodes. In the same way that cliques of poisoners would stand out.

    Xix.

    --
    "Everything is adjustable, provided you have the right tools"
  11. Why is that AC post modded "Troll"? by Travoltus · · Score: 4, Interesting

    I disagree that these scientists are breaking any *legitimate* law, but if you accept as a premise that they are, then they are in fact breaking the law using taxpayer dollars.

    Instead of modding that down it should be modded up so more people can discuss the ramifications.

    Do we allow taxpayer dollars to be spent on civil disobedience? On that issue, I am very unsure.

    --
    --- Grow a pair, liberals... stop letting the Republicans bully you!
  12. Re:Self-policing is needed by EvanED · · Score: 4, Insightful

    But what the parent is saying (and which is a very legit argument if you ask me) is that if you're looking for a Debian repository, you're almost certainly not going to find a fake file!

    If you want to be sure, you can compare the file size to the official one. If it matches, you can be all but completely confident that it's real.
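
    A minimal sketch of that check. The size comparison is what is described above; the optional SHA-256 comparison is an extra assumption for the case where the official site also publishes a checksum. The function name and the demo file are made up.

```python
import hashlib
import os
import tempfile


def matches_official(path, official_size, official_sha256=None):
    """Return True if the file at `path` matches the published size and,
    when a checksum is supplied, the published SHA-256 hex digest."""
    if os.path.getsize(path) != official_size:
        return False
    if official_sha256 is None:
        return True
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB blocks so large images do not have to fit in memory.
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest() == official_sha256


# Demo with a throwaway file standing in for a downloaded image.
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "demo.iso")
    with open(demo_path, "wb") as handle:
        handle.write(b"pretend this is an install image")
    size_ok = matches_official(demo_path, 32)
    size_bad = matches_official(demo_path, 33)
    print(size_ok, size_bad)
```

    A matching size only shows that the file is plausibly the right one; a matching checksum is much harder to fake.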

    After all, there are probably far fewer people trying to flood P2P with bogus files just for the hell of it than there are trying to flood P2P with bogus files in an attempt to protect copyright.

  13. Re:Torrents can be bogus too. by Spudds · · Score: 4, Insightful

    And I don't see why they'd bother, when a threatening letter is all it usually takes to take a torrent site down

    That's not really true. Depending on where the site is hosted, legal threats could be more humorous than scary.

    Case in point.

    Btw, if you've got a few minutes to kill, you should really check out some of the emails to and responses from thepiratebay.com. They are hilarious!