Ask Slashdot: Best Practices For Using a Reputation Engine To Rate Information?
GrantRobertson writes: For my graduate project, I am considering developing a web engine designed around sharing and organizing actual information in a way that people would actually like to and easily be able to use it. Unlike a wiki, the information will be much more granular with lots more metadata and organization. Unlike a web forum, the information will be organized rather than dispersed throughout thousands of random posts, with little room for dominant personalities to take over. While I like Stack Overflow, I am planning far more structure. While I enjoy the entertaining tangents on Slashdot, I don't want those to take over sites created using my engine. Naturally, there must be some way to prevent armies of bots or just legions of jerks from derailing web sites created using this engine. Given that, what would you say are some good rules to include in the reputation engine for such a site. What kinds of algorithms have you found to be most beneficial to the propagation and spread of actual knowledge. What would you like to see and what have you found to be dismal failures?
you are counting on Slashdot to do your graduate project for you? That is a horrible idea in so many ways...
Pretty much any set of algos is going to be easily defeated by humans trolling and no system is going to be anything near perfect. My thoughts:
1) Create a small set of simple, concise rules that are inviolate
2) Have a system so people can mark submissions as good (no rules broken/useful) or bad (rules broken)
3) Have your referees do nothing but determine if that submission is breaking one of your rules
4) Base your user trust on how the user's votes compare with the referees' rulings
The theory is that any controversial submission is going to get flagged and draw the referees' attention. Their job is limited in scope to just determining if the post breaks the site rules or not, nothing to do with quality / content / opinion. If users are trying to game the system, their votes are going to conflict with the referees' rulings, so their user trust is going to go down, whereas if they agree with the referees their trust is going to go up.
Eventually you'll have a group of users that you can generally trust to do the right thing so you can weight their actions accordingly.
Obviously there are some weaknesses:
- Referees are pretty much god (that's why the scope of their power is extremely narrow and simple)
- You can end up with hive mind (though you can combat that if enough trusted users conflict with other trusted users). I'd argue it's a way better protection than pure crowdsourcing à la reddit, where the demographics crush submissions into hivemind
Just tossing that out there off the top of my head. It's not something to replace automated reputation management, just something to augment it and limit some of the abuse.
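The referee scheme in this comment could be sketched roughly as follows. This is only an illustration: the `TrustLedger` class, the neutral starting trust of 0.5, and the update step of 0.1 are all assumptions, not anything the commenter specified.

```python
from dataclasses import dataclass, field


@dataclass
class TrustLedger:
    """Per-user trust derived from agreement with referee rulings (hypothetical design)."""
    step: float = 0.1                          # assumed update rate
    trust: dict = field(default_factory=dict)  # user -> trust in [0, 1]

    def get(self, user):
        # Assumed neutral starting trust for users with no history.
        return self.trust.get(user, 0.5)

    def settle(self, flags, referee_says_broken):
        """flags maps user -> True if that user marked the submission as rule-breaking."""
        for user, said_broken in flags.items():
            # Move trust toward 1 on agreement with the referee, toward 0 on conflict.
            target = 1.0 if said_broken == referee_says_broken else 0.0
            old = self.get(user)
            self.trust[user] = old + self.step * (target - old)

    def weighted_verdict(self, flags):
        """Trust-weighted share of voters who marked the submission as rule-breaking."""
        total = sum(self.get(user) for user in flags)
        if total == 0:
            return 0.0
        bad = sum(self.get(user) for user, said_broken in flags.items() if said_broken)
        return bad / total


ledger = TrustLedger()
for _ in range(20):
    # The referee rules "broken" each time; alice agrees, troll always disagrees.
    ledger.settle({"alice": True, "troll": False}, referee_says_broken=True)
```

After enough rulings, a user who keeps voting against the referees carries almost no weight in `weighted_verdict`, which is the "group of users you can generally trust" effect described above.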
Should someone inform their thesis advisor that they are getting others to do his work for him?
Isn't the whole point of thesis work that you find some novel solution to a problem through your own research, not by enlisting others to do it for you?
Since you want Slashdot to do your graduate project for you, save yourself the time and read up on "neural" networks and deep learning. You'll learn how to bullshit anything after that.
"Unlike a wiki, the information will be much more granular with lots more metadata and organization."
Pretty sure the ideals behind a semantic web were supposed to cover this part. Never really took off though because, I think, people are too lazy to sort data to that degree of detail and the algorithms necessary to process and categorize human text with that level of granularity seem to be very hard to make.
Very ambitious, and it might be impossible without strong AI controlling it, as humans can't be trusted to seek or even know the truth as a general population (assuming you can teach an AI the difference between good data and bad). You can't write an algorithm for truth, as "truth" is defined by the perspective of the observer (not all truth is truth; it depends on which side of the gate you're on).
The closest we as a community have come to "true / false" is by using large numbers of people to "rate" information, but sadly this is also open to bias and to manipulation, including through rewards. This is true of all stored information in any form.
So what you need is a way to sort good data from bad data... Quantity of supporting evidence is one way, but that can be manipulated by spamming; ranking can be manipulated by duping or botting; mod voting (where one person's support gains more value based on community input) can be bought or traded (*cough* wikipedia); and authoritative control (based on credentials) can be very biased based on the beliefs of the holder or their educational center. You need another way that doesn't exist yet.
It didn't make sense 50 years ago, it doesn't make sense today.
If any of these hammerheads could do that, they'd be raking in millions, and certainly not sharing what they know on Slashdot.
Let us know how it turns out.
Look into mTurk (Mechanical Turk). Amazon doesn't provide a reputation engine, but anyone who posts any significant number of jobs there has some kind of version of it. I worked for several years on a project that integrated with mTurk and had its own reputation engine. There are a lot of gotchas where people try to game the system. It isn't a simple answer, and I don't believe there is one solution for all situations. Bill
Don't.
The only reputation system that will ever beat legions of jerks will have to be able to determine if the information itself is correct. When dealing with jerks, you need to remember they are humans, the most cunning and devious of superpredators. Jerks will build a good reputation by giving good answers just to destroy the reputations of others or build up the reputation of jerks that give bad answers. No system you come up with will be infallible.
Anons need not reply. Questions end with a question mark.
...the power of jerks!
Artificial intelligence is no match for natural stupidity!
Sounds like what I'm trying to do here (AGPL): OneModel.
It doesn't have all the features, but what you describe is partly there, or planned for the future, though for now it's in the form of a text-only UI and you have to install postgres. The UI is something like a mix of git's "commit --interactive" and gopher (remember that, anyone?), but it is very efficient if you just read the screen and are a touch typist. Probably currently most suitable for someone who now uses emacs org-mode, or collapsible outlines of any kind, but wants to handle richer kinds of information (eg, GTD...) and a more task-specific UI.
It's what I use as my own personal organizer and knowledge manager, but ~"sharing" features for collaboration, including reputation and others, are on the wish/plan list. Feel free to use it as a starting point, or join the list for discussion. I was hoping to get the web site updated with a later binary and an enhancement, and much more information on my future plans, by roughly next week. It still lacks a convenient installer but the INSTALLING file in github is current.
If interested you could always get on the announcements list for when I add features. My health isn't great at the moment but I hope to be able to sell binaries or installers in the future for part-time income or the like. Patches or discussion on the list are welcome. I have been thinking hard about this since about 2000 and am glad to finally have something others can use, though the potential audience will be larger once there are better installers and other needed features, UIs etc.
A Free, fast personal organizer for touch typists: onemodel
Feature-wise, OM is more of what you describe for the structure of the information, right now as a personal organizer. It doesn't address the reputation question but that is definitely something I've been thinking about and seems like a fit, long-term. Nearer-term is being able to integrate data across individual instances, with reputation being a closely-related issue.
"While I like Stack Overflow, I am planning far more structure."
More? Good grief. SO is already bad enough. Anything 'more' will simply chase users away, if they ever go there in the first place.
Easy, make all links from Fox News have +1000000 or higher, CNN -100000 or lower and MSNBC -9999999 or lower.
I just realized there's a startup bug for first-time users; I'll try to fix that & post back here in a few hours or later tonight. (sorry for not being better prepared but this seemed like an opportunity to share something useful.)
I have a brilliant answer to your question. But it seems like you want it answered for a big shiny price of "free". I'll keep it to myself. Oh, and if you are thinking of having a contest and hope to get my idea without actually paying for it (and no, having a contest is not it), you can forget about it. I won't submit to any such contest. If you want data analytics ideas start paying people who spend time of their lives learning how to do data analytics.
Any guest worker system is indistinguishable from indentured servitude.
Allow multiple self-selected groups to provide ratings and let the reader select which rating system to use. The opinions of some raters are more important than those of other raters.
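A minimal sketch of reader-selected rating groups, assuming ratings are stored per group and per item. The function name `score_for_reader`, the data layout, and the optional per-group weights are hypothetical choices made for illustration.

```python
from statistics import mean


def score_for_reader(ratings, item, chosen_groups, group_weights=None):
    """Average an item's ratings using only the groups this reader selected.

    ratings: {group: {item: [scores]}} (assumed layout)
    group_weights: optional {group: weight}; unlisted groups default to 1.0
    """
    group_weights = group_weights or {}
    weighted_sum = 0.0
    weight_total = 0.0
    for group in chosen_groups:
        scores = ratings.get(group, {}).get(item)
        if not scores:
            continue  # this group has not rated the item
        weight = group_weights.get(group, 1.0)
        weighted_sum += weight * mean(scores)
        weight_total += weight
    # None signals that none of the chosen groups rated the item.
    return weighted_sum / weight_total if weight_total else None


ratings = {
    "physicists": {"post1": [4, 5]},
    "hobbyists": {"post1": [1, 2]},
}
only_physicists = score_for_reader(ratings, "post1", ["physicists"])
blended = score_for_reader(
    ratings, "post1", ["physicists", "hobbyists"], {"physicists": 3.0}
)
```

Because each reader picks the groups, no single crowd decides the score for everyone, and the weights let a reader count some raters more heavily than others.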
Well, we’ve all known for some time that Slashdot could stand to have a better reputation engine of some sort, just to filter out most of the kinds of comments I’m getting here. Be that as it may, I will try to have a conversation with the actual thinking individuals who still come here, over the noise of the trolls.
In answer to some of the protests:
If anyone thinks a few opinions, randomly thrown around, here on Slashdot can, in any way, shape, or form, constitute the bulk of the work for a graduate project, then said person has no clue as to how much work a real graduate project can be.
In any research project, it is best to gather as many ideas and opinions as possible. Only a fool would assume someone is fool enough to let Slashdot be their be-all-end-all source of information. I also have a friend who is a Research Fellow in HCI at PARC, who I have hit up for ideas and/or connections to fellow researchers. You know, it's good to get input from both ends of the academic spectrum. ;^)
The reputation engine for this project is merely an ancillary, but necessary, accessory to the real project, which is the knowledge sharing and organization system.
Any attempt to compare what I am doing within my knowledge system to some existing system, based upon the small amount of information I have provided here, is doomed to just look ridiculous. The only reason I am providing any information at all about the actual project is to provide some perspective as to the direction the reputation engine portion should take. A reputation engine for an opinion-based site, such as this one, would necessarily have a different algorithm from one designed for collecting and organizing actual information.
With all that said, based upon the general cluelessness exhibited by most web-developers and many of our "helpful" friends here on Slashdot, it seems the question of how best to design a reputation engine would be quite a viable research topic in and of itself.
Finally, anyone who thinks insulting Slashdot is a BAD THING just hasn't been on Slashdot long enough. Between the trolling trolls and the mooing cows (which I love, BTW), getting to any useful information can be a roller-coaster ride. But occasionally, the grown-ups win out and one can find some real gems. It's worth a shot, right?
At kr5ddit.com.
Instead of one user one vote, we have one kr5ddit, one vote.
We use kr5dditz, which are like karma, to determine how much you can moderate. You earn kr5dditz by moderating and by being moderated. You can also buy and sell kr5dditz on our exchange for bitcoin.
I believe that this system should be robust in the face of sock puppets and bad actors... but time will tell.
Anyway, feel free to pop over and register, and talk with me about how it works. The site is under development, so lots of stuff is still very rough, and it is missing features I still plan to add, etc. Also, I've limited new user signup to about one every two hours or something... so, if you get rejected because of too many new users, please try again in a few hours.
For older people who do know this song and young'uns who need to become acquainted with it, and, indeed, the whole of his canon: https://youtu.be/gXlfXirQF3A Happy Whatever.
On y va, qui mal y pense!
"For my graduate project, I am considering developing a web engine designed around sharing and organizing actual information in a way that people would actually like to and easily be able to use it".
It depends on the quality of the posters; you should aim for something like news.ycombinator.com
No. He is asking users for features and characteristics that said users would find advantageous for a web engine that accumulates and organizes web data.
Users... aren't they those things that bitch about how Open Source happens to work, and then don't contribute patches back to address those complaints?
(NB: Not precisely my view, but it's going to be the typical view of most people).
Yes, I had been thinking of showing reputation scores going from 1 to 10 but allowing the internal, hidden, weights to go much higher. Research has shown it is effective to encourage novices, as in rewarding them with increasing reputation scores, but that has diminishing returns. Once people become more skilled, they respond better to specific constructive criticisms. So, I was thinking that, if someone wanted to downvote a contribution, they should have to give a specific reason that is shown only to the original contributor and moderators. Then the contributor can respond only by either editing their original contribution and clicking a button that says, "Is this what you meant?" or by rejecting the criticism and indicating if it is a troll or simply a non-preferred edit. If flagged as a troll, the criticism may be reviewed by moderators or other trusted users. If the criticizer feels strongly enough about their suggestion, they can make their own contribution that can get voted up or down, or criticized on its own.
By making these criticisms private, I think it will remove a lot of the motivation for trolling.
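One way the 1-to-10 display over a larger hidden weight, mentioned above, might work is a logarithmic mapping. The log scale is purely an assumption here (any saturating function would do), and `displayed_score` is a hypothetical name.

```python
import math


def displayed_score(hidden_weight, cap=10):
    """Map an unbounded hidden weight (>= 0) to a display score from 1 to cap.

    Assumed log2 scale: each extra display point needs roughly double the
    hidden weight, so early gains show quickly and later ones flatten out.
    """
    if hidden_weight < 0:
        raise ValueError("hidden weight must be non-negative")
    return min(cap, 1 + int(math.log2(1 + hidden_weight)))


examples = {w: displayed_score(w) for w in (0, 1, 3, 7, 511, 1_000_000)}
```

Novices see their score climb after only a few good contributions, while the hidden weight keeps growing past the point where the display tops out at 10.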
I've also been tossing around the idea of forcing the criticizer to select the portion of the contribution they want to criticize BEFORE entering the criticism. In most forums, Slashdot included, the respondent can choose whether to quote the original text. But then ALL the text is quoted and the respondent has to delete the irrelevant portions. Of course they almost never do. This can lead to more confusion, or just be tiresome to plow through.
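The private-criticism flow described in this comment, including the forced selection, might look something like the sketch below. All names (`Criticism`, `Status`, `file_criticism`) are hypothetical, and letting the critic see their own criticism is an added assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    OPEN = "open"
    REVISED = "revised"              # contributor edited: "Is this what you meant?"
    REJECTED = "rejected"            # contributor declined a non-preferred edit
    FLAGGED_TROLL = "flagged_troll"  # queued for review by moderators


@dataclass
class Criticism:
    critic: str
    contributor: str
    start: int   # selected span of the contribution, chosen before writing
    end: int
    reason: str
    status: Status = Status.OPEN

    def visible_to(self, user, moderators):
        # Private: only the contributor and moderators see it
        # (plus the critic, which is an assumption of this sketch).
        return user in (self.critic, self.contributor) or user in moderators

    def respond(self, user, new_status):
        if user != self.contributor:
            raise PermissionError("only the contributor may respond")
        if new_status is Status.OPEN:
            raise ValueError("a response must change the status")
        self.status = new_status


def file_criticism(text, critic, contributor, start, end, reason):
    """Create a criticism only if a span is selected and a reason is given."""
    if not (0 <= start < end <= len(text)):
        raise ValueError("select the portion being criticized first")
    if not reason.strip():
        raise ValueError("a downvote needs a specific reason")
    return Criticism(critic, contributor, start, end, reason)


text = "The sky is green."
crit = file_criticism(text, "carol", "dave", 11, 16, "The sky is blue.")
crit.respond("dave", Status.REVISED)
```

Requiring `start` and `end` up front means the criticism always points at a specific passage, so nobody has to wade through a fully quoted post to find the complaint.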
I would like to point out that DICE edited my headline, which was originally, "Reputation Engine - Best Practices for Information-Based Site?" The existing headline makes it appear as if I am trying to use the reputation engine to rate the actual information. Instead, I merely want the reputation engine to cut down on the number of jerks on the site and reduce the influence of trolls, bots, and crusading armies. Once that is accomplished, I trust the "good" contributors to provide good and relatively accurate content by working together and collaborating. I do not expect any reputation engine to get to some ethereal "Truth."
Step 1: Make a minimal version of this system you're thinking of.
Step 2: Add a meta section, and ask questions or write blog posts on this subject (how to make the site better), and implement some ideas/ use A/B testing....
Step 3: Profit !
Take a look at their trust metric. It has 3 levels (master, journeyman, apprentice) and is seeded with a few "master" users known to the admins. Then all users can certify other users at any level, but masters are the most trusted etc., and trust propagates through the user graph somewhat like pagerank. It's supposed to be difficult to game, though I don't know if anyone seriously tried even back when the site was lively. It's almost dead now, but the code is still around (written as an apache module in C, yow).
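To make the "seeded and propagates somewhat like pagerank" idea concrete, here is a rough sketch of a seeded, PageRank-style propagation over a certification graph. It is not the actual algorithm of the site mentioned above; the level weights, damping factor, and function name are all assumptions.

```python
# Assumed weights for the three certification levels.
LEVEL_WEIGHT = {"master": 3.0, "journeyman": 2.0, "apprentice": 1.0}


def propagate_trust(certs, seeds, damping=0.85, iterations=50):
    """certs: {certifier: {certified_user: level}}; seeds: admin-chosen users.

    Trust starts only at the seeds and flows along certifications, split in
    proportion to the certified level. Users unreachable from a seed get 0.
    """
    users = set(seeds) | set(certs)
    for targets in certs.values():
        users |= set(targets)
    base = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in users}
    trust = dict(base)
    for _ in range(iterations):
        nxt = {u: (1 - damping) * base[u] for u in users}
        for certifier, targets in certs.items():
            total = sum(LEVEL_WEIGHT[level] for level in targets.values())
            if total == 0:
                continue
            for user, level in targets.items():
                share = LEVEL_WEIGHT[level] / total
                nxt[user] += damping * trust[certifier] * share
        trust = nxt
    return trust


certs = {
    "root": {"alice": "master", "bob": "apprentice"},
    "sock1": {"sock2": "master"},   # sock puppets certifying each other
    "sock2": {"sock1": "master"},
}
trust = propagate_trust(certs, seeds={"root"})
```

In this toy graph the two sock puppets certify each other as masters but end up with zero trust, because nothing reachable from a seed ever vouches for them.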
Unless you solve or at least mitigate the big problem, don't implement a reputation engine.
Reputation engines lead to echo chambers if not properly managed.
You can only implement a popularity engine, not a reputation engine.
Although there are some algorithms that claim to detect trolls, avoiding pitfalls the way you want is not a possibility.