Ask Slashdot: Best Practices For Using a Reputation Engine To Rate Information?
GrantRobertson writes: For my graduate project, I am considering developing a web engine designed around sharing and organizing actual information in a way that people would actually like to and easily be able to use it. Unlike a wiki, the information will be much more granular with lots more metadata and organization. Unlike a web forum, the information will be organized rather than dispersed throughout thousands of random posts, with little room for dominant personalities to take over. While I like Stack Overflow, I am planning far more structure. While I enjoy the entertaining tangents on Slashdot, I don't want those to take over sites created using my engine. Naturally, there must be some way to prevent armies of bots or just legions of jerks from derailing web sites created using this engine. Given that, what would you say are some good rules to include in the reputation engine for such a site. What kinds of algorithms have you found to be most beneficial to the propagation and spread of actual knowledge. What would you like to see and what have you found to be dismal failures?
you are counting on Slashdot to do your graduate project for you? That is a horrible idea in so many ways...
Pretty much any set of algos is going to be easily defeated by humans trolling, and no system is going to be anything near perfect. My thoughts:
1) Create a small set of simple, concise rules that are inviolate
2) Have a system so people can mark submissions as good (no rules broken/useful) or bad (rules broken)
3) Have your referees do nothing but determine if that submission is breaking one of your rules
4) Base each user's trust on how that user voted compared to how the referees voted
The theory is any controversial submission is going to get flagged & referees attention. Their job is limited in scope to just determining if the post breaks the site rules or not, nothing to do with quality / content / opinion. If users are trying to game the system their votes are going to conflict with the referees so their user trust is going to go down, whereas if people agree their trust is going to go up.
Eventually you'll have a group of users that you can generally trust to do the right thing so you can weight their actions accordingly.
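The four steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested design: the class name `TrustLedger`, the smoothing prior that starts every user at a trust of 0.5, and the trust-weighted tally are all hypothetical choices that the post does not specify.

```python
from collections import defaultdict


class TrustLedger:
    """Tracks how often each user's good/bad marks agree with referee rulings."""

    def __init__(self, prior_agree=1.0, prior_total=2.0):
        # Assumed smoothing prior: a user with no history gets trust 0.5.
        self.prior_agree = prior_agree
        self.prior_total = prior_total
        self.agree = defaultdict(float)
        self.total = defaultdict(float)

    def record_ruling(self, user_votes, referee_says_rule_broken):
        """user_votes maps user id -> True if that user marked the post 'bad'."""
        for user, marked_bad in user_votes.items():
            self.total[user] += 1
            if marked_bad == referee_says_rule_broken:
                self.agree[user] += 1

    def trust(self, user):
        """Smoothed fraction of this user's marks that matched the referees."""
        return (self.agree[user] + self.prior_agree) / (
            self.total[user] + self.prior_total
        )

    def weighted_bad_score(self, user_votes):
        """Share of the trust-weighted vote that says the post is 'bad'."""
        weights = {user: self.trust(user) for user in user_votes}
        total = sum(weights.values())
        if total == 0:
            return 0.0
        bad = sum(w for user, w in weights.items() if user_votes[user])
        return bad / total
```

Users who repeatedly vote against the referees lose weight, so once enough rulings have accumulated the weighted score can be used to decide which submissions still need a referee's attention.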
Obviously there are some weaknesses:
- Referees are pretty much god (that's why the scope of their power is extremely narrow and simple)
- You can end up with hive mind (though you can combat that if enough trusted users conflict with other trusted users). I'd argue it's a way better protection than pure crowdsourcing à la reddit where the demographics crush submissions into hivemind
Just tossing that out there off the top of my head. It's not something to replace automated reputation management, just something to augment it and limit some of the abuse.
Should someone inform their thesis advisor that they are getting others to do his work for him?
Isn't the whole point of thesis work that you find some novel solution to a problem through your own research, not by enlisting others to do it for you?
"Unlike a wiki, the information will be much more granular with lots more metadata and organization."
Pretty sure the ideals behind a semantic web were supposed to cover this part. Never really took off though because, I think, people are too lazy to sort data to that degree of detail and the algorithms necessary to process and categorize human text with that level of granularity seem to be very hard to make.
Look into mTurk (Mechanical Turk). Amazon doesn't provide a reputation engine, but anyone who posts any significant number of jobs there has some kind of version of it. I worked for several years on a project that integrated with mTurk and had its own reputation engine. There are a lot of gotchas where people try to game the system. It isn't a simple answer, and I don't believe there is one solution for all situations. Bill
the only reputation system that will ever beat legions of jerks will have to be able to determine if the information itself is correct. when dealing with jerks, you need to remember they are humans, the most cunning and devious of superpredators. jerks will build a good reputation by giving good answers just to destroy the reputations of others or build up reputation of jerks that give bad answers. no system you come up with will be infallible.
Anons need not reply. Questions end with a question mark.
...the power of jerks!
Artificial intelligence is no match for natural stupidity!
Sounds like what I'm trying to do here (AGPL): OneModel.
It doesn't have all the features, but what you describe is partly there, or planned for the future, though for now it's in the form of a text-only UI and you have to install postgres. The UI is something like a mix of git's "commit --interactive" and gopher (remember that, anyone?), but it is very efficient if you just read the screen and are a touch typist. Probably currently most suitable for someone who now uses emacs org-mode, or collapsible outlines of any kind, but wants to handle richer kinds of information (eg, GTD...) and a more task-specific UI.
It's what I use as my own personal organizer and knowledge manager, but ~"sharing" features for collaboration, including reputation and others, are on the wish/plan list. Feel free to use it as a starting point, or join the list for discussion. I was hoping to get the web site updated with a later binary and an enhancement, and much more information on my future plans, by roughly next week. It still lacks a convenient installer but the INSTALLING file in github is current.
If interested you could always get on the announcements list for when I add features. My health isn't great at the moment but I hope to be able to sell binaries or installers in the future for part-time income or the like. Patches or discussion on the list are welcome. I have been thinking hard about this since about 2000 and am glad to finally have something others can use, though the potential audience will be larger once there are better installers and other needed features, UIs etc.
A Free, fast personal organizer for touch typists: onemodel
Feature-wise, OM is more of what you describe for the structure of the information, right now as a personal organizer. It doesn't address the reputation question but that is definitely something I've been thinking about and seems like a fit, long-term. Nearer-term is being able to integrate data across individual instances, with reputation being a closely-related issue.
While I like Stack Overflow, I am planning far more structure.
More? Good grief. SO is already bad enough. Anything 'more' will simply chase users away, if they ever go there in the first place.
I just realized there's a startup bug for first-time users; I'll try to fix that & post back here in a few hours or later tonight. (sorry for not being better prepared but this seemed like an opportunity to share something useful.)
I have a brilliant answer to your question. But it seems like you want it answered for a big shiny price of "free". I'll keep it to myself. Oh, and if you are thinking of having a contest and hope to get my idea without actually paying for it (and no, having a contest is not it), you can forget about it. I won't submit to any such contest. If you want data analytics ideas start paying people who spend time of their lives learning how to do data analytics.
Any guest worker system is indistinguishable from indentured servitude.
Allow multiple self-selected groups to provide ratings and let the reader select which rating system to use. The opinions of some raters are more important than those of other raters.
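A minimal sketch of that idea, assuming numeric ratings and a simple average within whichever group the reader picks. The function name `score_for_reader` and the data shapes are hypothetical; a real system could weight raters within a group instead of averaging them equally.

```python
from statistics import mean


def score_for_reader(ratings, group_members, chosen_group):
    """Average only the ratings given by members of the reader's chosen group.

    ratings: rater id -> numeric rating for one item
    group_members: group name -> set of rater ids who joined that group
    Returns None when nobody in the chosen group has rated the item.
    """
    members = group_members.get(chosen_group, set())
    selected = [rating for rater, rating in ratings.items() if rater in members]
    return mean(selected) if selected else None


ratings = {"a": 5, "b": 1, "c": 4}
groups = {"experts": {"a", "c"}, "skeptics": {"b"}}
print(score_for_reader(ratings, groups, "experts"))
print(score_for_reader(ratings, groups, "skeptics"))
```

The same item therefore shows a different score depending on whose judgment the reader has chosen to trust.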
Well, we’ve all known for some time that Slashdot could stand to have a better reputation engine of some sort, just to filter out most of the kinds of comments I’m getting here. Be that as it may, I will try to have a conversation with the actual thinking individuals who still come here, over the noise of the trolls.
In answer to some of the protests:
If anyone thinks a few opinions, randomly thrown around, here on Slashdot can, in any way, shape, or form, constitute the bulk of the work for a graduate project, then said person has no clue as to how much work a real graduate project can be.
In any research project, it is best to gather as many ideas and opinions as possible. Only a fool would assume someone is fool enough to let Slashdot be their be-all-end-all source of information. I also have a friend who is a Research Fellow in HCI at PARC, who I have hit up for ideas and/or connections to fellow researchers. You know, it's good to get input from both ends of the academic spectrum. ;^)
The reputation engine for this project is merely an ancillary, but necessary, accessory to the real project, which is the knowledge sharing and organization system.
Any attempt to compare what I am doing within my knowledge system to some existing system, based upon the small amount of information I have provided here, is doomed to just look ridiculous. The only reason I am providing any information at all about the actual project is to provide some perspective as to the direction the reputation engine portion should take. A reputation engine for an opinion-based site, such as this one, would necessarily have a different algorithm from one designed for collecting and organizing actual information.
With all that said, based upon the general cluelessness exhibited by most web-developers and many of our "helpful" friends here on Slashdot, it seems the question of how best to design a reputation engine would be quite a viable research topic in and of itself.
Finally, anyone who thinks insulting Slashdot is a BAD THING just hasn't been on Slashdot long enough. Between the trolling trolls and the mooing cows (which I love, BTW), getting to any useful information can be a roller-coaster ride. But occasionally, the grown-ups win out and one can find some real gems. It's worth a shot, right?
See, now here is some of that gold spoken of earlier. While I am somewhat familiar with deep learning, I hadn't thought of using it to mine trust information out of the entire database of comment and voting information. Possibly across an entire swath of associated sites.
Remember, this fuzzy stuff is not my strong suit. My real project is organizing hard information into hierarchies. Kinda the opposite of sussing out the real intent of something as mushy as anonymous internet users. I was just going to try to go with a few basic probabilistic algorithms with some simple rules and be done with it. This really WOULD be a good research project all by itself.
Thanks, Gilligan!
Actually, it did make sense. Still does. Most just can't make sense OF it, and it doesn't make it easy to sell eyeballs. However, that is yet a different project, which I may or may not have time to work on in my lifetime. I call it "Web 0.0"
At kr5ddit.com.
Instead of one user one vote, we have one kr5ddit, one vote.
We use kr5dditz, which are like karma, to determine how much you can moderate. You earn kr5dditz by moderating and by being moderated. You can also buy and sell kr5dditz on our exchange for bitcoin.
I believe that this system should be robust in the face of sock puppets and bad actors... but time will tell.
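To make "one kr5ddit, one vote" concrete, here is a rough sketch. It is not the site's actual implementation: the class `KarmaLedger`, the rule that moderating spends units from a balance, and the use of negative amounts for down-moderation are all assumptions made for illustration. Earning and trading are left out.

```python
class KarmaLedger:
    """Hypothetical ledger in which each unit of karma spent counts as one vote."""

    def __init__(self, balances):
        self.balances = dict(balances)  # user id -> units available to spend
        self.scores = {}                # post id -> net votes received

    def moderate(self, user, post_id, amount):
        """Spend abs(amount) units on a post; a negative amount moderates down."""
        cost = abs(amount)
        if cost == 0 or cost > self.balances.get(user, 0):
            raise ValueError("not enough karma to cast that many votes")
        self.balances[user] -= cost
        self.scores[post_id] = self.scores.get(post_id, 0) + amount
        return self.scores[post_id]
```

Under these assumptions, splitting a balance across several sock-puppet accounts does not create extra votes, because influence follows the units spent rather than the number of accounts.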
Anyway, feel free to pop over and register, and talk with me about how it works. The site is under development, so lots of stuff is still very rough, and it is missing features I still plan to add, etc. Also, I've limited new user signup to about one every two hours or something... so, if you get rejected because of too many new users, please try again in a few hours.
For older people who do know this song and young'uns who need to become acquainted with it, and, indeed, the whole of his canon: https://youtu.be/gXlfXirQF3A Happy Whatever.
On y va, qui mal y pense!
"For my graduate project, I am considering developing a web engine designed around sharing and organizing actual information in a way that people would actually like to and easily be able to use it".
It depends on the quality of the posters; you should aim for something like news.ycombinator.com
No. He is asking users for features and characteristics that said users would find advantageous for a web engine that accumulates and organizes web data.
Users... aren't they those things that bitch about how Open Source happens to work, and then don't contribute patches back to address those complaints?
(NB: Not precisely my view, but it's going to be the typical view of most people).
Yes, I had been thinking of showing reputation scores going from 1 to 10 but allowing the internal, hidden, weights to go much higher. Research has shown it is effective to encourage novices, as in rewarding them with increasing reputation scores, but that has diminishing returns. Once people become more skilled, they respond better to specific constructive criticisms. So, I was thinking that, if someone wanted to downvote a contribution, they should have to give a specific reason that is shown only to the original contributor and moderators. Then the contributor can respond only by either editing their original contribution and clicking a button that says, "Is this what you meant?" or by rejecting the criticism and indicating if it is a troll or simply a non-preferred edit. If flagged as a troll, the criticism may be reviewed by moderators or other trusted users. If the criticizer feels strongly enough about their suggestion, they can make their own contribution that can get voted up or down, or criticized on its own.
By making these criticisms private, I think it will remove a lot of the motivation for trolling.
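One possible way to show a 1 to 10 score while letting the hidden weight grow much larger is a logarithmic mapping, which also gives the diminishing returns mentioned above. This is just a sketch: the function name `displayed_score` and the ceiling of one million are arbitrary assumptions, not part of any settled design.

```python
import math


def displayed_score(internal_weight, max_weight=1_000_000):
    """Map an internal weight onto a 1-10 display scale, logarithmically.

    Weights below 1 display as 1; weights above max_weight display as 10.
    """
    clamped = min(max(internal_weight, 1), max_weight)
    fraction = math.log10(clamped) / math.log10(max_weight)
    return 1 + round(9 * fraction)
```

Early gains move the visible score quickly, which rewards novices, while an established contributor needs far more internal weight to climb each additional step.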
I've also been tossing around the idea of forcing the criticizer to select the portion of the contribution they want to criticize BEFORE entering the criticism. In most forums, Slashdot included, the respondent can choose whether to quote the original text. But then ALL the text is quoted and the respondent has to delete the irrelevant portions. Of course they almost never do. This can lead to more confusion, or just be tiresome to plow through.
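The private criticism workflow, including the select-first requirement, might be modeled along these lines. It is a sketch under assumptions: the names `Criticism` and `Response`, the character-offset selection, and the visibility check are hypothetical, and the moderator review step is reduced to a flag.

```python
from dataclasses import dataclass
from enum import Enum


class Response(Enum):
    PENDING = "pending"
    EDITED = "edited"                     # contributor revised: "Is this what you meant?"
    REJECTED_TROLL = "troll"              # may be reviewed by moderators
    REJECTED_PREFERENCE = "non-preferred edit"


@dataclass
class Criticism:
    critic: str
    contributor: str
    text: str     # full text of the contribution being criticized
    start: int    # selected portion is text[start:end]
    end: int
    reason: str
    response: Response = Response.PENDING

    def __post_init__(self):
        # The portion must be selected before any criticism is accepted.
        if not (0 <= self.start < self.end <= len(self.text)):
            raise ValueError("select a non-empty portion of the contribution first")
        if not self.reason.strip():
            raise ValueError("a downvote needs a specific reason")

    @property
    def quoted(self):
        """Only the selected portion is quoted, never the whole contribution."""
        return self.text[self.start:self.end]

    def visible_to(self, user, moderators):
        """Shown only to the original contributor and to moderators."""
        return user == self.contributor or user in moderators

    def needs_review(self):
        return self.response is Response.REJECTED_TROLL
```

Because the selection is validated up front, a criticism can never quote the entire contribution by default, which avoids the wall of irrelevant quoted text described above.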
I would like to point out that DICE edited my headline, which was originally, "Reputation Engine - Best Practices for Information-Based Site?" The existing headline makes it appear as if I am trying to use the reputation engine to rate the actual information. Instead, I merely want the reputation engine to cut down on the number of jerks on the site and reduce the influence of trolls, bots, and crusading armies. Once that is accomplished, I trust the "good" contributors to provide good and relatively accurate content by working together and collaborating. I do not expect any reputation engine to get to some ethereal "Truth."
SWEET!!! This is exactly the kind of reference I was hoping to receive. All the trolls were worth this one reference. This site, and the project members themselves, will be a vast goldmine of information and potential collaboration. I don't know if I ever would have come across this on my own.
Thank you so much.
Grant