Google's Manual For Its Unseen Human Raters
concealment writes "It's widely believed that Google search results are produced entirely by computer algorithms — in large part because Google would like this to be widely believed. But in fact a little-known group of home-worker humans plays a large part in the Google process. The way these raters go about their work has always been a mystery. Now, The Register has seen a copy of the guidelines Google issues to them."
"For relevance raters are advised to give a rating based on "Vital", "Useful", "Relevant", "Slightly Relevant", "Off-Topic or Useless" or "Unratable"."
Hmmm, sounds like Slashdot. Anyone unemployed?
"It puts a good rating in the bin or else it gets the hose again"
What political party do you join when you don't like Bible-thumpers *or* hippies?
So I really can make $5000/month as a single mom?
I'd really like to see the algorithm that plays down the significance of pornography traffic. I mean what is the percent of traffic on the net related to porn and what is its percent represented on Google?
Just wondering.
First off, didn't read the article. Yeah, I said it. So if the article dispels this just ignore me.
What if Google actively uses the human ratings as a comparison/benchmark against which they measure those fancy algorithms? In other words, the users are rating the algorithms more than they are the websites. Makes sense they would improve search results algorithms, a highly technical and scientific method of ranking sites (which is of little use to a human in and of itself), by constantly striving toward an unscientific and untechnical (e.g. "quality") method... humans... which after all is, you know, who uses the engine in the first place.
Amazon probably does the same to improve their suggestions model.
Truckin like the Doo-Dah man...
This is only believed by people who haven't thought about it very hard.
At an abstract level, it makes no sense to think that computer code can be optimized to perform a task without any human intervention. The reason is simple: the task we want the code to perform is always something that a human cares about. So, somehow we need a human to instruct the computer about the goals. This can take the form of a programmer meticulously coding the entire thing, with a particular human-relevant goal in mind. Or it can involve non-programmers providing feedback about how well the software is doing at its stated goal (depending on context, these people may be testers, evaluators, users, taggers, etc.).
More specifically, in the case of AI-software, a typical procedure is to have a store of 'pre-tagged' training examples. These are examples of problems, with associated 'correct' answers. The training data is used to optimize the AI algorithm: the software can tweak its behavior in order to maximize accuracy of output on the training examples, with the hope that this will then generalize to general use. For something like web-search, where the goal is to make a human end-user happy with the quality and relevance of the results, of course you need humans to assess the quality of the algorithmic results. This is the only way to keep the results relevant. (For search results, this is a continual and iterative process, since the web constantly changes, people are trying to game the system, etc.)
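To make the 'pre-tagged examples' idea concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the two signals (keyword_match, link_score), the example values, and the single tunable weight are my assumptions, not anything Google is known to use. It only shows the general shape: try different settings and keep the one that agrees best with the human labels.

```python
# Hypothetical training examples: (keyword_match, link_score, human_says_relevant).
# The numbers are invented purely for illustration.
TRAINING = [
    (0.9, 0.2, True),
    (0.8, 0.7, True),
    (0.3, 0.9, False),
    (0.2, 0.1, False),
    (0.7, 0.4, True),
    (0.4, 0.8, False),
]


def predict(weight, keyword_match, link_score, threshold=0.5):
    """Blend the two signals and call the page relevant above a threshold."""
    score = weight * keyword_match + (1.0 - weight) * link_score
    return score >= threshold


def accuracy(weight, examples):
    """Fraction of examples where the prediction matches the human label."""
    hits = sum(predict(weight, k, l) == label for k, l, label in examples)
    return hits / len(examples)


def tune_weight(examples, steps=10):
    """Grid search: keep the weight that agrees best with the human labels."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda w: accuracy(w, examples))


best = tune_weight(TRAINING)
print(best, accuracy(best, TRAINING))
```

A real system would have far more signals and would check the tuned settings against examples it was not trained on, which is the "hope that this will generalize" part.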
Thus, it's probably better to think of these raters as providing input for evaluating and refining the search algorithms; rather than thinking of them as people who get to uniquely decide the rank of pages. Obviously they will have an influence on the rank of the pages they rate, but overall they are evaluating a rather tiny fraction of the web-pages in the Google database. Thus, when you perform an arbitrary web-search, chances are the results you are seeing are purely algorithmic (none of the listed results were manually rated/adjusted by anyone).
Apparently, if this is the case (which it probably is because Google's algorithms aren't AI), the tech sector needs a lot better rating.
For instance, do a search for a particular model of laptop. The results you get are of course mad online retail shops, but you also get a BUNCH of sites that have nothing to do with the product you searched. They put the names / models in META tags and in hidden or font-size-reduced areas of the page, but the actual page contents itself is just a bunch of crap that has nothing to do with laptops or laptop parts. It's just a bunch of random crap.
Point being, these aren't weeded out very well. Unfortunately, I don't have an example right now, but I know of one that has been in existence for years and still ranks in the top 5.
Oh, and the above is dwarfed by software name / functionality searches 10-1!
"Don't be evil."
I always liked that motto, but I don't think it's that simple.
When something starts out, its goal is just to get good at what it does.
As time goes on, that role gets more complicated. Questions of profit, making employees happy, the legal department, marketing and monetizing come into play.
Over time every business becomes the same business when it gets to a certain size. It doesn't matter whether it intends to be evil or not.
It's not just limited to business. When your church, anarchist group, drug dealer network, friend group, gay softball league or underground terrorist cell reaches that same size, it, too, gets bloated and useless.
It's part of the human condition. The only thing that seems to be able to save it is charismatic, autocratic leaders like Steve Jobs.
Does Google have one of those?
Is it Google's time -- did Google get too big to not be evil?
The antitrust lawyers are closing in and the competition is getting ready to amp it up. I used Bing recently, and it was nearly as good as Google. I use DuckDuckGo on a regular basis, and it's about 1/2 to 3/4 of the way there.
Could it be the end of an era?
from the biased article:
It's amazing how the image Google likes to promote - and politicians believe - one of high tech boffinry and magical algorithms, contrasts with the reality. Outsourced home workers are keeping the machine running. Glamorous, it isn't
i know two people who worked at google and met lots of their other co workers. this concluding statement of the article is total overgeneralization. this sounds like a writer who has NEVER been an engineer. he is painting a marketing slur on ENGINEERING OPERATIONS.
my two cents.
Glamorous, it isn't. ®
Still, better search results...
This was actually listed last year on several black/grey hat SEO websites to help dissect how google functioned. The upside is that with this wider exposure, google may change its policies a little.
Psh. His mom can suck 12 cocks an hour minimum. 6 is amatuer hour speed.
I've actually been a Google rater. I spent about 2 years total doing it--long enough to become a 'moderator' who ensures quality feedback from other raters--in between, and supplemental to, "real" jobs. Raters give feedback on lots of Google services but it falls into two buckets: ranking the quality of legitimate results, and learning to spot the "spam".
Legit results are easy. Spam is more interesting. For one thing, I didn't entirely agree with their definition of what spam was--that's part of the reason you still see spammy results in some searches. The other part of course is that the spammers are constantly changing tactics. But it was actually kind of fun learning to spot the various methods spammers can use, and know that I was helping to improve search results by getting them off the front page (and hopefully off the top 100 pages).
But I always assumed that rater feedback was used to judge and adjust The Algorithm rather than individual page results. The Algorithm has always been king at Google.
"The way these raters go about their work has always been a mystery."
Not really. Anyone with half a brain could get to the second level of the work-from-home LeapForce exam, which is when they issue this guide. Nothing here is a secret or mystery.
If a page with a good manually set rating points to another page that other page should enjoy a good rep too. Perhaps for several "degrees of separation".
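Something like this, as a rough Python sketch. To be clear, this is just my idea spelled out, not how Google does it: the function name, the decay factor and the hop limit are all invented. A manually rated page passes a reduced share of its rating to the pages it links to, for a few "degrees of separation".

```python
from collections import deque


def propagate_rating(links, seed_ratings, decay=0.5, max_hops=3):
    """Spread manually set ratings along outgoing links.

    links: dict mapping a page to the list of pages it links to.
    seed_ratings: dict mapping manually rated pages to their rating.
    Each hop multiplies the rating by `decay`; a page keeps the best
    rating that reaches it.
    """
    scores = dict(seed_ratings)
    queue = deque((page, rating, 0) for page, rating in seed_ratings.items())
    while queue:
        page, rating, hops = queue.popleft()
        if hops == max_hops:
            continue  # too many degrees of separation, stop here
        passed = rating * decay
        for target in links.get(page, []):
            # Only update (and keep walking) if this improves the target.
            if passed > scores.get(target, 0.0):
                scores[target] = passed
                queue.append((target, passed, hops + 1))
    return scores


# Hypothetical link chain: a -> b -> c -> d -> e, with only "a" rated by hand.
chain = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"]}
print(propagate_rating(chain, {"a": 1.0}))
```

With the defaults, "e" gets nothing because it is four hops away from the rated page. The obvious weakness is that spammers would chase links from rated pages, so the decay would need to be steep.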
I constantly search for things, and a good half the time, *maybe* a third are relevant. Then there's the times where it completely ignores my conditions. For example, I've searched for a blazer with -ladies, because, duh, I only want men's, and I get hits that explicitly, in the title, say "ladies".
I won't even *mention* Target, who *always* claims to have whatever you're looking for in a sponsored ad on the side, and doesn't....
mark
Well I guess... raters gonna rate.
To say nothing of other available orifices.
"One thing I think the SEO community is missing is that this program has nothing to do with SEO or rankings. What this program does is help Google refine their algorithm. For example, the Side-by-Side tasks show the results as they are next to the results with the new algorithm change in them. Google doesn’t hire these raters to rate the web; they hire them to rate how they are doing in matching users queries with the best source of information."
Source: http://searchengineland.com/interview-google-search-quality-rater-108702
"Ideal candidates will be highly active users of Google's search engine and other products; use Google Play at least once per week; use Google+ more than once per month and have more than 11 people per circle and have a Gmail account with web history turned on."
Source: http://www.leapforceathome.com/qrp/public/job/32
There was a period of a couple of years when a web page hosted on my ISP's freebie 15 megabytes of web space was the top hit for a particular Google search. It was a good page--a lay discussion of a technical topic--and I enjoyed the ego boost, but I always wondered why since I was not aware of its being linked from anywhere, let alone any high-traffic or high-credibility page. Now I think I know.
(I have since contributed that page's content to Wikipedia. The article has evolved with contributions from others but is still very recognizably mine... and I recently received the left-handed compliment of an angry email from someone who'd stumbled across my own web page and complained that I had plagiarized it from Wikipedia!)
"How to Do Nothing," kids activities, back in print!
Parent post is correct. Pages are not evaluated by people, but rather by an algorithm. And search results are not produced by people, but rather by an algorithm. But the algorithms don't magically appear. Those algorithms are written by people. But even smart people with good intentions cannot know if an algorithm is going to produce a good result or not. And this is where the human raters come into the picture. Their job really is to evaluate the variations of the algorithms introduced by the developers, to ensure that improvements to the algorithm make it through and other changes do not.
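A rough Python sketch of that gatekeeping step, assuming raters compare the current results against a candidate algorithm's results for the same query and say which side is better. The labels ("candidate", "current", "same") and the win-rate cutoff are my own placeholders, not anything taken from the leaked guidelines.

```python
from collections import Counter


def side_by_side_verdict(judgments, min_win_rate=0.5):
    """Decide whether an algorithm change makes it through.

    judgments: one label per rated query, each "candidate", "current"
    or "same" (hypothetical labels). Ties ("same") are ignored; the
    change ships only if the candidate wins more than `min_win_rate`
    of the decided comparisons.
    """
    counts = Counter(judgments)
    decided = counts["candidate"] + counts["current"]
    if decided == 0:
        return "keep current"  # no rater saw a difference either way
    win_rate = counts["candidate"] / decided
    return "ship candidate" if win_rate > min_win_rate else "keep current"


print(side_by_side_verdict(["candidate", "candidate", "current", "same"]))
```

Note that the raters never touch an individual page's rank here. They only vote on which version of the algorithm did better.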
Do you care about the security of your wireless mouse?
Hey Slashdot! Why am i moderating for free, eh? Shoooow meeeeeee the monay!!!
Trolling is a clear practice of trying to provoke response. If you look at my comment history, you see a series of contributions that stimulate discussion.
You calling them a troll suggests you're personally threatened by them. That's fairly typical internet behavior.
Addressing your passive-aggressive question of relevance, yes, this is relevant: Google is claiming it's an automated system, but it's using human beings to fudge the results, much like it prioritizes results like Wikipedia to ensure that an answer is always forthcoming.
That's not "don't be evil." That's outright deceptive.
I'm sorry that wasn't obvious to you but given that you called me a troll for it, I suspect PEBKAC on your end.
It seems to me that Bing has improved a lot, and I've watched DuckDuckGo improve quite a bit. Google may be dipping a bit as it tries too hard to localize and customize search results.