Google Outlines the Role of Its Human Evaluators
An anonymous reader writes "For many years, Google, on its Explanation of Our Search Results page, claimed that 'a site's ranking in Google's search results is automatically determined by computer algorithms using thousands of factors to calculate a page's relevance to a given query.' Then in May of 2007, that statement changed: 'A site's ranking in Google's search results relies heavily on computer algorithms using thousands of factors to calculate a page's relevance to a given query.' What happened? Google's core search team explain."
In reality, this is why search engines like Wolfram Alpha, which lack Google's broad research and industry knowledge, don't stand much of a chance unless Google drops the ball.
Because the summary wasn't kind enough to give you the answer to the question, here it is.
Human evaluators (mostly college students) are trained in the art of validating a search engine result. They examine the results of their searches and determine which ones are the most relevant. For example, searching for the Olympics should yield information about the 2008 Olympics (or whichever is current) instead of the 1996 Olympics. Several reviewers frequently work on the same query results, so that Google can see how consistently they are rating websites.
The upshot of this is that it helps weed out those websites that are cheating the system and trying to get their website as the #1 Google hit so they can show you ads. So a large part of what they are doing is tracking spam websites, not real ones.
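For what it's worth, the consistency check described above could be as simple as counting how often raters agree on the same result. Here is a minimal sketch in Python, assuming each rater gives one label per result. The labels and the `pairwise_agreement` name are made up for illustration and are not anything Google has published.

```python
from itertools import combinations


def pairwise_agreement(labels):
    """Return the fraction of rater pairs that gave the same label.

    `labels` holds one label per rater for a single search result.
    This is a hypothetical consistency measure, not Google's method.
    """
    pairs = list(combinations(labels, 2))
    if not pairs:
        # Zero or one rater: nothing to disagree with.
        return 1.0
    return sum(a == b for a, b in pairs) / len(pairs)


# Three hypothetical raters looking at the same result for one query.
score = pairwise_agreement(["relevant", "relevant", "spam"])
print(f"agreement: {score:.2f}")
```

A low score on a result would flag either an ambiguous query or a rater who is out of step with the others.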
Google, for example, employs a vast team of human search "Quality Raters" ... Spread out around the world, these evaluators, mostly college students, review search returns ....
Well, THAT explains a lot of what happens when you set 'safe search' to 'off'...
How do the human evaluators compare to the pigeons they used before?
This reminds me of a comment from a friend of mine who works at Google - he says that he's gotten the sense of a company philosophy (unofficial of course) that advocates doing things automatically, without human intervention as much as possible. Basically, they work as though there's an algorithm for everything and it's just a matter of how long it takes us (well, how long it takes them) to produce it and properly refine it. So I wouldn't be surprised if the reliance on human evaluators decreases over time. I bet Google would really like for the original language of their search result explanation to be true, but they've had to make concessions to reality...
From the interview: "Well, the raters work in-country, so we don't see them every day."
Bastards. They keep us away as much as possible and fire you after one year so that you "don't create links".
Never heard anything from them til the day I copied Larry Page and Sergey Brin on my emails.
(Deleted without read btw, but was funny to see a full page reply and copy to a bunch of cowards)
This should be the official webpage for this project: http://www.google.com/technology/pigeonrank.html
In this day and age, it's hard to cut humans out of the loop when it comes to tasks like this. Search is still a very young technology and it seems like it gets tweaked on a daily basis. With every tweak comes the testing, and what better to test software for humans than, well, humans...
I don't want no trouble here, but the advertising on this site sucks. This is about how people respond to those arrows they have in the search results now, right? *Checking* No, it turns out to be un-feckin-related to any sudden philosophical shifts within the company... and then some stuff about the people that work at the GOOG... as you resume to call it, you ironic scuttlemonkeys.
Same for Google Ads via "Q-score": http://www.win-vector.com/blog/2008/06/how-market-designs-set-prices/
JP: So are these raters college students or random folks responding to a job post? What are the requirements?
SH: It's a pretty wide range of folks. The job requirements are not super-specific. Essentially, we require a basic level of education, mainly because we need them to be able to communicate back and forth with us, give us comments and things like that in writing.
Funny how the introduction restates the interviewer's preconception even though the actual interview implies otherwise.
I have been in the program from almost the very beginning, and I am glad they are finally being frank and open about it. Some more comments and caveats first: ;-)
In those initial months we were mainly dealing with spam, but recently even that is not so present.
-The reason they do not pinpoint sites has to do with the entire structure of the reviewing process: we look at a certain page from the perspective of a concrete search term and its relevance to it, which is a good compromise. Also, you can get good content AND spam at the same time.
Altogether, after nearly two years in it, the terms we are monitoring haven't changed drastically, and it can be boring from time to time, but otherwise you get to see some really weird things people type into the search field.
-Recently I was both happy and pissed off at how the focus of their work changed: dumbing down. Simpler and simpler explanations and help for the raters, so no surprise.
-Oh, yeah, one more thing. The leaked Guidelines are way beyond old, so not much use for reverse-engineering and helping the SEO guys. Good luck with that :)
-As with anything modern in IT, people sign Non-Disclosure Agreements (NDAs), so not a lot can be said from within the circle without breaking their terms. Having read the interview, I see the chief has also pretty much kept it this way, except for the terms that are already publicly disclosed.
-Google operates through 3rd-party outsourcers, and pretty much all non-essential communication goes through them and not Google directly; that's why the guy can't tell ya the exact number of his posse. The big numbers are probably very correct, but I'm not sure about now. There seemed to be a very big wave of cut-offs and discontinued access for raters about a year ago; a lot of people got the boot and I'm not sure why. My bet is just a sweep of the axe. Some were gone for a good reason, others very randomly.
-The raters have a few spaces and forums to discuss their work, open to the public and with minimal chance of an NDA break.
-The raters have mods, too, but I haven't seen activity on that front for a while.
-The specifics of most cases have led me to conclude that for each surveyed example there are at least 6 or 7 people working on it and giving opinions before a final decision is drawn, so there is your internal balance and weeding out of bad judgement. Lemme say it again: you cannot single-handedly change Google's opinion about a particular site and particular search term.
-About natural language processing: this is the scary part. You cannot imagine how good these guys are, especially their algorithms. From time to time they let us sneak a peek at it, and let me say we had a look at some betas (or alphas) of correct grammar processing and translation MONTHS ahead of their official announcement to the world. You could tell it was machine-made translation, but it was good, scary good. And I'm NOT talking English only, no, no.
-The pay: it gets delayed about 6 weeks after month's end but is regular and usually not enough for a living, mainly due to the lack of work. The first year it was good, very good, but in 2008 it started getting less and less, which is a shame, since it is a nice way to browse the net and really get paid to do it!
Just imagine ... you change a sentence on your company's website and get interviewed about why you did it over and over again, and people write pages about it.
Seems like Google changed something for the worse in the last 6-12 months or so. My searches now seem to produce an increasing number of results that don't actually include the terms I specified. Presumably it's to drive a BS metric that shows Google yields more hits for a given search than their competitors. It's extremely frustrating--This second-guessing of the user's query was one of the biggest reasons I stopped using AltaVista, Yahoo, or whatever the hell other engines used to be out there before Google came to dominate. Anybody else seeing this?
Slightly off-topic: Am I the only one who finds Google web search less and less useful? There's no way to really force literal search anymore. Everything I enter gets auto-"corrected". Plus signs, quotation marks or that misleading field "this exact wording or phrase" in Advanced Search used to help, but that stopped working a while ago. Everything is fuzzified now. Is there an alternative or some trick I haven't heard of?
PageRank is People!!!
Bruno Costa
Google knows the mathematical truth which is easily overcome with brute force.
And that is that search terms have very small variability compared with the results they will produce.
It's pretty easy to hire a thousand monkeys to monitor this and make sure that the 100K most popular search terms are working.
Sounds like QA to me. No big deal.
I'm waiting patiently for Google to become self-aware so I can teach it Asimov's laws, and then how they can go terribly wrong. For instance, Disney might make a movie starring Robin Williams... and the sequel could be a vehicle for Whoopi Goldberg. Anyhow, thanks.
The upshot of this is that it helps weed out those websites that are cheating the system and trying to get their website as the #1 Google hit so they can show you ads. So a large part of what they are doing is tracking spam websites, not real ones.
Not quite.
There is no "Aha, Site X is a spam site, it must go!" followed by actual removal or deranking going on here.
Instead, it's more like "try searching for a bunch of stuff using Algorithm A, then try the same ones with Algorithm B, then Algorithm C." Google then goes with the algorithm that gets high ratings.
They're comparing algorithms, not websites.
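To make that concrete, here is a rough sketch in Python of picking between algorithms by rater scores. Everything in it is hypothetical: the 1-5 scale, the sample numbers, and the function names are illustrative assumptions, not Google's actual process.

```python
from statistics import mean

# Hypothetical data: algorithm -> query -> rater scores on an assumed 1-5 scale.
ratings = {
    "A": {"olympics": [3, 4, 3], "jaguar": [2, 3, 3]},
    "B": {"olympics": [5, 4, 5], "jaguar": [4, 4, 3]},
    "C": {"olympics": [4, 4, 4], "jaguar": [3, 2, 3]},
}


def algorithm_score(per_query):
    """Average the per-query mean rating so every query counts equally."""
    return mean(mean(scores) for scores in per_query.values())


def best_algorithm(all_ratings):
    """Return the name of the algorithm with the highest average rating."""
    return max(all_ratings, key=lambda name: algorithm_score(all_ratings[name]))


print(best_algorithm(ratings))
```

Note that no individual website is ever promoted or demoted here: the raters' scores only decide which algorithm wins.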