Google Letting Users Rank Search Results
Myriad writes "C|Net News is running an article about Google testing out a new system which would let users rank pages. From the article, 'Two weeks ago, Google began quietly testing a Web page voting system that, for the first time on a large scale, could eventually let Web surfers help determine the popularity of sites ranked by the company's search engine.'" As someone who has a lot of experience with systems where users self rate content, let me just wish Google the best of luck. Especially since for many unscrupulous businesses, ratings in search engines directly translate to dollars.
Perhaps they could disqualify corporate business websites from being ranked.
Thanks,
Travis
forkspoon@hotmail.com
Well, as the marketing director for an unscrupulous business, let me be the first to say how much I am looking forward to being able to rate my competitors' websites on one of the most popular search engines.
this is certainly a valid problem to try and solve. for example, i just searched for "clueless phony" and jon katz's name was nowhere to be found.
An IP address doesn't necessarily equate to a person. Companies can have thousands of IPs, and Google can't tell if it's just one entity or 3000. I would predict that if this goes into effect, the Gator advertising thing that's bundled with just about any free download these days will be modified to rank up pages of those who pay them the most.
--
WHO ATE MY BREAKFAST PANTS?
No, that's the point. The new system encourages it. The previous metric used by Google was weighted links-to to determine the value of the look-up. I think the rating system is far more vulnerable to abuse.
That is all that will happen. How are they going to stop multiple "votes"? By a cookie (that the voter can erase)? By tracking IPs (they won't put the resources into that large and complex a system)?
I wish it would work, but it will be an abysmal failure... in fact it wouldn't surprise me if some corporations hire people just to "vote" for their sites...
Just look at ANY top 50/100 voting site and you'll know what I am talking about.
Thanks to file sharing, I purchase more CDs
Thanks to the RIAA, I buy them used...
People who specialize in pushing sites into the top rankings--a technique known as search engine optimization--say the company's success has made Google a new frontier to conquer. And they assert that its system, like any other, can be outsmarted.
:)
This is particularly repugnant, especially given the goals set in the article (Google wants to make the search engine process more of a democracy, etc.) Is anybody else tired of soulless marketdroids essentially destroying all the good things that are the Net(C)(TM)(R)?
On the bright side, maybe there's room to add Slashdot-styled moderation and meta-moderation to Google rankings - imagine a "+1 Funny" rank for the Onion or a "-1, Offtopic" page rank for every time you go surfing for something honest and end up at Yet Another pr0n Site.
But what does my opinion matter, I just vote here. It's not like I have any money or anything.
While on the subject of Google, there is an interesting article at The Register detailing how search terms are used to exploit servers, switches, routers, etc.
http://www.theregister.co.uk/content/6/23069.html
Though Google claims the voting system won't directly, and more importantly, immediately, have any effect on results of a search, I think they're going to have to spend a lot of money on abuse detection.
Even if the system works fine (i.e., without abuse), it would be nice if the user still had the option to use it or not (as the current system works very well).
:)
Better yet, they could have a slashdot-like user customization mechanism (i.e., where the user can set the threshold and moderate/vote a search result in many ways).
Anyway, I wish them luck too (Google rules
How you do it: After putting the page up, write a tool to hit google's voting engine over and over and over... giving yourself good ratings.
Question: How would the system prevent this type of abuse from happening - especially the opposite approach - rating competitors' sites poorly to drop them in the list?
Devil's Advocate Question: If you don't allow this abuse to occur, doesn't that then unfairly give extra ranking to sites based on age? A new site won't have accumulated as many votes as an old one yet, and so the ranking would always favor old (and likely to be out of date) sites over new ones.
Don't label something "offtopic" unless you know the topic well enough to tell what's on topic.
What I wanted then was a "moderate" button I could click beside the link to indicate that it was spam. With a voting system like this, Google could locate and remove spam a lot quicker. Maybe that's what this is all about.
Doug Moen.
I have written a truly remarkable program which this sig is too small to contain.
To establish such a system, Google needs to get users to create accounts. A more feasible solution may be cooperation with instant messaging providers, using their identity pool and friends lists as filter criteria. But if they want people to create accounts, they need to turn Google into a community. The first step toward this would be to have an automatic discussion forum for every major website.
That, again, would create a lot of traffic, so they might be better off using a peer-to-peer app residing on the users' systems instead, which would also allow you to add website-specific real time chat, file sharing, micropayments and other nifty things. It would also make it easier to create responsive user interfaces, which is always a problem with web UIs.
- Don't put "rate this site" next to every hit. Instead, use a system of random assignment. Every x (where x is a random number) hits, give the user a "rate this site" dialogue. This cuts down on the potential for direct abuse.
- Add an option to sort by user rating, or sort by the current standard. This way, if people don't want to see user rated results, they don't have to.
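The "every x hits" idea above could be sketched roughly like this. It is only an illustration: the function name `make_rating_prompter` and the gap bounds of 5 and 20 hits are made up for the example, and nothing here reflects how Google actually decides when to ask for a rating.

```python
import random

def make_rating_prompter(min_gap=5, max_gap=20, rng=None):
    """Return a function that reports whether the current hit should show
    a "rate this site" dialogue. The gap between dialogues is redrawn at
    random each time, so a user cannot predict or force a prompt.
    (min_gap and max_gap are hypothetical tuning values.)"""
    rng = rng or random.Random()
    remaining = rng.randint(min_gap, max_gap)

    def should_prompt():
        nonlocal remaining
        remaining -= 1
        if remaining <= 0:
            # Draw a fresh random x for the next dialogue.
            remaining = rng.randint(min_gap, max_gap)
            return True
        return False

    return should_prompt
```

Call the returned function once per search hit shown to a user; most calls return False, so a script hammering the search page gets far fewer chances to vote than it makes requests.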
I love google and all, but some of the things that make it to the top of the list from time to time are as useful to me as a 16 bit dos driver (for my RS/6000). It'd be good to see something resembling peer review on the web after all. Who knows, even if it fails, it might spark something that works! Best of luck google!
If I can't see it in Lynx I'm not interested.
Why not just monitor which links searchers choose?
Amazing magic tricks
It's gonna keep happening... A "new" search engine comes out with little bias... First it was Altavista, then Google. But after spending millions on hardware, software, and personnel, these companies realize "hey, this is cool, but I think our owners want us to make some money." There'll be a new bias-less marketing-free SE after Google, and after a while, their owners will ask them for some profit. It'll keep happening for as long as I can tell. But, all that said, I'm happy with Google right now ;)
i trust google, they say something i believe it. i have no reason not to. they have always had a consistent history of being honest. politicians though, probably not the most honest individuals out there.
Currently, Google's proprietary system ranks sites primarily by words listed on the page, terms used in a page's title or similar factors. It also ranks a page's popularity, determined by the number--and importance--of sites linking to a page. For example, a page that is linked to 100 times from a reputable newspaper's Web site would rank higher than a page linked to 500 times from a porn site.
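The newspaper-versus-porn-site example in that quote can be made concrete with a toy calculation. This is not Google's actual formula, which is proprietary; the `importance` weights (0.9 and 0.1) and the site names are invented purely to show how a weighted link count can beat a raw one.

```python
def popularity(inbound_links, importance):
    """Toy weighted link count: each inbound link contributes the
    importance of the site it comes from. Unknown sites count as 0.
    (The importance values are assumptions, not real Google data.)"""
    return sum(count * importance.get(site, 0.0)
               for site, count in inbound_links.items())

# Hypothetical importance weights for two linking sites.
importance = {"newspaper.example": 0.9, "porn.example": 0.1}

page_a = {"newspaper.example": 100}  # 100 links from a reputable paper
page_b = {"porn.example": 500}       # 500 links from a low-importance site

print(popularity(page_a, importance) > popularity(page_b, importance))
```

With these made-up weights, the page with a fifth as many links still comes out ahead, which is the behaviour the article describes.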
I do like this feature, it truly shows the worth of a web page in other peoples eyes, not just the eyes of the webmaster who created the thing...
Thanks to file sharing, I purchase more CDs
Thanks to the RIAA, I buy them used...
I don't know how many times I have searched for something and gotten a perfect-looking search result only to find out it is a broken link. I have not used all of the search engines out there, but I don't remember any of the ones that I have used having an obvious method to flag a link as broken.
I know that their spiders go through the database and verify links, but I'd be willing to bet that it takes months to go over it once. Why not flag links as broken and have the spider verify/remove those first?
Just cleaning up the broken links could improve the search results.
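One way to picture "have the spider verify those first" is a priority queue where user-flagged links jump ahead of the routine recrawl. This is a sketch under assumptions: the class name `RecrawlQueue` and the two-level priority are hypothetical, and a real spider would obviously have far more going on.

```python
import heapq

class RecrawlQueue:
    """Queue of URLs waiting to be re-verified by a spider.
    Links flagged as broken by users are served before everything else;
    within the same priority, URLs come out in the order they went in."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker that preserves insertion order

    def add(self, url, flagged_broken=False):
        priority = 0 if flagged_broken else 1  # lower number = checked sooner
        heapq.heappush(self._heap, (priority, self._counter, url))
        self._counter += 1

    def next_url(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

The spider would still have to confirm the link is dead before removing it, since a flag by itself is just another vote that can be abused.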
Help out Project Gutenberg!
Distributed Proofreaders http://charlz.dns2go.com/gutenberg
This feature is only available from the 'Googlebar'.
The problem is that this GoogleBar only plugs into Internet Explorer, so *nix geeks won't be able to rate sites..
It consists of small faces that you click (happy or unhappy).
-J
Alexis 'jeriqo' BRET
This is truly idiotic, since robots.txt has never been a default part of any web server installation I've ever done, so it's completely a voluntary thing to create the file, and every webmaster should be WELL AWARE what this file does (by virtue of the fact that they had to create it). I mean, duh guys.
Yeah, so I'm off topic. But I just got the spam this morning, and I used to respect Google quite a bit, and witnessing them resorting to spam emails, begging us to let them spider our sites really tarnished their image, so let me rant a little. :p
Oh, and let's not forget about google suggesting robots.txt as a method to protect sensitive data recently. Be nice if they could decide if they wanted us to create robots.txt, or not..
I have seen all kinds of warez sites that force you to vote in order to get to parts of the site. Others could have frames that forge a vote each time a visitor comes to their site. While this is an intriguing idea, I don't see how it could work.
The whole idea of Google's PageRank was to count each link from another indexed site as a vote. What was wrong with that scheme? Doesn't everyone currently think Google is the best engine out there? If so why "fix" it?
I like the suggestion someone else made about showing the vote results but not having them actually affect the search results.
Why aren't we told when editors moderate our posts?
Especially since for many unscrupulous businesses, ratings in search engines directly translate to dollars. Taco you moron have a little faith. This is google we're talking about. Name one feature they've screwed up that badly. If it can't be done so that companies can't take advantage of it realise that it won't be done at all. And taco, you ignorant bastard, you'd think that after you created a user-ranked web site companies can't take advantage of you would realise that anyone can do it.
I can't spell or type, but that doesn't mean I'm unusually stupid.
The existing Google ranking system is already exploited by users who set up hundreds of dummy sites that all link to a certain site using a variety of keywords, thus feeding the G! machine bogus "popularity" information.
A ranking system will just make this easier to do. Your average skript kiddie could easily bombard Google with a heap of "Yeah this is great!" ratings for his site, thus bumping it up many notches.
User-ranking systems work as long as there's no huge incentive to game them. Slashdot doesn't have *too* many problems because nobody really cares that much if they get rated "+6 - Rad!"...however, there's a much greater motivation to have one's website come up tops in one of the most popular search engines....
Got Rhinos?
Actually they do. The sponsored links show up first and are clearly indicated to be paid for.
Personally I think their system ain't broke, though, so why fix it?
I do not have a signature
Think about it. According to the article, the system is currently just collecting information, it isn't affecting rankings -- yet. So in a couple of weeks Google will look at this new data, look at the corresponding pages, then figure out what should be done. Why are we assuming that they will just do a linear mapping between the number of happy faces and relevance?
I wouldn't put it past them to dynamically map relevance with a far more complicated function. User rankings are another non-random data stream. All information (even negative information) is useful. Just as long as one strips it from its labels, and looks at it blindly. Can you say neural networks?
Slashdot monitor for your Mozilla sidebar or Active Desktop.
The idea of rating based on user "votes" is one I see bound to fail. Google would need an enormous trusted user base, and logins would be required ('cause I could spoof any IP out there for votes). Talk about unnecessary complexity for a search engine.
What is more interesting is what a few companies have been doing recently in the search engine world (there really still is business after the dot-com fallout, even if it isn't profitable). At my work, we recently looked into a product by a company called Recommind. Their search engine was able to find similar words in documents, and could give you related documents that didn't have key words. It could even distinguish between java (the coffee), java (the language), and Java (the island near Jakarta)! Pretty cool stuff. Combine that type of "concept matching" instead of "keyword matching" with Google's technology, and you've got the next generation search engine.
All very cool stuff. I hope they don't kill it.
I think the bias comes from the web site owners, not the search engine owner. After a "new" search engine becomes popular, all the spammers target it until they figure out how to manipulate that search engine's ranking. And then that opens up the field for a search engine using a completely different type of ranking. Parasite adaptation.
Provided that they can keep users from voting multiple times through ip tracking (can you imagine the size of the database for that), they will probably run into the same symptoms that mp3.com's top 40 boards had, there were usually the same group of artists or songs on that board because few people ever explored the rest of the mp3.com archives. But maybe since google isn't the place housing the content, it will be different.
Especially since for many unscrupulous businesses, ratings in search engines directly translate to dollars.
But we've all seen first hand how easy it is to stop unscrupulousness through meta-moderation!
ok then your [sic] infringing on my copyright! Could you as [sic] me next time before STEALING my comments for your own?
It's not that hard to make it really expensive to forge votes. For instance, check out the captcha project at CMU. (Basically, it generates images that are difficult for a computer to recognize, but easy for a human, and challenges the user to respond to them in some way to prove that they are human.) If they could find the right balance of convenience for humans and difficulty for perl scripts, I think they'd have a great thing going. I have always wanted this feature in a search engine
This is a great idea. Google already had an excellent idea in rating sites by counting the links to the site to defeat keyword clogging, and now they're adding one more layer to the quality control.
Hats off to Google, yet again. Keep up the good work, guys.
-jcr
The only title of honor that a tyrant can grant is "Enemy of the State."
Rather than using the votes to tinker with the specific rankings of particular pages or sites, he said, the feature would most likely be used to bolster the relevance of overall results.
"It will most likely have more of an aggregate impact," Krane said. "We have indexed more than 1.6 billion Web pages, so it is extremely inefficient to go after individual pages."
Also remember that this is only one of many of Google's tools to improve relevance. You can already do your part to stop spammers by reporting them to search-quality@google.com.
"Reality is just a convenient measure of complexity" -Alvy Ray Smith
is if i had a way of decreasing the ranking of my own site for particular search terms.
eg, my site used to be called '/dev/random' but i changed the name when i realized that it was in the search engines for that term and that most people who were searching for '/dev/random' probably weren't looking for my weblog. i'd love to have some kind of 'anti-keyword' meta tag that i could use to tell the googlebots that i'd rather not be associated with that search term anymore.
i know... somewhat off topic and boring... sue me.
Smokey the Bear says, "Strip mining prevents forest fires!"
As slashdot got Meta-Moderation, i think google should use Meta-Rating, so users could help detect spammers.
Oh, by the way, if you're already a Slashdot moderator and want to know if you can Meta-Moderate, just check /metamod.pl.
-J
Alexis 'jeriqo' BRET
Is there a provision for meta-modding at Google?
-- @rjamestaylor on Ello
I tend to trust Google too, but I think you under-rate the devious abilities of bored corporate minions to work around rating system restrictions that are supposed to keep them from artificially boosting their site's rating. I give it a week or two before someone comes up with a script that relatively quickly and easily will artificially elevate a rating. This sort of thing is too easy to abuse, especially for a necessarily open system like a public search engine.
No relation to Happy Monkey
There was an article on New Scientist about some technology similar to this. It would analyze what parts of a web page were hit the most, and bring those to the foreground (think bigger, bolder links), and shrink or kill off the unused links.
It's all part of the process of creating a more "intelligent" web.
Ok you make a page that has all sorts of info on a subject like 1966 Mustangs. The page has everything you ever wanted to know about the car. But you secretly put something in the page, like a server-side redirect, that takes the user and sends them to a page about Camaros. So you submit the page and the search engines give a high ranking because the page has a lot of good info on Mustangs. But people that go to the page get sent to a different page that talks about something else.
What's funny about this is that the search engines already know this. The Marketing Director at the company I work for told me that this hasn't worked in a couple of years. Some engines send a second agent out to see if the page at that link is the same as the one that got indexed. I think this is a case of whoever wrote the article not being up to date on search engine technology.
Hollow words will burn and hollow men will burn.
The article clearly states that Google will use the results to supplement, not replace, current methods. So, if someone wishes to manipulate the results, they will have to combine several forms of cheating to succeed.
The article also states that methods will be used to prevent this sort of abuse, though Google doesn't say (for obvious reasons -- why do the spammers' work for them?) what they are.
But there are obvious ways to defeat abuse. One way is to do IP matching, and cull results originating from a single domain. Another would be to use only a random representative sampling of votes, rather than every vote, in counting results. Another is simple human oversight (or good AI), looking for unusual ranking changes.
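The first two defences listed above (culling repeat votes from one source, then counting only a random sample) might look something like this. It is a rough sketch, not anything the article says Google does: `cull_and_sample`, the `(source, page, score)` tuple layout, and the choice to keep only the first vote per source are all assumptions.

```python
import random

def cull_and_sample(votes, sample_size, rng=None):
    """votes is a list of (source, page, score) tuples, where source is
    an IP address or domain. Keep at most one vote per (source, page),
    then return a random sample of what is left."""
    rng = rng or random.Random()
    unique = {}
    for source, page, score in votes:
        # Only the first vote from a given source for a page is kept.
        unique.setdefault((source, page), score)
    deduped = [(s, p, sc) for (s, p), sc in unique.items()]
    if len(deduped) <= sample_size:
        return deduped
    return rng.sample(deduped, sample_size)
```

So 3000 votes from one corporate domain collapse to a single vote before sampling. As other posters point out, this does nothing against a voter who controls thousands of distinct addresses.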
Google's been great so far in avoiding the crapfloods. I doubt if they'd cut their own throats. The fact that they are testing this technology rather than just rolling it out is a good sign. When's the last time you heard of a search engine testing before implementation?
Barely-relevant anecdote:
The year that Excite debuted, I found my own credit card number, expiration date and phone number in their database. By pattern matching I found the same for a couple of dozen other people who had all patronized the same online bookstore (idiots momentarily had their customer database on the webserving machine, excite's spider found it).
It took about a week to find someone at Visa who knew what the Internet was (a security VP). He informed me that Excite had been designed with no means to edit the database. I found that hard to believe -- still do -- but my personal info remained findable for several weeks thereafter.
I survived the Dick Cheney Presidency 7 to 9 AM 7-21-07
Maybe OpenDirectory could add a rate-an-editor feature for their users. If you wanna talk about abuse, look there, not to Google.
This is the kind of idea (put two known good ideas, mix in "internet", boom!) that seems like it would have a bogus patent on it.
As someone who used to use Epinions all the time (making over $1000 from them), I have to say that the epinions "Web of Trust" system seems to work rather well, at least on a small scale (100,000 users).
Basically, you can see what users rated the article as useful. If you think that certain people have similar tastes to you, you put them in your Web of Trust. You'll get articles posted in a different order depending on who you trusted.
It is actually more complicated than that, as there are epinions "Experts" who are judged by epinions to have good ratings. I think Amazon has a similar system (and has way more users, but the system still seems to work ok).
The big problem is that the internet at large has so many bloody users and so many bloody pages... I think introducing groups of users or groups of groups that you trust might be a better way for the Web of Trust idea to work with the internet at large.
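For anyone who hasn't used it, the core of the Web of Trust ordering described above can be approximated in a few lines. This is my own simplification, not Epinions' real algorithm (which, as noted, is more complicated); the function name `order_by_trust` and the data layout are hypothetical.

```python
def order_by_trust(ratings, trusted):
    """ratings maps an article to the set of users who rated it useful.
    trusted is the set of users in the reader's Web of Trust.
    Articles are sorted by how many trusted users liked them, with the
    total number of 'useful' ratings as a tie-breaker."""
    def key(article):
        raters = ratings[article]
        return (len(raters & trusted), len(raters))
    return sorted(ratings, key=key, reverse=True)

ratings = {
    "a": {"u1", "u2", "u3"},
    "b": {"u4"},
    "c": {"u1", "u4"},
}
print(order_by_trust(ratings, {"u4"}))  # a reader who trusts u4
print(order_by_trust(ratings, {"u1"}))  # a reader who trusts u1
```

The two readers get the same articles in different orders, which is the whole point: a vote only counts for people who chose to trust the voter.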
Hmm, well, before you posted with a description of "cloaking" I ran a Google search on - web page cloaking - and got this result as the first hit :
Website,web page cloaking and stealth technology
Which is some company trying to make money from doing this. See the next page in their pitch : http://webprominence.co.uk/promotion/costs.htm
Now, if cloaking is easily defeated by a second bot, and the Googlebot has this, shouldn't they at least put _their own_ link in top slot just saying, "by the way, this kind of stuff is dishonest, misleading and doesn't work" [and thus possibly a fraud?]
Yeah, sure, no-one's going to do this for the n categories across 1.6bln indexed pages for every possible scam, but I'd have thought this kind of thing would be obvious "customer protection / advice" on Google's part. I'd _want_ people to know my search engine couldn't be scammed (at the very least by techniques I could defeat and I was indexing for a possible cheater)
On second thoughts, I'd be just as happy to let the lame would-be tricksters / cloakers / whatever waste their money.
On the other hand, isn't it that kind of thought (as I just had) which has left the whole web / Internet up for grabs by the slickest over the dumbest and ultimately hurt the bright people who cared?
Moderation and meta-mod on the scale of the *web*???!!! Man, that sounds crazy to me. Slashdot scaled n-fold . . . Can't bear to think about it . . gotta go . . .
However, recently the ad which appears for marijuana changed to NewScientist.com, a science journal which has been publishing much more balanced and thorough information on weed, some of which advocates that weed is less dangerous than alcohol. Also the top result is NORML, a legalization-advocacy group. (This is probably not due to tampering w/ the search engine, but is interesting)
I believe that The Powers That Be within Google have taken the more moderate, academic drug stance, as opposed to gov't-sponsored propaganda. Google's pretty influential, Internet-culture-wise. Food for thought.
(Offtopic, sort of, i know, but I saw a Google story and had to run with it!)
--hongpong.com
The main problem with doing such things over a proxy is the lack of control you have over Google's database. While they can easily filter or sort their search results by any new criteria, you cannot. What you could do is have a central database of search phrases and related rated sites (which could have some semantic intelligence -- still I don't think you would get many hits on anything besides "sex" and "mp3").
This database could then be accessed by users using UI controls added to the search result pages of major search engines by the proxy, which could be remote or local. As you would do a search over one of the supported search engines, the proxy would also query the db and add the highly rated sites on top of the result list. Trust could be implemented in a similar fashion, but the more complex your system gets, the more awkward the proxy method becomes, especially when Google changes their output format.
My preferred method would be making the rating controls and display independent from any particular site, having a small system-tray application instead that allows you to rate the currently viewed site (also filing it in a certain category -- fast UI is essential here). Then you could "browse recent ratings" and add users who share your tastes to your trusted user list manually, or let the server tell you about users who have rated things similarly to you, which would allow you to "browse recent ratings by friends" or "browse recent ratings by friends in category X". This concept could, of course, be extended to other document types.
The problem is that this GoogleBar only plugs into Internet Explorer, so *nix geeks won't be able to rate sites..
Well, yes and no. There is currently a project on Mozdev that aims to duplicate some if not all of the functionality of the toolbar for Mozilla, and while the current version 0.4 is still somewhat lacking, a new version that duplicates the look as well as the major search functionality (though not pagerank etc) is on the way soon, apparently. However, since this is an independent project and not affiliated with Google, I'm not sure if it would be able to access the rating system. Still, Mozilla users DO have the toolbar, and, since mozilla is cross-platform...
1) Get anyone who wants to rate sites to make an account. Yes, it's a pain, but that way you can track people's rating activities, like on /.
2) Use the Yahoo! style system of having an image that you have to type the word in from to create an account. Keep changing the way the image is formed. This should *help* to prevent account creation spam.
3) Give people a certain number of points per day / week / month (ala /.)
4) Make it so that everyone has to balance out +ves and -ves - that is, somehow make sure that they can't just do one or the other.
5) Make it so that each account can only rate a particular site once. Now this requires quite a bit of storage, because you've got to store every rating ever individually instead of just a counter, but that way you can prevent multiple rating on some corporate site.
Note that this prevents the idea of rating a site based on how appropriate it is for a particular search, which is admittedly one of the really exciting parts of this (that is, if I search for Transistors and get www.electronics.com then I rate it 'Good'. If I search for Open Source and get www.electronics.com then I rate it as 'Bad'.)
With this system I instead just rate www.electronics.com according to how good the site is, not how relevant it is. Maybe that's what they're aiming for, maybe it's not.
I think that would help stop it but it all depends on the security of the account creation process - if it's easy to spam then the whole system becomes a waste of time.
It also doesn't prevent the problem of people being paid for ratings, which is possible, or for a company getting every single one of its employees to vote for the company. Thinking about that, one solution could be to just say that a company's rating can't go above a certain level and can only increase at a certain speed.
Or you could have metamoderation. This sounds more and more like Slash based code all the time!
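Point 5 and the "cap the level and the speed" idea from this post could be prototyped roughly as below. Treat it as a sketch only: the class name `RatingStore`, the +1/-1 vote values, and the default limits of 20 and 5 are placeholders picked for the example.

```python
class RatingStore:
    """Stores one rating per (account, site), as in point 5, and
    publishes a score that cannot exceed max_score or rise by more
    than max_step per update (both limits are hypothetical)."""

    def __init__(self, max_score=20, max_step=5):
        self.max_score = max_score
        self.max_step = max_step
        self._ratings = {}  # (account, site) -> +1 or -1

    def rate(self, account, site, value):
        """Record a vote; return False if this account already rated the site."""
        if value not in (1, -1):
            raise ValueError("value must be +1 or -1")
        key = (account, site)
        if key in self._ratings:
            return False
        self._ratings[key] = value
        return True

    def raw_score(self, site):
        return sum(v for (_, s), v in self._ratings.items() if s == site)

    def published_score(self, site, previous):
        """Score shown to the world, given the previously published one.
        Increases are throttled and capped; decreases are not limited."""
        return min(self.raw_score(site), self.max_score,
                   previous + self.max_step)
```

The storage cost is exactly what the post warns about: one entry per account per site rather than a single counter. The cap at least limits the damage when a company gets every employee to vote.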
' Ore stabit fortis a fine placet ore stat '
- found on a park bench
No Slashdot poster has been able to reliably get their posts modded to +5 yet.
Well, their system doesn't have binary failure. It's really good right now, but it could definitely be better. There are (admittedly rare) times when I feel like my searches aren't handled as well as they could be. This is just one possible way they could improve their relevance.
There are no trails. There are no trees out here.
If Google (or another search engine) set up all links to visit an internal google page that quickly redirected the user to the target site, it could rate on how many people visited the site, instead of a potentially biased rating of users.
Of course, shady websites could still influence it, either by hitting the pages themselves, or by crafting their page so that the google-selected text is tempting to search engine users, but the system still has the advantage of not requiring active participation of users.
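The redirect-and-count scheme in this post is simple enough to sketch. The `/url` path, the `q` and `to` parameter names, and the in-memory `clicks` counter are all invented for illustration; I have no idea how a real search engine would wire this up.

```python
from collections import Counter
from urllib.parse import urlencode, urlparse, parse_qs

# (search query, target URL) -> number of click-throughs
clicks = Counter()

def tracking_link(target, query):
    """Build the internal link placed on the results page in place of
    the direct link. The /url path and parameter names are hypothetical."""
    return "/url?" + urlencode({"q": query, "to": target})

def handle_redirect(path):
    """Handle a hit on the internal page: count the click, then return
    the URL the user should be redirected to."""
    params = parse_qs(urlparse(path).query)
    target = params["to"][0]
    query = params["q"][0]
    clicks[(query, target)] += 1
    return target
```

Counting per (query, target) pair rather than per site keeps the signal tied to relevance for a particular search, and users never have to click a happy face.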
Just my $.02
For instance, check out the captcha project [captcha.net] at CMU.
I looked at captcha and found that it may generate problems with disability legislation in some jurisdictions. For instance:
The only accessible test (fbw) doesn't always work, and the other three are not accessible to those with disabilities. Watch somebody get sued under the ADA.
Will I retire or break 10K?