Google Juice
mpawlo writes: "I guess it is time to start using them bookmarks again, since favourite search engine Google seems to be on the verge of Altavista doom and search engine chaos. BBC News reports of Google bombing (often referred to as 'Google juice' by the infamous Crackmonkey subscribers). 'The users have found a way to "bomb" Google to improve the rankings of particular webpages, and ensure a site is near the top of the results for particular search phrases.'
There is also the sport of Google Whacking affecting your search results."
Step 2: "autistic paraplegic donkey porn"
Step 3: I'm feeling lucky
Step 4: Google Whack
Of course, as I'm all of the top three Stephen Turners already, I don't need to do this. :-)
11.0010010000111111011010101000100010000101101000
The users have found a way to "bomb" Google to improve the rankings of particular webpages, and ensure a site is near the top of the results for particular search phrases.
Well, yes, but it's not easy. The article describes several dozen to several hundred bloggers working together to drive a certain word or phrase toward a certain URL. In other words, it takes a large, concerted effort to deceive Google's engine, and this fact alone provides reassurance that Google is working according to plan.
Elsewhere on this site, Scientology has been accused of using their large network of sites and members to do the same thing, driving searches for "Scientology" and related words to their own sites rather than those of debunkers. Again, this takes a large and concerted effort, which is a virtue of Google rather than a vice.
Is Google on the verge of breaking because such a thing is possible? Of course not. But there are people powering the search engine on the back end, making improvements constantly in response to issues like this. And their cross-linking approach to ranking pages, while not perfect, remains the most reliable way yet found to judge a match's relevance.
If it works correctly 99% of the time, and Google is constantly working on the last 1%, that still makes it better than anything else out there.
before I start using bookmarks as religiously as I had done before... Besides, the Google team seems to respond to new ideas (good or bad) like white blood cells responding to an infection... Companies have been attempting to boost their rankings on Google for years... yet, for the most part, they have been unsuccessful. I doubt seriously that this is by chance...
You can't simply go to www.google.com/bomb and drag a slider to move a URL up the listings. You have to actually have a concentrated effort. They talk about getting a webpage such as Geocities and getting your friends to do the same. It seems to me mass posting to bulletin boards would do the trick, unfortunately. There is even marketing software out there which posts your 'press releases' to hundreds of bulletin boards automatically.
If this really does start to get out of control, Google will adjust their techniques to work around the problem. I hope.
RFC1925
Google has always seemed to be driven by a happy medium of civic duty and profit. Take their text ads - I love them - unobtrusive, get the point across, and NOT in the main search results - they are clearly marked. So I expect that the geniuses @ Google will attack this problem and come up with a solution. So yelling about Google's demise seems VERY premature.
Top Most Bizarre/Disturbing Error Messages
As you can see, it's not that hard to spam the web with links to your site. And that's not even counting automated newsgroup posting, which all gets indexed because of Google Groups.
That is to say, give users a free user account which could be used to give input on what's crap and what's not.
For the sake of the discussion, let us call the users who are giving input "moderators."
As another poster mentioned, this system opens up a NEW can of worms, as spammers, idiots, and conservatives will use the system to call certain sites "crap", not because they are not relevant, but because they want the sites' listing to go down.
So then people would demand that the "moderators" were overseen, perhaps by a system of "meta-moderators", and you see where I am going with this.
God is real unless declared integer
In addition to other spam prevention methods, Google uses complex matrix/vector filtering to ignore link circles. Basically, if (say) the same 100 different sites link to the same set of 20 other sites, and no one else links to them, Google will map them out and realize that they are all working in a concerted effort. That way, if a spammer sets up 100 ostensibly independent sites and then links them all to his e-commerce sites, Google will realize what he is doing and penalize his rankings for it. The only way that a spammer can 'bomb' Google is if he gets a large array of other sites (for instance weblogs) that have significant traffic and link to other, different sites, as well as the ones that the spammer is trying to promote. The long and short of it is that a group of bloggers could bomb Google with a large effort, but the average spammer would have to set up an incredibly complex web of interwoven pages that garner significant traffic to fool Google. Even if large groups of spammers formed a cabal to promote their varied interests, it would likely be discovered by humans working at Google. So, I'd put away that violin.
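To make the "link circle" idea above concrete, here is a minimal sketch in Python. It is not Google's actual filter (the matrix/vector method isn't described here); it only restates the heuristic in this comment as plain set logic. The function name, the min_sources threshold, and the example hostnames are all made up for illustration.

```python
from collections import defaultdict

def find_link_circles(links, min_sources=3):
    """Flag groups of sites that all link to the same set of targets,
    where nobody outside the group links to those targets.

    links: iterable of (source_site, target_site) pairs (hypothetical data).
    min_sources: assumed cutoff for how many colluding sites count as a circle.
    """
    outbound = defaultdict(set)  # site -> sites it links to
    inbound = defaultdict(set)   # site -> sites linking to it
    for source, target in links:
        outbound[source].add(target)
        inbound[target].add(source)

    # Group sources that share exactly the same set of outbound targets.
    groups = defaultdict(set)
    for source, targets in outbound.items():
        groups[frozenset(targets)].add(source)

    circles = []
    for targets, sources in groups.items():
        if len(sources) < min_sources:
            continue
        # Suspicious only if no one outside the group links to the targets.
        if all(inbound[t] <= sources for t in targets):
            circles.append((sorted(sources), sorted(targets)))
    return circles

# Three "independent" farm sites promoting the same two shops...
farm = [(f"farm{i}.example", shop)
        for i in range(3)
        for shop in ("shop-a.example", "shop-b.example")]
# ...next to ordinary blogs that link to different things.
organic = [("blog1.example", "news.example"),
           ("blog2.example", "news.example"),
           ("blog2.example", "wiki.example")]

print(find_link_circles(farm + organic))
```

Note how the bloggers escape the check: they link to other, different sites too, so they never form a closed group. That is exactly why a blog-driven bomb is harder to filter than a spammer's farm.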
Here is the Corante article.
Imagine you're the patriarch of a clan, and everyone in your clan has a homepage. All of your descendants' home pages have links to your home page, since you're the head dude. Your home page only has one word on it - say it's 'thrombosis'. Since Google bases the relevance of its search results on how many links there are to any page, any search for 'thrombosis' will likely show your home page as the number one search result, because you've got the word on your web page and dozens of links to your home page on other sites.
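The clan example can be sketched in a few lines of Python. This models only the simplified rule stated above (pages containing the word, ordered by how many other sites link to them); Google's real ranking also weighs how well-linked the linking pages are, so treat this as a toy. All names and hostnames below are invented for the example.

```python
from collections import Counter

def rank_by_inbound_links(pages, links, term):
    """Return URLs whose text contains `term`, most-linked first.

    pages: dict of url -> page text (hypothetical data)
    links: list of (source_url, target_url) pairs
    """
    # Count each linking page once per target and ignore self-links.
    inbound = Counter(target for source, target in set(links)
                      if source != target)
    wanted = term.lower()
    matches = [url for url, text in pages.items()
               if wanted in text.lower().split()]
    # Sort by descending link count, then by URL to break ties.
    return sorted(matches, key=lambda url: (-inbound[url], url))

pages = {
    "patriarch.example": "thrombosis",
    "clinic.example": "deep vein thrombosis symptoms and treatment",
}
# A dozen descendants link to the head dude; three sites link to the clinic.
links = [(f"descendant{i}.example", "patriarch.example") for i in range(12)]
links += [(f"medsite{i}.example", "clinic.example") for i in range(3)]

print(rank_by_inbound_links(pages, links, "thrombosis"))
```

With a dozen family links against three, the one-word home page comes out ahead of the page that actually discusses the subject.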
Once you think about how Google's rankings work, you can easily figure out how to game the system. That's why Dave Winer (token head of all webloggers) is usually the first result of a search on 'Dave'.
As far as googlewhacking is concerned, it's not as easy as it looks. Try 'parrhesia verboten'. I stopped once I found that one, proving to myself that it can be done. :)
This phenomenon is known as a "heisenwhack", after famed theorist Werner Heisenberg. A heisenwhack compensator has been developed, however. Adding the term "-googlewhack" to your search will fairly reliably eliminate these kinds of hits.
Mmmm...Google my precious...mustn't let the nasty bloggers get it, no, not my Google precious, no...
-- Two men say they're Jesus. One of them must be wrong. - Dire Straits
This does _nothing_ to undermine the relevance of Google's rankings. When you perform a search on Google and the first "hit" is one that has been juiced in this way, you are getting a hit that a large number of individual sites, all of which are respected by other sites, agree is important to the subject. That is the beauty of Google.
Yes, this effect can be choreographed, but the result is the same. All of the sites choreographed to achieve this result are voting that site A is relevant to subject B. If the sites involved consistently show bad judgment, their rankings in Google are likely to decline, and therefore their contribution to the Google ranking for subject B will lessen.
The fact that a large number of highly ranked blogs can drive a URL up the Google pop-chart is evidence of both the respect blogs are given and the power of Google's algorithms to find such non-corporate backed content.
The value of a search engine lies in its ability to return usable results when you are actually looking for something. Most of the "exploits" people are discussing don't affect Google's usefulness as a search engine. (When is the last time you searched for "talentless hack" or, for that matter, "david gallagher"? Only someone already participating in the prank, or curious about it, would even know it existed.) And "Googlewhacking" is the most harmless of all - the only search results it can "affect" are its own, as listed winning word pairs lose their uniqueness at the next crawl. So what?
Google folks are not stupid. If the integrity of searches that people really make is affected, they will change the code.
In the meantime, is it really necessary to squelch every last bit of fun on the Net?
If you type "Free Porn", then you can whack your google all you want!
A warning to those considering using Google's page ranking service (which tracks your surfing habits, which isn't a problem since it is very upfront about it). Overall, it works pretty well and it has found several pages of genuine interest to me that I would not have found otherwise. Also, I have no reason to think that they're doing anything sinister with the information (and I don't care).
However, since I like slashdot so much (I assume that is why) it's been serving up advertisements for other projects that link to SourceForge whenever I run google searches; for example, the white supremacist publication the Free Occident, which is powered by SourceForge.
Now, I'm not one of those people who thinks Google should try and filter hate speech from search results. Likewise, I don't think that the Free Occident should somehow be prevented from using SourceForge's software - open source means open, Voltaire was right, etc. However, I think google should draw the line at serving advertisements for articles about how "If you hear about a 100-million-dollar swindle, then you know that it has to be a Jew."
I've dumped a copy of the HTML for the search result in my journal - paste the Extrans into an HTML file to see it in close-to-original format. It appears from the first version in my journal that the ad appears ABOVE the search results - this is not the case.
Free Occident is a web log, but I find it far more worrisome that they've purchased an ad on google than if they were trying to blog some search term, like "White Power," or even "Occident."
Yes, I'm Jewish.
The good and new comes from no quarter where it is looked for, and is always something different from what is expected.
www.office-supplies-st0res.com/ (66.33.85.157)
office-storage.1nf0-office-equip.
pens-pencil.search-office-supplies
buy-furniture.furniture-sh0p-searc
printer-toner.supplies-1nfo-office
office-product.office-supplies-sh0
office-computers.supplies-1nfo-of
calculators.supplies-1nfo-office.
discount-office.supplies-1nfo-off
If you look at the HTML source code (after clicking on one of these results from google.com), you can see that the page is obviously built to track its referring URL and search keyword, and to log the results to bizrate.com. Stuff like this makes me furious, especially if you take into account the potential long-term costs. Google's spider has to waste traffic by going through these sites, and searchers like me have to skip through a bunch of garbage results, resulting in more traffic. Sure, maybe it's only a few kilobytes of data, but IMO it contributes to the expenditures of search engines, eventually resulting in more ads, etc... Maybe I'm exaggerating a tad, but it's wasteful to say the least.
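For anyone wondering how a page can know what you searched for: the search phrase travels in the referring URL. Here is a rough Python sketch of the general technique, not the actual code on those pages (which isn't reproduced here). The function name and the sample URLs are made up; the only real detail assumed is that Google result URLs carry the query in the "q" parameter.

```python
from urllib.parse import urlparse, parse_qs

def search_terms_from_referrer(referrer):
    """Return the search phrase from a Google referrer URL, else None.

    Hypothetical helper for illustration only.
    """
    parsed = urlparse(referrer)
    # Crude host check; a real tracker would be more careful.
    if "google." not in parsed.netloc:
        return None
    values = parse_qs(parsed.query).get("q")
    return values[0] if values else None

print(search_terms_from_referrer(
    "http://www.google.com/search?q=office+supplies&hl=en"))
print(search_terms_from_referrer("http://example.com/some/page"))
```

Once a page has that phrase, logging it to a third party takes nothing more than one extra request with the phrase attached.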