Google Juice
mpawlo writes: "I guess it is time to start using them bookmarks again, since favourite search engine Google seems to be on the verge of Altavista doom and search engine chaos. BBC News reports of Google bombing (often referred to as 'Google juice' by the infamous Crackmonkey subscribers). 'The users have found a way to "bomb" Google to improve the rankings of particular webpages, and ensure a site is near the top of the results for particular search phrases.'
There is also the sport of Google Whacking affecting your search results."
Step 2: "autistic paraplegic donkey porn"
Step 3: I'm feeling lucky
Step 4: Google Whack
I sent Google a link about sex and Javascript. I was searching for Javascript debuggers and got something ELSE. Here is a link to the old picture. http://www.devspace.com/Articles/Article_2002_01_21.html
However I think they are starting to do something since doing this search again yields proper results.
"You can't make a race horse of a pig"
"No," said Samuel, "but you can make a very fast pig"
Of course, as I'm all of the top three Stephen Turners already, I don't need to do this. :-)
11.0010010000111111011010101000100010000101101000
The users have found a way to "bomb" Google to improve the rankings of particular webpages, and ensure a site is near the top of the results for particular search phrases.
Well, yes, but it's not easy. The article describes several dozen to several hundred bloggers working together to drive a certain word or phrase toward a certain URL. In other words, it takes a large, concerted effort to deceive Google's engine, and this fact alone provides reassurance that Google is working according to plan.
Somewhere else, on this site, Scientology has been accused of using their large network of sites and members to do the same thing, driving searches for "Scientology" and related words to their own sites rather than those of debunkers. Again, this takes a large and concerted effort, which is a virtue of Google rather than a vice.
Is Google on the verge of breaking because such a thing is possible? Of course not. But there are people powering the search engine on the back end, making improvements constantly in response to issues like this. And their cross-linking approach to ranking pages, while not perfect, remains the most reliable way yet found to judge a match's relevance.
If it works correctly 99% of the time, and Google is constantly working on the last 1%, that still makes it better than anything else out there.
What they are reporting as a problem may not be. Google is raising sites in the rankings if large numbers of bloggers link to them--but they only do that if they like the link for some reason. What we have are lots of individuals (who many people respect at least enough to read occasionally) all saying, in effect, I find this interesting, and you might too.
We don't have some advertising hack sitting behind a desk on Madison Ave. saying "Make it so" and pushing a site to the top of Google. The only ways X-10 or mulesex.com or whatever could benefit from this are 1) as a joke, or 2) because they posted something that a wide variety of people liked.
This is how Google is supposed to work. So, where's the problem?
-- MarkusQ
before I start using bookmarks as religiously as I had done before... Besides, the Google team seems to respond to new ideas (good or bad) like white blood cells responding to an infection... Companies have been attempting to boost their rankings on Google for years... yet, for the most part, they have been unsuccessful. I doubt seriously that this is by chance...
You can't simply go to www.google.com/bomb and drag a slider to move a URL up the listings. You have to actually have a concentrated effort. They talk about getting a webpage such as Geocities and getting your friends to do the same. It seems to me mass posting to bulletin boards would do the trick, unfortunately. There is even marketing software out there which posts your 'press releases' to hundreds of bulletin boards automatically.
If this really does start to get out of control, Google will adjust their techniques to work around the problem. I hope.
If "That" ever does become a sport, I'll be like a superstar and shit.
It hurts when I pee.
Perhaps the best solution, if things get too far out of hand, is to use the input of people who would be pissed off about crappy listings. That is to say, give users a free user account which could be used to give input on what's crap and what's not; then the Google admins could simply remove all the crap that rose to the top once enough users clicked a link that said, "This is crap!" Using this in conjunction with Google's already strong engine would probably solve any problems, imho.
RFC1925
Google has always seemed to be driven by a happy medium of civic duty and profit. Take their text ads - I love them - unobtrusive, get the point across, and NOT in the main search results - they are clearly marked. So I expect that the geniuses @ Google will attack this problem and come up with a solution. So yelling about Google's demise seems VERY premature.
Top Most Bizarre/Disturbing Error Messages
As you can see, it's not that hard to spam the web with links to your site. And that's not even counting automated newsgroup posting, which all gets indexed because of Google Groups.
In addition to other spam prevention methods, google uses complex matrix/vector filtering to ignore link circles. Basically, if (say) the same 100 different sites link to the same set of 20 other sites, and no one else links to them, Google will map them out and realize that they are all working in a concerted effort. That way if a spammer sets up 100 ostensibly independent sites and then links them all to his e-commerce sites, google will realize what he is doing and penalize his rankings for it. The only way that a spammer can 'bomb' google is if he gets a large array of other sites (for instance weblogs) that have significant traffic and link to other, different sites, as well as the ones that the spammer is trying to promote. The long-and-short of it is that a group of bloggers could bomb google with a large effort, but the average spammer would have to set up an incredibly complex web of interwoven pages that garner significant traffic to fool google. Even if large groups of spammers formed a cabal to promote their varied interests, it would likely be discovered by humans working at google. So, I'd put away that violin.
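The "same 100 sites all link to the same 20 sites, and no one else does" pattern described above can be illustrated with a toy check. This is only a sketch under loud assumptions: it uses plain set comparison, not the matrix/vector filtering the comment mentions, and nothing here reflects how Google actually does it. The function name `find_link_circles`, the `min_sources` threshold, and the site names are all made up for illustration.

```python
from collections import defaultdict

def find_link_circles(links, min_sources=3):
    """Flag groups of sites that all link to the same targets and nowhere else,
    where those targets receive no links from outside the group.

    links: dict mapping a source site to the set of sites it links to.
    Returns a list of (sources, targets) pairs. Hypothetical heuristic only.
    """
    # Build the reverse index: who links to each target?
    inbound = defaultdict(set)
    for src, targets in links.items():
        for t in targets:
            inbound[t].add(src)

    # Group sources that have exactly the same set of outgoing links.
    groups = defaultdict(set)
    for src, targets in links.items():
        if targets:
            groups[frozenset(targets)].add(src)

    circles = []
    for targets, sources in groups.items():
        if len(sources) < min_sources:
            continue
        # "Closed" means no target is linked from anywhere outside the group.
        if all(inbound[t] <= sources for t in targets):
            circles.append((sources, set(targets)))
    return circles

# Invented example data: four farm sites pushing two shops, plus two ordinary blogs.
farm = {f"farm{i}": {"shopA", "shopB"} for i in range(4)}
web = dict(farm, blog1={"news", "shopC"}, blog2={"news"})
circles = find_link_circles(web)
```

Note that as soon as one independent site links to shopA, the group no longer counts as closed in this sketch, which is roughly why the comment argues that bloggers can bomb Google while a lone spammer cannot.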
From my own experience, a properly worded search + feeling lucky is about 90% accurate in finding what I'm looking for.
Taken from: http://www.google.com/technology/index.html
PageRank Explained
PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important."
Important, high-quality sites receive a higher PageRank, which Google remembers each time it conducts a search. Of course, important pages mean nothing to you if they don't match your query. So, Google combines PageRank with sophisticated text-matching techniques to find pages that are both important and relevant to your search. Google goes far beyond the number of times a term appears on a page and examines all aspects of the page's content (and the content of the pages linking to it) to determine if it's a good match for your query.
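For anyone who wants to see the "links as weighted votes" idea in motion, here is a minimal sketch of a PageRank-style power iteration. It is an assumption-laden toy, not Google's production code: the damping factor of 0.85, the iteration count, the handling of pages with no outgoing links, and the page names are all choices made for this example, and the text-matching half described above is left out entirely.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank. links maps each page to a list of pages it links to.

    damping and iterations are assumed values for illustration.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its vote over everyone.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

# Invented example: three blogs all link to one target; "loner" gets no links.
example = {
    "blog1": ["target"],
    "blog2": ["target"],
    "blog3": ["target"],
    "target": [],
    "loner": [],
}
ranks = pagerank(example)
```

In this made-up graph, "target" ends up ranked above "loner" purely because of the three inbound links, which is the whole mechanism a Google bomb leans on.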
What?
Here is the Corante article.
who would have guessed that immolated polyp would only yield one page... oops... sorry...
Imagine you're the patriarch of a clan, and everyone in your clan has a homepage. All of your descendants' home pages have links to your home page, since you're the head dude. Your home page only has one word on it - say it's 'thrombosis'. Since Google bases the relevance of its search results on how many links there are to any page, any search for 'thrombosis' will likely show your home page as the number one search result, because you've got the word on your web page and dozens of links to your home page on other sites.
Once you think about how Google's rankings work, you can easily figure out how to game the system. That's why Dave Winer (token head of all webloggers) is usually the first result of a search on 'Dave'.
As far as googlewhacking is concerned, it's not as easy as it looks. Try 'parrhesia verboten'. I stopped once I found that one, proving to myself that it can be done. :)
This phenomenon is known as a "heisenwhack", after famed theorist Werner Heisenberg. A heisenwhack compensator has been developed, however. Adding the term "-googlewhack" to your search will fairly reliably eliminate these kinds of hits.
You have most likely inadvertently taken advantage of Slashdot to boost yourself up in the rankings. Merely being an active commenter puts your homepage link all over ... And loads of people link to Slashdot. It isn't on the same scale as the blog tactic in the story, but it still can jack a "Matt Burke" (or any other non-famous name) to the top in about 50 posts.
Mmmm...Google my precious...mustn't let the nasty bloggers get it, no, not my Google precious, no...
-- Two men say they're Jesus. One of them must be wrong. - Dire Straits
There will *always* be a cyclical contest between hackers and security, and search engine spammers and optimizers are no exception.
It should be emphasized that these spamming vulnerabilities of search engines are almost entirely due to their automated nature. Efforts to present search results not just based on author-presented data, such as the frequency, positioning, and proximity of search terms, but also by somehow computing more objective data based on the source domain of the indexed file, how often searchers choose the link, and especially a sophisticated type of citation analysis that charts authoritative pages and hubs by counting the number of links pointing to a page, do hold promise for offering more relevant search results (Brin & Page, 1998; Chakrabarti et al., 1999; Notess, 1999). It is reasonable to assume, however, that no matter how sophisticated the spamming countermeasures adopted by automated indexes become, new ways of fooling the machines could be crafted. Some amount of human editorial power therefore seems necessary.
- From a paper I wrote back when Google seemed impervious to spamming (early 1999).
Just the example of redirecting "talentless hack" to the guy's friend's site clearly demonstrates that this is an abuse that degrades the value of Google as a search engine, rather than some sort of great democratic benefit. When I use Google to find search results, I'm looking based on content and relevance, not "How many online friends got together and Google bombed". Online, with manipulable systems like that, democracy doesn't work, and that was the whole problem with META tags, which this is basically recreating. Even worse, it doesn't even have to be democracy: many Blogger sites themselves have high rankings as a whole, and with some machination someone can individually set up thousands of sites and programmatically set up Google bombs. Clearly Google will have to filter this out.
Google is like scientific measurements: if the process is affected by the measurement, then it's tainted.
This does _nothing_ to undermine the relevance of Google's rankings. When you perform a search on Google and the first "hit" is one that has been juiced in this way, you are getting a hit that a large number of individual sites, all of which are respected by other sites, agree is important to the subject. That is the beauty of Google.
Yes, this effect can be choreographed, but the result is the same. All of the sites choreographed to achieve this result are voting that site A is relevant to subject B. If the sites involved consistently show bad judgment, their rankings in Google are likely to decline, and therefore their contribution to the Google ranking for subject B will lessen.
The fact that a large number of highly ranked blogs can drive a URL up the Google pop-chart is evidence of both the respect blogs are given and the power of Google's algorithms to find such non-corporate backed content.
The value of a search engine lies in its ability to return usable results when you are actually looking for something. Most of the "exploits" people are discussing don't affect Google's usefulness as a search engine. (When is the last time you searched for "talentless hack" or, for that matter, "david gallagher"? Only someone already participating in the prank, or curious about it, would even know it existed.) And "Googlewhacking" is the most harmless of all - the only search results it can "affect" are its own, as listed winning word pairs lose their uniqueness at the next crawl. So what?
Google folks are not stupid. If the integrity of searches that people really make is affected, they will change the code.
In the meantime, is it really necessary to squelch every last bit of fun on the Net?
But it fails to mention the "dumb motherfucker" -> George Bush search hit perpetrated by the Hugh Disk site. It helped expose the potential flaw in Google's ranking algorithm.
I'm a bit surprised that when people picked up on this six months later it's considered clever and original.
Java is the blue pill
Choose the red pill
If you type "Free Porn", then you can whack your google all you want!
A warning to those considering using Google's page ranking service (which tracks your surfing habits, which isn't a problem since it is very upfront about it.) Overall, it works pretty well and it has found several pages of genuine interest to me that I would not have found otherwise. Also, I have no reason to think that they're doing anything sinister with the information (and I don't care.)
However, since I like slashdot so much (I assume that is why) it's been serving up advertisements for other projects that link to SourceForge whenever I run google searches; for example, the white supremacist publication the Free Occident, which is powered by SourceForge.
Now, I'm not one of those people who thinks Google should try and filter hate speech from search results. Likewise, I don't think that the Free Occident should somehow be prevented from using SourceForge's software - open source means open, Voltaire was right, etc. However, I think google should draw the line at serving advertisements for articles about how "If you hear about a 100-million-dollar swindle, then you know that it has to be a Jew."
I've dumped a copy of the html for the search result in my journal - paste the Extrans into an html file to see it in close-to original format. It appears from the first version in my journal that the ad appears ABOVE the search results - this is not the case.
Free Occident is a web log, but I find it far more worrisome that they've purchased an ad on google than if they were trying to blog some search term, like "White Power," or even "Occident."
Yes, I'm Jewish.
The good and new comes from no quarter where it is looked for, and is always something different from what is expected.
www.office-supplies-st0res.com/ (66.33.85.157)
office-storage.1nf0-office-equip.
pens-pencil.search-office-supplies
buy-furniture.furniture-sh0p-searc
printer-toner.supplies-1nfo-office
office-product.office-supplies-sh0
office-computers.supplies-1nfo-of
calculators.supplies-1nfo-office.
discount-office.supplies-1nfo-off
If you look at the HTML source code (after clicking on one of these results from google.com), you can see it is obviously a deliberate measure to track its referring URL and search keyword, and it logs the results to bizrate.com. Stuff like this makes me furious, especially if you take into account the potential long-term costs. Google's spider has to waste traffic by going through these sites, and searchers like me have to skip through a bunch of garbage results, resulting in more traffic. Sure, maybe a few kilobytes of data, but IMO, it contributes to the expenditures of search engines, eventually resulting in more ads, etc... Maybe I'm exaggerating a tad, but it's wasteful to say the least.
1) Google is the company with the highest number of PhD graduates. I'm sure they can find an algorithm to cancel out this effect.
2) Whenever you do a search, unless it is very specific, you automatically know not to trust the first couple results. It's a fact with all search engines. What makes google even better is that it shows you the text that links to it. So you can tell if it is a relevant link or not.
_______________________________
"I'm not Conceited...I'm just a realist..."
- First, decide what kind of difficulty level you want, e.g., pick a number N from 2 to 10.
- Open your browser, and do brain.randomize();
- Pick N characters from your brain (where N is the difficulty level).
- Enter those characters with www and com concatenated to the beginning and to the end.
- Hit enter.
- And be amazed!!
One could also decide on a number of times to repeat this process, ++ each time a site is found, and play the game with office mates so everybody will have a good time.
Since googlewhacking requires that you find just one page on the web that has two English words:
1. Obtain dictionary in electronic form.
2. Separate the words from the definitions
3. Publish to web page
4. Publish to another web page
5. If feeling particularly cruel, publish to additional web pages.
6. Wait for hate mail
Has anyone tried creating whack Chains, where searches on
word_1, word_2
word_2, word_3
...
word_n-1, word_n
will each return a single match?
Then create whack Cycles which would consist of
word_1, word_2
word_2, word_3
...
word_n, word_1
Finally, whack Sets where choosing any two words from a pool would result in a whack?
The goal of each of these would be to make them as large as possible.
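The chain, cycle, and set definitions above can be pinned down with a few small predicates. This is a sketch only: `hits` is a hypothetical callable that returns how many pages match a two-word query (no real Google API is used or implied), and the minimum lengths are assumptions made for this example. A tiny in-memory "web" stands in for the search engine so the example runs on its own.

```python
from itertools import combinations

def is_whack(hits, a, b):
    """A pair is a whack when the two-word query returns exactly one page."""
    return hits(a, b) == 1

def is_chain(hits, words):
    """Each consecutive pair (word_1, word_2), (word_2, word_3), ... is a whack."""
    return len(words) >= 2 and all(
        is_whack(hits, a, b) for a, b in zip(words, words[1:])
    )

def is_cycle(hits, words):
    """A chain whose last word also whacks with the first."""
    return (
        len(words) >= 3
        and is_chain(hits, words)
        and is_whack(hits, words[-1], words[0])
    )

def is_whack_set(hits, words):
    """Any two words chosen from the pool form a whack."""
    return len(words) >= 2 and all(
        is_whack(hits, a, b) for a, b in combinations(words, 2)
    )

def make_hits(pages):
    """Build a fake hit counter over a list of pages (each a set of words)."""
    def hits(a, b):
        return sum(1 for page in pages if a in page and b in page)
    return hits

# Invented pages: each pair of words co-occurs on exactly one page.
fake_web = [{"alpha", "beta"}, {"beta", "gamma"}, {"gamma", "alpha"}]
hits = make_hits(fake_web)
words = ["alpha", "beta", "gamma"]
```

With this made-up data, `words` is a chain, a cycle, and a set all at once; adding one more page containing both "alpha" and "beta" would break all three, which is the same way a posted whack stops being a whack after the next crawl.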
As I invented the scoring scheme that helped this craze take off a couple of months ago (multiply the number of hits for each individual word), I would like to point out that it is a game and is not going to affect anyone's search results: when you post a found GoogleWhack, all you are doing is making that odd combination one unit more popular.
My 'Pocket GoogleWhacker' tool is still available though (yes, there is a Linux version, but I haven't tested it as I don't have a Linux box). Also note that the highest-scoring googlewhacks by this method often use 'linux' as one of the search terms.
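The scoring scheme mentioned above (multiply the number of hits for each individual word) is simple enough to write down. This is just a sketch, not the Pocket GoogleWhacker tool: the function name and the hit counts are invented for illustration, and returning 0 for a pair that is not a whack is an assumption of this example rather than part of the stated rules.

```python
def whack_score(hits_a, hits_b, pair_hits):
    """Score a word pair: product of the individual hit counts.

    hits_a, hits_b: number of results for each word searched alone.
    pair_hits: number of results for both words together.
    Returns 0 unless the pair is a true whack (assumed convention).
    """
    if pair_hits != 1:
        return 0
    return hits_a * hits_b

# Made-up hit counts, purely for illustration.
common_pair = whack_score(2_000_000, 50_000, 1)
rare_pair = whack_score(300, 120, 1)
not_a_whack = whack_score(2_000_000, 50_000, 7)
```

Because the score is a product, a pair built from two very common words beats a pair of obscure ones, which fits the remark that very common terms like 'linux' turn up in the highest-scoring whacks.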