Spam Sites Infesting Google Search Results
The Google Watchdog blog is reporting that "Spam and virus sites infesting the Google SERPs in several categories" and speculates, ...Google's own index has been hacked. The circumvention of a guideline normally picked up by the Googlebot quickly is worrisome. The fact that none of the sites have real content and don't appear to even be hosted anywhere is even more scary. How did millions of sites get indexed if they don't exist?
For years Yahoo was infested with spammers on their front page, but the fact is -- Google is susceptible to an erosion of moral tenacity, just like any other corporation. Someone from within has given the keys to someone who has paid a lot of money to get them. This isn't a hack job... it's an inside job.
The dangers of knowledge trigger emotional distress in human beings.
in conjunction with the saucer people under the supervision of the reverse vampires are forcing our parents to go to bed early in a fiendish plot to eliminate the meal of dinner. We're through the looking glass, here, people...
Hacking of Google databases might explain why Google Translator used to translate the Russian name for "Ivan the Terrible" as "Abraham Lincoln".
Using one page of information for Google's spider and then using a redirect for a non-spider user. It's an SEO tactic.
In my GMail account there were over 60 pieces of spam in a mailbox that usually has maybe 1 or 2. I wonder if these are related.
Ask not what you can do for your country. Ask what your country did to you
Submitter asks: How did millions of sites get indexed if they don't exist?
Okay, I call this an idiot story. Millions of sites come into being and go out of being all the time. What does this statement have to do with anything? It seems like submitter has a lack of understanding how basic Google and the web work, but the story has made it to Slashdot. I think the Slashdot IQ level is dropping because this is a Digg story.
The article makes the claim that the "hijacked keywords" are going to redirection websites that do not "appear to be hosted anywhere".
:)
:)
That seems a little incredible to me.
Invisible, IPless, Chinese web-servers are taking over Google! Personally, I'll just let Google worry about trying to protect its search engines.
Do not spread "09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0" over the internet, thank you.
At least the spam sites are more interesting than pages and pages of price comparison crap :-)
If I had an Ass, I'd call it Fanny Bottom, then I could slap my Ass; Fanny Bottom, on the Arse.
Google will adjust, find the method of manipulating the page ranks, and close the hole.
Dominant Meme
I imagine that spammers could band together, or simply get botnets acting as independent IP addresses to 'click' links that boost their page rank. That's how it worked with Bush: they simply linked his homepage as "miserable failure" and suddenly he was the number one result for that query in Google.
I find this more likely an explanation than someone changing the data or values in the database. There's going to be plenty of evidence left in the logs & it's not like nobody's going to notice. This is Google's bread & butter, no amount of money in the world could entice a worker to mess with it. They would have to be exceptionally stupid as the lawsuits that follow would be in the billions.
My work here is dung.
This should be fairly easy for Google to get around, by re-requesting pages within a short time frame using, say, the IE user-agent string, perhaps from a different IP address. If the pages come up hugely different, toss the page out of the index altogether.
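A rough sketch of the check described above, for anyone who wants to try it: request the same URL once as a crawler and once as a browser, then compare the two responses. The user-agent strings, the `looks_cloaked` name and the 0.5 similarity threshold are all assumptions made for illustration; nothing here reflects how Googlebot actually works, and a cloaker keyed on IP address rather than user agent would slip past it.

```python
import difflib
import urllib.request

# Hypothetical user-agent strings, chosen for illustration only.
CRAWLER_UA = "Googlebot/2.1 (+http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"


def fetch(url, user_agent, timeout=10):
    """Download a page while presenting the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def looks_cloaked(url, fetcher=fetch, threshold=0.5):
    """Return True when the crawler view and the browser view differ a lot.

    threshold is an arbitrary similarity cut-off between 0.0 and 1.0.
    """
    as_crawler = fetcher(url, CRAWLER_UA)
    as_browser = fetcher(url, BROWSER_UA)
    similarity = difflib.SequenceMatcher(None, as_crawler, as_browser).ratio()
    return similarity < threshold
```

The `fetcher` argument is there so the comparison can be exercised without touching the network, e.g. by passing a function that returns canned pages.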
The story would be more interesting if it included an example hijacked search phrase.
I'd like to check it out myself.
Two problems I see are:
- Sites offering one content to Google and another to users. This is indeed something that Google frowns on, but not something that seems to be in place to be tested by the spider.
- Google's fame comes from their PageRank algorithm and unfortunately people now know how to game the results. If Google were to implement multiple algorithms then users could indicate which search type they wish to use. While it certainly makes things more complicated for Google, it also makes it more complicated for people trying to game the system, since it is harder to know which algorithm to target.
Jumpstart the tartan drive.
People, it's just a blog. If someone has really hacked Google, we will hear soon enough. Otherwise scamming and spoofing the ratings with rubbish sites is a sport that's been going on a long, long time.
"A nation that forgets its past is doomed to repeat it." - Churchill
I wonder whether some of the software lets you spam Google's listings easily? Perhaps that's how it was achieved?
TFA suggests that if you want to search actual Chinese sites, you should use google.cn, not google.com.
Erm... no, bad idea. Maybe google.cn won't have the same spam, maybe it will, but it most certainly is censored for other reasons as well. (Unless they've stopped doing this and I've completely missed the news -- there is one tank man on the first page of a google.cn image search for "tiananmen square", compared with almost the entire first page being tank men on google.com.)
And maybe a good suggestion to ignore Chinese sites, for now, but then, why would it work in China, but not here? Seems to me, this tactic would work anywhere, so the only way to be sure you're not infected is to run a secure browser and wait for Googlebot to be updated.
Don't thank God, thank a doctor!
Spam sites had been indexed before the provider learned about spamming and pulled the plug on the sites.
Read radical news here
Quotes:
"Some searches (very specific phrases, and I won't list any of them right now - Google knows which they are) return results with a large number of .cn (Chinese) sites."
"The .cn sites don't appear to be hosted ANYWHERE." (wow!)
"[...] the Word-Confirm on all of their sites, including the one I will have to use to post this, generate a large number of rogue responses, and the HELPDESK facilities with thousands of consoles and employees each all over the planet watch the responses and other traffic characteristics [...]"
How the HECK did _this_ get on /.? It's a new low, I swear.
It's my way of penalising SEO'ers. It's worth thinking about.
I think he needs to run AdAware. Seriously.. I've entered a bunch of the usual suspects into google trying to find these hordes of .cn sites that pop up. No joy yet.. Anyone else found one?
The old believe everything, the middle-aged suspect everything, the young know everything. - Oscar Wilde
Back in May Google launched an online security blog as part of a broader effort to detect malware sites, presumably to exclude them from the SERPs. They're clearly behind the curve. But this post offers an overview of Google's efforts and ambitions in this area.
RichM
Data Center Knowledge
I imagine they do this. So maybe it is something more sophisticated, but still a variation on the same theme. They could know all of Google's IP ranges, or maybe instead of doing that, they know a list of IPs that are definitely not Google.
Instead of trying to know all of Google's IP ranges and blacklist those, they could just whitelist IP ranges that they know don't belong to Google, because they know those ranges belong to various ISPs etc. So the whitelisted IPs get the spam page. The unknown IPs get the content pages.
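To make the whitelist idea concrete, here is a minimal sketch of the decision the spammer's server would be making. It is only an illustration of the speculation above: the ranges are reserved documentation networks standing in for "known ISP" blocks, and `page_for` is a made-up name.

```python
import ipaddress

# Hypothetical "known ISP" ranges. These are reserved documentation
# networks, used here only as placeholders.
KNOWN_ISP_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def page_for(visitor_ip, isp_ranges=KNOWN_ISP_RANGES):
    """Pick which page a visitor gets under the whitelist scheme.

    Visitors from whitelisted (known non-Google) ranges get the spam page;
    everyone else, including any unknown crawler, gets the content page.
    """
    address = ipaddress.ip_address(visitor_ip)
    if any(address in network for network in isp_ranges):
        return "spam"
    return "content"
```

The point of the design is that the spammer never needs to know which addresses the search engine uses to verify pages: anything not positively identified as an ordinary user sees the clean content.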
Free universal health care
The fact is that most US camera sites online are run from NY/NJ and are fly-by-night garbage that thrive on word of mouth and black-hat SEO. They make money. They spend money in adwords. Google likes money.
Bury me in mashed potatoes.
And sadly simplistic in the extreme to counter for any spammer that has at their disposal thousands upon thousands of throw away domain names. Access logs would show in short order which IP's are visiting those sites. Unless google has a huge IP block that nobody knows about, it's not going to work for more than 5 minutes or so.
I'm scared...
You can't take the sky from me...
As someone pointed out above, they actually do that, but it seems someone has managed to figure out the alternate IP addresses that they use to verify the search engine results and spoofed those as well.
It could be an ex-employee (either fired, quit, or possibly a contractor) who's sold the information to some black hats, or it could be any number of other things. There's money to be made by subverting Google's index, so you have to know that there are people working on ways to do so all the time.
Fanatically anti-fanatical
Amazed this ended up on the front page of Slashdot. The article has no "facts"; there is nothing here other than the wink wink, nudge nudge, believe me bit. There is nothing in the article to prove the assertion made here. On top of that, the whole thing sounds like someone who is having a hard time gaming the system and wants to cry conspiracy.
The World Wide Web is dying. Soon, we shall have only the Internet.
I'm not seeing any of this. I'm trying commonly spammed phrases in Google, and seeing nothing unusual.
- "digital camera" - OK
- "ink cartridge" - OK
- "flat screen TV" - PCworld at the top
- "auto parts" - OK
- "london hotels" - usual results
- "britney spears" - usual results
- "viagra" - Pfizer, Wikipedia, etc.
- "rebelde" (the Mexican telenovela, one of the top ten searches) - normal
Not one...is this really news? It's been going on forever, I hope google isn't just now noticing this.
The other day I noticed a few spam and splog results creeping into the Google Alerts I have set up. I figured someone had made a change; now it sounds like a few other changes have been made. I'm not for internet filtering and throttling (especially by ISPs), but I think we are on the verge of a fundamental change in the internet concept. I think we are going to see a good many people tired of having to buy a computer, virus software, spyware software, spam protectors, firewalls, and other protective software. I think they are going to start turning to their ISPs for protection. They will want the ISP to pare down the internet to safe sites. Some ISPs already do the virus protection thing and cut you off if you have a virus. Now, before you freak out on me, I don't like this idea, but the average user could use someone to protect their internet usage. This kind of consumer protection is normal. Grocery stores don't load the shelves with just any garbage they can find; they choose certain items, and if you shop at a high-end or natural food store, they protect you from trans fats. I think consumers are going to want this type of protection from the internet. I also think this will be a problem for Google. Their income is based on the idea that any business, person, scam artist, or thing with internet access can buy ads. I think consumers will start to want these ads to be censored and checked before they see them. I don't want to see an ad for some fake camera store when I'm googling cameras.
Well, for those of us who deal with Google for our livelihood (I currently run PPC campaigns and do some SEO work on my web sites), this was a problem.
I spent the better part of an afternoon about 2 weeks ago submitting my searches to Google, asking them to look at these sites.
They were under my keyword group and it was driving me nuts.
if you see me, smile and say hello.
Has anyone actually paid attention to the original post date of TFA?
Thursday, September 20, 2007
Must be nice to hit the jackpot and get some free /. advertising.
Has anyone ever looked into how google-analytics.com (formerly Urchin) works? This blogger http://labnol.blogspot.com/2005/11/prevent-google-analytics-from-tracking.html gives a bit of info--and it does not appear to comply with the Google "do no evil" mantra.
Ignorance is curable, stupid is forever.
Try to search for a driver - any driver! I've run into many pages that require 'registration to download' them. And of course registering costs bucks, so it's a scam.
Shh.
Worse, I think, is the act of spamming blogs with links. The theory is that, the more links there are pointing to a website, the more popular it must be; so, by using commonly-available, spam-advertised commercial software to pollute blogs with links unrelated to the subject matter, webmasters imagine they can improve their ranking without paying baksheesh to the search engine companies.
I have had an idea for a hack to WordPress, which will make all links invisible to GoogleBot (and maybe the other search engines too). This should make it pointless for anybody to spam blogs with links to their site, since the links won't be picked up by search engines. In a nod to Mel, I call this "Search Engine Pessimisation".
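The idea could be sketched roughly like this. This is not WordPress code (a real hack would be PHP inside a theme or plugin); it is just a Python illustration of the logic, and the crawler marker list, `is_crawler` and `render_comment` are invented names. Note that serving crawlers something different from what visitors see is itself a form of cloaking, and that marking comment links with `rel="nofollow"` is the more conventional way to get a similar effect.

```python
import re

# Substrings treated as crawler user agents; an illustrative, incomplete list.
CRAWLER_MARKERS = ("googlebot", "slurp", "msnbot")

# Matches an anchor element and captures its visible text.
ANCHOR = re.compile(r"<a\b[^>]*>(.*?)</a>", re.IGNORECASE | re.DOTALL)


def is_crawler(user_agent):
    """Guess from the User-Agent header whether the visitor is a crawler."""
    agent = user_agent.lower()
    return any(marker in agent for marker in CRAWLER_MARKERS)


def render_comment(comment_html, user_agent):
    """Return the comment unchanged for people, with links stripped for crawlers."""
    if is_crawler(user_agent):
        # Keep the link text but drop the anchor, so no link is seen.
        return ANCHOR.sub(r"\1", comment_html)
    return comment_html
```

A human reader still gets the clickable link; a crawler only sees the plain text, so the spammer gains nothing from planting it.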
Je fume. Tu fumes. Nous fûmes!
I read the story with interest as something like this happened to me the other day. It didn't even occur to me that Google had been hacked. I figured the original site had been compromised. A hacked web site can be defaced for shits and giggles, obviously, but it could also have a meta refresh tag added to send the browser off to wherever the defacer wants. With the security hole history of most CMS systems out there, I'm surprised that doesn't happen more often.
It looks like Firefox 3 will allow disabling of meta refresh.
The Firefox NoScript extension might be worth considering as well.
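For anyone wanting to check their own pages for an injected redirect, a small sketch along these lines would list any meta refresh targets in a page. It uses only Python's standard `html.parser`; the class and function names are made up for the example, and it only catches the meta tag variety, not JavaScript redirects.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collect the target URLs of <meta http-equiv="refresh"> tags."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=http://example.com/".
        _, _, rest = attributes.get("content", "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            self.targets.append(rest[4:].strip("'\" "))


def find_meta_refresh(html):
    """Return a list of redirect targets found in the given HTML text."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.targets
```

An empty list means no meta refresh redirect was found; anything else is worth a look if you did not put it there yourself.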
Loose lips lose spit.
It certainly is illegal, but this still happens. Even worse, it seems to further this garbage culture on Adwords and cost all of us who participate in it a great deal of money.
If there were any sort of flagging system, all people would do would be to flag competitors. Then we'd be talking about something even worse.
Bury me in mashed potatoes.
I was noticing something similar to this earlier. There were quite a few domain names ending in .cn. Seemed mostly like junk domain names, but were very odd for ending in .cn
One trick that works a lot of the time is to not visit a page that has no link to a Google Cache in the index results.
Most legitimate sites don't put the code to disable Google's Cache option, but most of the spam sites do for some reason...
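Assuming "the code to disable Google's Cache option" means the `noarchive` robots meta directive, a page can be checked for it with something like the sketch below. The names are invented for the example, and it ignores the equivalent HTTP header some servers may send instead.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() not in ("robots", "googlebot"):
            return
        for directive in attributes.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def disables_cache(html):
    """Return True if the page asks search engines not to keep a cached copy."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noarchive" in finder.directives
```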
GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011
Should I set my user agent to GoogleBot, and perhaps make a plugin that asks for the robots.txt for every page visited?
And if enough people do this, doesn't it help Google too?
Happy moony
I just did an image search and forgot a space. I got a lot of bizarre results, a large number of odd ones come from .hu
I searched on Opel Manta but forgot the space. With the space I got many matches and very little junk in the first 10 pages. Without the space I got weird results starting on the first page. What does a car name have to do with a naked chick with a Nokia phone? Mud wrestlers? Homer Simpson? Paris Hilton? Dozens and dozens of unrelated pictures, it seems.
Spyware protection is off ATM so I didn't get any farther than that.
They should get rid of the Little Red Book and go with this one.
And that dull, red flag is so outdated? Here's a much nicer one.
[End Of Line]
How is this trolling when it's true? In the history of the internet, the adult film industry has been the front-runners... no matter what it is.
The game.
www.qooqle.jp has been around for ages as a Youtube/Google Video downloading site oddly enough. Perhaps they created a wormhole and read your future post? Then again your post didn't exist until somebody read it.
And now I need to go and lie down.
I, too, think that Google probably wasn't hacked,
for the simple reason that it affects other search engines too:
keywords : "Bayesian networks and decision graphs Finn rapidshare"
(as seen in TFA - someone is looking for pirate copies of a book on Rapidshare, and mistypes the request, forgetting to use "inurl:" or "site:")
Results :
- You guessed it, no copies of this book on Rapidshare... (it would be a copyright violation, even in Switzerland where the website is hosted.
Besides, according to Swiss copyright law, you are free, as a student, to go into your faculty's library, take Finn's book and photocopy the chapters you need, because the universities are paying whatever is needed to make the books publicly available to their patrons)
- Google (.cn only)
- MSN (.cn only)
- Yahoo (not all
- Search.com (not all
All those pages redirect to a page that starts downloading an ActiveX installer containing a Trojan (...according to my clamav scan and to http://virusscan.jotti.org/ )
Note that Google's pages are subtly different: they feature entries with non-ASCII DNS names.
So two possibilities :
- either Google got hacked, and absolutely everybody else is in fact using Google's search results instead of having their own database and engine.
- or it's probably another spamdexing attempt, operated by a zombie net.
With an ugly quick script
we see that all those sites point to a couple of machines at some German hosting company.
So perhaps their server got hacked and subsequently got involved in some spamdexing scheme.
Someone should call them.
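The script itself isn't reproduced here, but the kind of check described (resolve each suspicious hostname and see which ones land on the same machine) can be sketched in a few lines. This is a guess at the approach, not the original script; `group_by_address` is a made-up name and the hostnames you feed it are up to you.

```python
import socket
from collections import defaultdict


def group_by_address(hostnames, resolve=socket.gethostbyname):
    """Group hostnames by the IP address they resolve to.

    Names that fail to resolve are collected under the key "unresolved".
    """
    groups = defaultdict(list)
    for hostname in hostnames:
        try:
            address = resolve(hostname)
        except OSError:
            # socket.gaierror is a subclass of OSError.
            address = "unresolved"
        groups[address].append(hostname)
    return dict(groups)
```

If dozens of unrelated-looking domains collapse into one or two addresses, a whois lookup on those addresses would then show who hosts them.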
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
Check out the "." before the "cn". A copy paste: "" vs "." I wonder if this has anything to do with it. And that did not work...but the /. preview says . ; for the unicode value.
and commented on by Dvorak. (God, did I just say that he confirmed anything!?!)
http://www.pcmag.com/article2/0,1895,2188281,00.asp
Also, the Reg noticed - after my Slashdot posting, for once - so they are chasing this tail!
http://www.theregister.co.uk/2007/10/01/google_spam_infiltration/
Wheee!
"Flyin' in just a sweet place,
Never been known to fail..."
He said "Abused", Beavis. Self-abuse, get it? Huh, Huh.
"Flyin' in just a sweet place,
Never been known to fail..."
I suspect the Google Desktop app. It is installed in the default build on Dells, HPs, and more, and pushed out by everyone and their dog, like Sun Java patches FFS. This app does many useful things, but I suspect that an HTML spam gets "indexed" along with every other HTTP document you view, and makes its way into Google's databases. With the sheer volume of spam, and the number of clueless users getting new machines or allowing the app to be forced on them (along with every piece of popup malware that asks), it is polluting their database. That has been my opinion since I looked at the app the first time.