Spam Sites Infesting Google Search Results
The Google Watchdog blog is reporting that "Spam and virus sites infesting the Google SERPs in several categories" and speculates, ...Google's own index has been hacked. The circumvention of a guideline normally picked up by the Googlebot quickly is worrisome. The fact that none of the sites have real content and don't appear to even be hosted anywhere is even more scary. How did millions of sites get indexed if they don't exist?
For years Yahoo was infested with spammers on their front page, but the fact is -- Google is susceptible to an erosion of moral tenacity, just like any other corporation. Someone from within has given the keys to someone who has paid a lot of money to get them. This isn't a hack job... it's an inside job.
The dangers of knowledge trigger emotional distress in human beings.
in conjunction with the saucer people under the supervision of the reverse vampires are forcing our parents to go to bed early in a fiendish plot to eliminate the meal of dinner. We're through the looking glass, here, people...
Hacking of Google databases might explain why Google Translator used to translate the Russian name for "Ivan the Terrible" as "Abraham Lincoln".
Using one page of information for Google's spider and then using a redirect for a non-spider user. It's an SEO tactic.
You should pick some up.
In my GMail account there were over 60 pieces of spam in a mailbox that usually has maybe 1 or 2. I wonder if these are related.
Ask not what you can do for your country. Ask what your country did to you
Submitter asks: How did millions of sites get indexed if they don't exist?
Okay, I call this an idiot story. Millions of sites come into being and go out of being all the time. What does this statement have to do with anything? It seems the submitter lacks a basic understanding of how Google and the web work, but the story has made it to Slashdot. I think the Slashdot IQ level is dropping because this is a Digg story.
The article makes the claim that the "hijacked keywords" are going to redirection websites that do not "appear to be hosted anywhere".
:)
:)
That seems a little incredible to me.
Invisible, IPless, Chinese web-servers are taking over Google! Personally, I'll just let Google worry about trying to protect its search engines.
Do not spread "09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0" over the internet, thank you.
Probably the reason they don't have content is that the sites respond differently to requests from Google's crawler than to requests from users. It would seem that they recognize Google's crawler, either from the user agent or from the IP range, and then respond with content. It seems they get the content by proxying US sites, which I don't think is anything new; it's just being done on a larger scale.
When they served the proxied content to Google, they could rewrite links on the fly to point to their own domains. They could basically appear to mirror the whole internet. When a request comes in from a user, since it doesn't carry a Google user agent, it would just be sent to their trojan-infested site.
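The user-agent-or-IP decision described above can be sketched roughly as follows. This is only an illustration of the logic, not code from any real spam site: `is_crawler`, `choose_response`, and the return labels are made-up names, and the network is a documentation placeholder (TEST-NET-1), not a real Googlebot range.

```python
import ipaddress

# ASSUMPTION: placeholder documentation range (TEST-NET-1).
# This is NOT a real Googlebot address range.
ASSUMED_CRAWLER_NET = ipaddress.ip_network("192.0.2.0/24")


def is_crawler(user_agent, remote_ip):
    """Guess whether a request comes from a search crawler."""
    ua_match = "googlebot" in user_agent.lower()
    ip_match = ipaddress.ip_address(remote_ip) in ASSUMED_CRAWLER_NET
    return ua_match or ip_match


def choose_response(user_agent, remote_ip):
    """Return a label for what a cloaking site would serve."""
    if is_crawler(user_agent, remote_ip):
        return "proxied-content"  # indexable text for the crawler
    return "redirect"             # everyone else gets sent elsewhere
```

Because the decision can key on the IP range as well as the User-Agent header, simply changing a browser's agent string would not necessarily reveal what the crawler sees.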
At least the spam sites are more interesting than pages and pages of price comparison crap :-)
If I had an Ass, I'd call it Fanny Bottom, then I could slap my Ass; Fanny Bottom, on the Arse.
The sites could show one content to Googlebot and another to normal visitors. Google has to test with a different agent string and if the contents differ, they just have to junk the whole domain. I am sure they already do.
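A minimal sketch of that check, assuming a hypothetical `fetch(url, user_agent)` callable that returns the page body as text. The name `looks_cloaked`, the browser agent string, and the 0.5 threshold are illustrative choices, not anything Google is known to use.

```python
from difflib import SequenceMatcher

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Illustrative browser string; any ordinary browser agent would do.
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; rv:115.0) Gecko/20100101 Firefox/115.0"


def similarity(a, b):
    """Ratio in [0, 1]; 1.0 means the two bodies are identical."""
    return SequenceMatcher(None, a, b).ratio()


def looks_cloaked(url, fetch, threshold=0.5):
    """Fetch the same URL under two agent strings and compare the bodies.

    `fetch` is a caller-supplied function (hypothetical here) taking
    (url, user_agent) and returning the response body as a string.
    """
    as_bot = fetch(url, GOOGLEBOT_UA)
    as_user = fetch(url, BROWSER_UA)
    return similarity(as_bot, as_user) < threshold
```

Legitimate pages also vary between requests (ads, timestamps), so a real check would need a more forgiving comparison than a single ratio, and it would miss sites that cloak by IP range rather than agent string.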
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
Google will adjust, find the method of manipulating the page ranks, and close the hole.
Dominant Meme
I imagine that spammers could band together or simply get botnets 'clicking' as independent IP addresses links that boost their page rank. That's how it worked with Bush, they simply linked his homepage as "miserable failure" and suddenly he was the number one result from that query in Google.
I find this more likely an explanation than someone changing the data or values in the database. There's going to be plenty of evidence left in the logs & it's not like nobody's going to notice. This is Google's bread & butter, no amount of money in the world could entice a worker to mess with it. They would have to be exceptionally stupid as the lawsuits that follow would be in the billions.
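To illustrate why a pile of inbound links can lift a page, here is a toy version of the textbook PageRank power iteration. It is a sketch of the published algorithm only, not Google's production ranking; the page names are made up, and it does not model anchor text, which is what the "miserable failure" trick relied on.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new[p] += share
        rank = new
    return rank


# Hypothetical graph: "spam" starts with no inbound links.
base = {"a": ["b"], "b": ["a"], "spam": ["a"]}
# Same graph plus twenty throwaway pages that all link to "spam".
farm = dict(base)
farm.update({f"bot{i}": ["spam"] for i in range(20)})
```

Comparing `pagerank(base)["spam"]` with `pagerank(farm)["spam"]` shows the score rising once the throwaway pages point at it.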
My work here is dung.
I was getting pharmaceutical spam that linked to Google; when you clicked the link it relayed you from Google to the phishing site (no certificate, the site looked completely bogus). I complained about it on their bulletin board; unfortunately Google makes it very difficult to give them feedback on their site.
Please sign petition to restore sanity to our banking system!!!
http://financialpetition.org/
The story would be more interesting if it included an example hijacked search phrase.
I'd like to check it out myself.
Two problems I see are:
- Sites offering one content to Google and another to users. This is indeed something that Google frowns on, but not something the spider seems to be set up to test for.
- Google's fame comes from their PageRank algorithm and unfortunately people now know how to game the results. If Google were to implement multiple algorithms then users could indicate which search type they wish to use. While it certainly makes things more complicated for Google, it also makes it more complicated for people trying to game the system, since it is harder to know which algorithm to target.
Jumpstart the tartan drive.
People, it's just a blog. If someone has really hacked Google, we will hear soon enough. Otherwise scamming and spoofing the ratings with rubbish sites is a sport that's been going on a long, long time.
"A nation that forgets its past is doomed to repeat it." - Churchill
I wonder whether some of the software lets you spam Google's listings easily? Perhaps that's how it was achieved?
TFA suggests that if you want to search actual Chinese sites, you should use google.cn, not google.com.
Erm... no, bad idea. Maybe google.cn won't have the same spam, maybe it will, but it most certainly is censored for other reasons as well. (Unless they've stopped doing this and I've completely missed the news -- there is one tank man on the first page of a google.cn image search for "tiananmen square", compared with almost the entire first page being tank men on google.com.)
And maybe a good suggestion to ignore Chinese sites, for now, but then, why would it work in China, but not here? Seems to me, this tactic would work anywhere, so the only way to be sure you're not infected is to run a secure browser and wait for Googlebot to be updated.
Don't thank God, thank a doctor!
Spam sites had been indexed before the provider learned about spamming and pulled the plug on the sites.
Read radical news here
Quotes:
"Some searches (very specific phrases, and I won't list any of them right now - Google knows which they are) return results with a large number of .cn (Chinese) sites."
"The .cn sites don't appear to be hosted ANYWHERE." (wow!)
"[...] the Word-Confirm on all of their sites, including the one I will have to use to post this, generate a large number of rogue responses, and the HELPDESK facilities with thousands of consoles and employees each all over the planet watch the responses and other traffic characteristics [...]"
How the HECK did _this_ get on /.? It's a new low, I swear.
I don't know how many times in the last year I've been looking for something, only to be taken to a page where none of the search terms even appeared and there was absolutely no content whatever - only advertising.
However, Google doesn't seem to have suffered as much as the other search engines.
-mcgrew (mcgrew.info)
PS- I was going to use a Google search results page with "mcgrew dead technologies" as an illustration of WTF TFA was talking about, but the top three results all are mine; the first two point to my site, the next points to a K5 article I wrote, the last to a K5 comment I made, and I haven't been to K5 for the last two years or more! So perhaps this is a case of mountain-molehill?
Everytime I search for digital cameras to do price checks, I get a bunch of fraud/spam sites in the Adwords.
Every fucking time.
I would nail Google to the wall for hosting scam/fraud sites if I could.
Job? I don't have time to get a job! Who will sit around and bitch about being broke and unemployed then?
It's my way of penalising SEO'ers. It's worth thinking about.
I think he needs to run AdAware. Seriously.. I've entered a bunch of the usual suspects into google trying to find these hordes of .cn sites that pop up. No joy yet.. Anyone else found one?
The old believe everything, the middle-aged suspect everything, the young know everything. - Oscar Wilde
Back in May Google launched an online security blog as part of a broader effort to detect malware sites, presumably to exclude them from the SERP results. They're clearly behind the curve. But this post offers an overview of Google's efforts and ambitions in this area.
RichM
Data Center Knowledge
Free universal health care
Well, the last of your quotes isn't from the blog article itself, but from a comment done by an anonymous poster. I'm sure you can find enough examples of much worse crap on Slashdot, especially posted by Anonymous Cowards (myself not included! :-)), thus you shouldn't rate the blog based on that.
Cool. You hooked up a couple of sentences from the blog entry with a quote from an anonymous response. (yes, the AC who wrote the response is a nutcase)
Way to smear the article!
I'm scared...
You can't take the sky from me...
Amazed this ended up on the front page of Slashdot. The article has no "facts"; there is nothing other than the wink-wink, nudge-nudge, believe-me bit here. There is nothing in the article to prove the assertion made here. The whole thing sounds like someone who is having a hard time gaming the system and wants to cry conspiracy.
I'm not seeing any of this. I'm trying commonly spammed phrases in Google, and seeing nothing unusual.
- "digital camera" - OK
- "ink cartridge" - OK
- "flat screen TV" - PCworld at the top
- "auto parts" - OK
- "london hotels" - usual results
- "britney spears" - usual results
- "viagra" - Pfizer, Wikipedia, etc.
- "rebelde" (the Mexican telenovela, one of the top ten searches) - normal
Not one...is this really news? It's been going on forever, I hope google isn't just now noticing this.
The other day I noticed a few spam and splog results creeping into the Google Alerts I have set up. I figured someone had made a change; now it sounds like a few other changes have been made.
I'm not for internet filtering and throttling (especially by ISPs), but I think we are on the verge of a fundamental change in the internet concept. I think we are going to see a good many people tired of having to buy a computer, virus software, spyware software, spam protectors, firewalls, and other protective software. I think they are going to start turning to their ISPs for protection. They will want the ISP to pare down the internet to safe sites. Some ISPs already do the virus protection thing and cut you off if you have a virus. Now before you freak out on me: I don't like this idea, but the average user could use someone to protect their internet usage.
This kind of consumer protection is normal. Grocery stores don't load the shelf with just any garbage they can find; they choose certain items, and especially if you shop at a high-end or natural food store, they protect you from trans fats. I think consumers are going to want this type of protection from the internet. I also think this will be a problem for Google. Their income is based on the idea that any business, person, scam artist, or thing with internet access can buy ads. I think consumers will start to want these ads to be censored and checked before they see them. I don't want to see an ad for some fake camera store when I'm googling cameras.
Well, for those of us whom deal with Google as their livelihood (I currently run PPC campaigns and do some SEO work on my web sites), this was a problem.
I spent the better part of an afternoon about 2 weeks ago submitting my searches to Google, asking them to look at these sites.
They were under my keyword group and it was driving me nuts.
if you see me, smile and say hello.
Has anyone actually paid attention to the original post date of TFA?
Thursday, September 20, 2007
Must be nice to hit the jackpot and get some free /. advertising.
Has anyone ever looked into how google-analytics.com (formerly Urchin) works? This blogger http://labnol.blogspot.com/2005/11/prevent-google-analytics-from-tracking.html gives a bit of info, and it does not appear to comply with Google's "don't be evil" mantra.
Ignorance is curable, stupid is forever.
Try to search for a driver - any driver! I've run into many pages that require 'registration to download' them. And of course registering costs bucks, so it's a scam.
Shh.
Worse, I think, is the act of spamming blogs with links. The theory is that, the more links there are pointing to a website, the more popular it must be; so, by using commonly-available, spam-advertised commercial software to pollute blogs with links unrelated to the subject matter, webmasters imagine they can improve their ranking without paying baksheesh to the search engine companies.
I have had an idea for a hack to WordPress, which will make all links invisible to GoogleBot (and maybe the other search engines too). This should make it pointless for anybody to spam blogs with links to their site, since the links won't be picked up by search engines. In a nod to Mel, I call this "Search Engine Pessimisation".
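One existing mechanism close to this idea is the `rel="nofollow"` link attribute, which asks search engines not to count a link toward the target's ranking. The sketch below swaps that in for the proposed hack; it is written in Python for illustration rather than as an actual WordPress plugin, and `add_nofollow` is a made-up name.

```python
import re

# Match an opening <a ...> tag that has no rel= attribute yet.
_ANCHOR = re.compile(r"<a\b(?![^>]*\brel=)([^>]*)>", re.IGNORECASE)


def add_nofollow(comment_html):
    """Add rel="nofollow" to every link in user-submitted HTML.

    Links that already carry a rel attribute are left untouched,
    which is a limitation of this simple regex approach.
    """
    return _ANCHOR.sub(r'<a\1 rel="nofollow">', comment_html)
```

A regex is fragile against malformed markup, so a real filter would more likely rewrite the links with a proper HTML parser.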
Je fume. Tu fumes. Nous fûmes!
I read the story with interest as something like this happened to me the other day. It didn't even occur to me that Google had been hacked. I figured the original site had been compromised. A hacked web site can be defaced for shits and giggles, obviously, but it could also have a meta refresh tag added to send the browser off to wherever the defacer wants. With the security hole history of most CMS systems out there, I'm surprised that doesn't happen more often.
It looks like Firefox 3 will allow disabling of meta refresh.
The Firefox NoScript extension might be worth considering as well.
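For anyone wanting to check a page for an injected meta refresh, a rough sketch using Python's standard `html.parser` follows. `MetaRefreshFinder` and `find_meta_refresh` are illustrative names, and the parsing of the `content` value is deliberately simple.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collect redirect targets from <meta http-equiv="refresh"> tags."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # The content value looks like "0; url=http://example.test/".
        _, _, rest = attr.get("content", "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            self.targets.append(rest[4:].strip("'\" "))


def find_meta_refresh(html):
    """Return the list of URLs a page would meta-refresh to."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.targets
```

This only catches the meta tag variant; a defaced page could just as easily redirect with script, which is where something like NoScript comes in.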
Loose lips lose spit.
I was noticing something similar to this earlier. There were quite a few domain names ending in .cn. Seemed mostly like junk domain names, but were very odd for ending in .cn
> It appears that the faked sites are redirecting the Googlebot to a location where content can
> be indexed, while at the same time recognizing normal users and redirecting them to a site
> that includes the malware mentioned earlier. This is an obvious violation of Google's
> guidelines, but the spammers have found ways to circumvent the rule and hide it from the Googlebot.
Huh. What do you know about that? Who'd've thunk Google's guideline would be disobeyed by spammers, given how well anti-handgun laws make criminals think twice before using a handgun in a crime.
(-1: Post disagrees with my already-settled worldview) is not a valid mod option.
Should I set my user agent to GoogleBot, and perhaps make a plugin that asks for the robots.txt for every page visited?
And if enough people do this, doesn't it help Google too?
Happy moony
I just did an image search and forgot a space. I got a lot of bizarre results, a large number of odd ones come from .hu
I searched on Opel Manta but forgot the space. With the space I got many matches and very little junk in the first 10 pages. Without the space I got weird results starting on the first page. What does a car name have to do with a naked chick with a Nokia phone? Mud wrestlers? Homer Simpson? Paris Hilton? Dozens and dozens of unrelated pictures, it seems.
Spyware is off ATM so i didn't get any farther than that.
They should get rid of the Little Red Book and go with this one.
And that dull, red flag is so outdated? Here's a much nicer one.
[End Of Line]
How is this trolling when it's true? In the history of the internet, the adult film industry has been the front-runners... no matter what it is.
The game.
www.qooqle.jp has been around for ages as a Youtube/Google Video downloading site oddly enough. Perhaps they created a wormhole and read your future post? Then again your post didn't exist until somebody read it.
And now I need to go and lie down.
I, too, think that Google probably wasn't hacked,
for the simple reason that it affects other search engines too:
keywords: "Bayesian networks and decision graphs Finn rapidshare"
(as seen in TFA - someone is looking for pirate copies of a book on Rapidshare, and mistypes the request, forgetting to use "inurl:" or "site:")
Results:
- You guessed it, no copies of this book on Rapidshare... (it would be a copyright violation, even in Switzerland where the website is hosted.
Besides, according to Swiss copyright law, you are free, as a student, to go into your faculty's library, take Finn's book and photocopy the chapters you need, because the universities are paying whatever is needed to make the books publicly available to their patrons)
- Google (.cn only)
- MSN (.cn only)
- Yahoo (not all
- Search.com (not all
All those pages redirect to a page that start downloading an ActiveX installer containing a Trojan (...according to my clamav scan and to http://virusscan.jotti.org/ )
Note that Google's pages are subtly different: they feature entries with non-ASCII DNS names.
So two possibilities:
- either Google got hacked, and absolutely everybody else is in fact using Google's search results instead of having their own database and engine,
- or it's probably another spamdexing attempt, operated by a zombie net.
With an ugly quick script
we see that all those sites point to a couple of machines at some German hosting company.
So perhaps their servers got hacked and subsequently got involved in some spamdexing scheme.
Someone should call them.
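The script itself is not reproduced in the post. A hypothetical stand-in for that kind of check, grouping hostnames by the address they resolve to, might look like this (`group_by_ip` is a made-up name, and the resolver is injectable so the function can be tried without network access):

```python
import socket
from collections import defaultdict


def group_by_ip(hostnames, resolve=socket.gethostbyname):
    """Map each resolved IP address to the hostnames pointing at it.

    Hostnames that fail to resolve are collected under the key None.
    """
    groups = defaultdict(list)
    for host in hostnames:
        try:
            ip = resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            ip = None
        groups[ip].append(host)
    return dict(groups)
```

If hundreds of result domains collapse into two or three keys, that points at a handful of machines serving all of them, which is the pattern described above.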
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
Check out the "." before the "cn". A copy paste: "" vs "." I wonder if this has anything to do with it. And that did not work...but the /. preview says . ; for the unicode value.
and commented on by Dvorak. (God, did I just say that he confirmed anything!?!)
http://www.pcmag.com/article2/0,1895,2188281,00.asp
Also, the Reg noticed - after my Slashdot posting, for once - so they are chasing this tail!
http://www.theregister.co.uk/2007/10/01/google_spam_infiltration/
Wheee!
"Flyin' in just a sweet place,
Never been known to fail..."
He said "Abused", Beavis. Self-abuse, get it? Huh, Huh.
"Flyin' in just a sweet place,
Never been known to fail..."
- Google (.cn only)
I'm surprised your script grepping '*\.cn$' worked.
Look carefully: none of them have the domain '.cn'; they are all 'cn' preceded by a different character (i.e. there is no true period in the search results prior to cn). It octal-dumps like so: '357 274 216 c n' rather than '. c n'.
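Those three octal bytes can be decoded to see what the character is. The sketch below treats them as UTF-8, which yields U+FF0E FULLWIDTH FULL STOP, and shows that a regex looking for an ASCII `.cn` suffix misses it until the text is NFKC-normalized. The hostname `example` is a placeholder, not one of the sites from the search results.

```python
import re
import unicodedata

# The bytes from the octal dump, followed by ASCII "cn".
raw = bytes([0o357, 0o274, 0o216]) + b"cn"
text = raw.decode("utf-8")
dot = text[0]

# Identify the character that stands in for the period.
print(hex(ord(dot)), unicodedata.name(dot))

# A pattern anchored on an ASCII period does not match the raw name.
ascii_tld = re.compile(r"\.cn$")
hostname = "example" + text  # placeholder hostname
print(bool(ascii_tld.search(hostname)))

# NFKC normalization folds the fullwidth form to a plain ".".
normalized = unicodedata.normalize("NFKC", hostname)
print(normalized, bool(ascii_tld.search(normalized)))
```

Software that handles internationalized domain names typically applies a similar mapping before lookup, which would explain how such a name can still lead somewhere even though it contains no literal period.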
Yes, I wish people, who have no idea of when 'whom' is correctly used, would just leave well alone. No one is going to jump down your throat if you use 'who' across the board, but putting on airs and then getting it badly wrong ... it's simply irksome.
I suspect the Google Desktop app. It is installed in the default build on Dells, HPs, and more, and pushed out by everyone and their dog, like Sun Java patches FFS. This app does many useful things, but I suspect that an HTML spam gets "indexed" along with every other http document you view, and makes its way into Google's databases. With the sheer volume of spam, and the number of clueless users getting new machines or allowing the app to be forced on them (along with every piece of popup malware that asks), it is polluting their database. That has been my opinion since I looked at the app the first time.