Webmasters Pounce On Wiki Sandboxes
Yacoubean writes "Wiki sandboxes are normally used to learn the syntax of wiki posts. But webmasters may soon deluge these handy tools with links back to their site, not to get clicks, but to increase Google page rank. One such webmaster recently demonstrated this successfully. Isn't it time for Google finally to put some work into refining their results to exclude tricks like this? I know all the bloggers and wiki maintainers would sure appreciate it."
In the real world, there are neighborhood watch signs to "deter" criminals.
Perhaps there could be a command in the robots.txt file which says "Browse my site, but don't count any links here for page ranking"? That would make your site less of a target for spammers, but not prevent you from being ranked at all.
paintball
This seems similar to the system all those porn sites used to get such a high rank in Google.
Kind of playing the system, with the content not being quite as desirable.
Evolution or ID?
Well, couldn't have been that successful, for he didn't win.
It was time to do that at least a year ago. It's pretty much impossible to find good information on any popular consumer product and this is a problem that's been around for a long time.
But they're too busy making an email application with 9 frames and 200k of Javascript to pay attention to the reason people use them in the first place. It's a little disappointing. I'm an AltaVista alumnus and I got to watch them forget about search and do a bunch of useless crap instead, then die. I was hoping Google would be different.
This happened on the POPFile Wiki. Eventually I solved it by changing the code of the Wiki itself to have an allowed list of URLs (actually a set of regexps). If someone adds a page which uses a new URL that isn't covered, it won't show up when the page is displayed and the user has to email me to get that specific URL added.
It's a bit of an administrative burden, but it stopped people messing up our Wiki with irrelevant links to some site in China.
John.
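For anyone wanting to try the same approach, here is a minimal sketch of an allowlist filter applied at display time. It is not the actual POPFile Wiki code: the patterns, `ALLOWED_URL_PATTERNS`, and `strip_unlisted_urls` are all made-up names for illustration.

```python
import re

# Hypothetical allowlist; the real POPFile patterns are not given in the post.
ALLOWED_URL_PATTERNS = [
    re.compile(r"https?://([\w-]+\.)*sourceforge\.net(/\S*)?"),
    re.compile(r"https?://([\w-]+\.)*example\.org(/\S*)?"),
]

# Rough URL matcher: anything from the scheme up to whitespace or a quote.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def is_allowed(url):
    """True if the whole URL matches at least one allowlist pattern."""
    return any(p.fullmatch(url) for p in ALLOWED_URL_PATTERNS)


def strip_unlisted_urls(text):
    """Drop every URL that matches no allowlist pattern when rendering."""
    return URL_RE.sub(
        lambda m: m.group(0) if is_allowed(m.group(0)) else "", text
    )
```

Using `fullmatch` rather than `search` matters here: it keeps a URL such as `http://sourceforge.net.evil.example/` from slipping through on a partial match.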
When I do a search in the first category, especially for things such as wallpaper or Simpsons audio clips, the sites that usually turn up are the least coherent ones with dozens of ads. I usually have to dig four or five pages to find a relevant one.
The people with these sites are playing hardball. Google wants them on their side, though, because they often display Google text ads.
Right now, my domain of choice is owned by a squatter that says "here are the results for your search" with a bunch of Google text ads. I was going to/may still put a site there that is very interesting, and the name was a key part of it.
I firmly believe that advertisements are the plague of the Internet. I would like to see sites selling their own products to fund themselves. Google doesn't really help in this regard. The text ads are less annoying than banner ads, but only slightly less annoying.
Don't get me wrong, I like Google. It's an invaluable tool when I'm doing research. I would just like to see them come out in full force against squatters.
As my site grows, I'm thinking about adding a mechanism to address those issues: when the user requests a page for the first time, he'll get a session value that says he's a valid visitor to the site. When he submits a comment, he has to have that value, or comments aren't allowed. I don't know how you'd write a script to circumvent that. (If someone can tell me, I'd love to know so I can try to prevent it!)
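A rough sketch of that session-value check, assuming the value is an HMAC of the visitor's session ID. `SECRET_KEY`, `issue_token`, and `comment_allowed` are hypothetical names, not anything from the poster's site. On the circumvention question: a script that requests the page first and sends the value back with its comment would still pass, so this probably raises the cost for blind form-posting bots rather than stopping a determined one.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-server secret; a real site would persist it across restarts.
SECRET_KEY = secrets.token_bytes(32)


def issue_token(session_id):
    """Value handed out with the first page view, bound to the session ID."""
    return hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()


def comment_allowed(session_id, submitted_token):
    """Accept a comment only if the submitted value matches this session."""
    expected = issue_token(session_id)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, submitted_token)
```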
Leave the links, edit the text to read something like "worthless scumbag, scamming git, googlebomb, please die, low quality, boring" - and lock the page.
Wait a minute - a way to spoof Google to get your page ranked better through a Wiki? OMFG! Call the internet police, call Dr. Eric E. Schmidt, call out the Google Gorilla goons! I'm sure the good Dr. has a fix like the ones he used at Novell...
The problem with the whole Google model is that it's biased to begin with. If I'm looking for granny-smith apples, chances are an internet chimp has bought the space by paying bananas to Google's goons. It becomes obvious when you see a chimp site near the top that has no business at the top. To the experienced googler, it's just an annoying fly on the screen and you just move further down.
I'm hoping that Google doesn't get too bogged down in becoming that big Ape like Micro$oft and be a little more proactive in protecting their business property. It's bad enough that they're selling top space to companies willing to pay, but here's hoping they don't slip on their own banana peels.
Management is doing things right; leadership is doing the right things. - Peter F. Drucker
The system was even easier to rig back then. Back in 96ish, I created a web page with the title "Not Sexy Naked Women". Then repeated that phrase several times and then gave a message telling people to click the link below for more Hot Sexy Naked Women which took them to a page that admonished them for looking for such trash. I added a banner ad to the top of both of these pages, submitted them to a search engine and made $500 in a month! Things are better today, but they're still not perfect.
THIS SPACE FOR RENT
posting on Wikis doesn't screw up your own blog.
posts on message boards will be deleted quickly, unless the board is expressly google bombing (as in the current Nigritude Ultramarine 1st placer) / people are stupid
i think the idea is that wikis make it easier in general for your post to stay up and not affect your blog.
And of course there are still sites that list EVERY referer in their logs somewhere on their site, so spammers have been adding their site URLs to their bot's user agent string. It's amazing the lengths these people will go to in order to spam Google.
Sure hope they can find a nice, elegant solution to this.
Isn't it time for Google finally to put some work into refining their results to exclude tricks like this?
I take extreme issue with that statement, and I'm surprised no one else has challenged it. Google does in fact put quite a bit of work into making themselves less vulnerable to these kinds of stunts. They even have a link on every results page where you can tell them if you got results you didn't expect, so they can hunt down the cause and refine their algorithm.
The system will never be perfect, and this is the latest issue that has not (yet) been dealt with. Quit your griping.
Secession is the right of all sentient beings.
What about using random image-based spam control like the one Yahoo uses on its new mail signup?
So, every time you edit/post comment, you would be presented with an image with a random distorted text, which you will have to type in to be able to edit/post. That should take care of automated systems.
Why not generate an image containing modified text like Yahoo and others? Using a little PHP magic, it shouldn't be too hard (see here to get a start).
I'm not even convinced Google's algorithm has a problem. One thing a lot of people don't realize about the page rank algorithm is that your page rank goes down if you have lots of outgoing links that aren't reciprocated with links coming back from the site you linked to. It may be that this technique simply leads to a reduction in the page rank of the sandbox, which, after all, is appropriate, since the sandbox isn't something the sandbox's owner even wants people to find by Google searching.
Sure it can be abused, but it's not Google's fault; perhaps these areas of abuse (blogs, wikis, etc.) should address the problems from their end.
Yeah, the simplest thing to do would be for the sandbox's owner simply to use the robots.txt file to forbid indexing of the sandbox page. That keeps the rest of the web site's page rank from being adversely affected, deters spammers from abusing the sandbox, and does Google's users a service by not directing them to the sandbox, which they don't want to find.
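To make that concrete, here is a small sketch using Python's standard `urllib.robotparser` to show how a well-behaved crawler would read such a rule. The `/wiki/Sandbox` path, the `Googlebot` agent name, and the `may_index` helper are assumptions for illustration; a real wiki would use its own sandbox URL.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that keeps crawlers out of the sandbox only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/Sandbox
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_index(path, agent="Googlebot"):
    """True if a crawler obeying robots.txt is allowed to fetch the path."""
    return parser.can_fetch(agent, path)
```

With these rules, `may_index("/wiki/Sandbox")` is false while the rest of the wiki stays crawlable. Note that `Disallow` matches by prefix, so any page whose path starts with `/wiki/Sandbox` is excluded as well.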
Spammers aren't stupid -- if I was an Evil Spammer(tm), I'd certainly make sure my script checked the robots.txt and didn't waste time spamming sandboxes that weren't going to be indexed.
Find free books.
With regards to just editing the sandbox which nobody monitors anyway, why not just include a rule to deny adding URLs. There is no conceivable reason to allow a user to add a URL in the sandbox.
And if you're thinking "I want to practise adding links with the required syntax", it's not hard. The only thing you need to use the sandbox for beyond learning how other basic syntax works (and you can apply that to links without practising) is structuring.
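A deny rule like that could be as small as the sketch below. It assumes the wiki exposes some hook that sees the text of a sandbox edit before saving it; `URL_LIKE` and `sandbox_edit_allowed` are hypothetical names, and the pattern is deliberately crude.

```python
import re

# Anything that looks like a link: an explicit scheme or a bare "www." host.
URL_LIKE = re.compile(r"(?i)\b(?:https?://|ftp://|www\.)\S+")


def sandbox_edit_allowed(text):
    """Reject a sandbox edit outright if it contains anything URL-like."""
    return URL_LIKE.search(text) is None
```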
Spammers are going there because you have a high PR. So cut the PR supply and you're in business: rewrite outgoing links as http://www.site.com/~url=http://www.link.com and voila, URL rewriting. No more PR for Mr. Spammer.
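Here is one way the rewriting step might look, as a sketch only. It assumes links are rendered as plain `href="..."` attributes and that the site has a redirect endpoint; `REDIRECT_PREFIX` and `rewrite_external_links` are invented for the example, and whether a search engine really ignores links routed this way is the poster's claim, not something the code guarantees.

```python
import re
from urllib.parse import quote, urlsplit

OWN_HOST = "www.site.com"           # host name borrowed from the post
REDIRECT_PREFIX = "/redirect?url="  # hypothetical redirect endpoint

HREF_RE = re.compile(r'href="(https?://[^"]+)"')


def rewrite_external_links(html):
    """Point every off-site href at the local redirect endpoint instead."""
    def repl(match):
        target = match.group(1)
        if urlsplit(target).hostname == OWN_HOST:
            return match.group(0)  # leave links to our own site untouched
        return 'href="' + REDIRECT_PREFIX + quote(target, safe="") + '"'
    return HREF_RE.sub(repl, html)
```

The redirect endpoint itself would presumably need to be excluded from crawling too, otherwise a crawler could simply follow it to the target.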
I thought it was a real-time thing, where the account creation bots passed the image that loaded during the signup process to a porn site and the images were decoded by a real person, and the result passed back to the bot who then signed up for the account.
To avoid the timing problems with porn signons needing to happen concurrent with account signups, the account generation process was actually initiated by a porn signon. It limits your account generation ability, but only to the extent that you have porn traffic.
Did I just imagine this, or does it work that way?