Several Link-Spam Architectures Revealed
workie writes "Using data derived from website infections, RescueTheWeb.org has found several interesting link-spam architectures. One architecture is where concentric layers of hijacked websites are used to increase the page rank and breadth of reach (within search engine search results) of scam sites. The outer layers link to the inner layers, eventually linking to a site that redirects the user to the scam site. Another architecture involves hijacked sites that redirect the user to fake copies of Google, having the appearance that the visitor is still within Google, but in reality they are on a Google lookalike that contains only nefarious links."
Consider doing all your banking, and any other sensitive stuff, on a computer totally separate from your web-surfing computer. Kind of like having a dummy wallet containing only petty cash and your ID when you go out at night versus your credit cards, etc.
I thought that google had ways of detecting these and down-ranking them?
Seven Days with Ubuntu Unity
revealed!
While its assertions are believable, I'd now like to see the methods and data
umm, i would read the fine article, but afraid to click the link..
Sounds familiar: http://seoblackhat.com/2009/07/10/link-pyramids/
By the way, if blackhat SEO's describe this technique in the open, it's either already well known, or its effectiveness has been diminished to the point where hiding the details isn't worth it.
These guys are doing good work, but really, all they're doing is checking for some specific types of black-hat SEO. This is inherently a losing battle, because there's active opposition. It's a "negative file" approach - making a list of the bad guys. Credit cards once worked that way; merchants were sent daily lists of canceled or stolen credit cards. Back then, getting a credit card was tough; the customer had to be a good customer of the bank. Not until credit card transactions were validated remotely against a "positive file" that checked the actual account could everyone have one. Web search is still in the "negative file" era.
As I point out occasionally, the main search engines have very low standards for business legitimacy. It's an ongoing, and losing, battle to filter out the totally bogus sites. But if you insist on some minimal standard of business legitimacy for a commercial web site, you kick out most of the "bottom feeders" with no business address, and along with them, most of the total phonies. We do this at SiteTruth, which exists to demonstrate that it's possible. SiteTruth tries to find some indication that a domain maps to a real-world business. If it can't, the site is moved down in search engine position. That's enough to move most "bottom feeder" downward, below the legit ones. It's not always successful in finding the business behind the site, but it looks harder than the average user would, looking through the site's "About", "Help", "Contact", etc. pages for a mailing address. If a search engine takes a hard line on this, the junk sites can be kicked out.
Once you have a business address for a web site, there are extensive resources for finding out more about the business. It's easy to get annual sales and number of employees if you know what database to buy. Corporate registration information and D/B/A name information is available. Business credit rating info is available in bulk for a fee. Crank that info into search engine positioning and you've got hard data driving search. Rating web sites by looking only at the web is a process easy to manipulate. Use info from the real world, and it's much harder.
Phony mailing addresses do show up, but that's usually associated with phishing sites. Not showing a business address is a misdemeanor in some jurisdictions, but common. Using the address of another business is felony fraud and identity theft. That gets law enforcement attention. So only outright criminals try that. To catch that, we fetch the entire PhishTank database every few hours and blacklist the entire domain for a single phishing entry. That's draconian, but if you're running a site that lets users upload entire pages, it's your job to kick the phishers off. Most of the innocent victims there are free hosting services with weak abuse departments. If you're in the free hosting business or the URL redirection business, you need a strong abuse department, or you will be pwned. Right now, "t35.com" is getting hit hard. By now, most free hosting sites with a clue automatically check PhishTank and the APWG list to see if they're on it. "t35.com" is still doing it by hand, and they're losing the battle.
So why doesn't Google do this? Google's business model depends on those ad-heavy "bottom feeder" sites. About 36% of Google's "content network" domains are "bottom feeders". When organic search takes you to the right place on the first try, Google doesn't make any money. But if you're led through an ad-heavy site, the Google cash register clicks. Google's business model thus takes them to the dark side. Google would take a big financial hit if they did even some basic legitimacy checking on their advertisers. Search Google for "craigslist auto posting tool", which brings up five Google ads for companies offering to spam Craigs