Several Link-Spam Architectures Revealed

Posted by timothy on Saturday April 24, 2010 @09:23PM from the labyrinthine-luring dept.

workie writes "Using data derived from website infections, RescueTheWeb.org has found several interesting link-spam architectures. In one architecture, concentric layers of hijacked websites are used to increase the page rank and breadth of reach (within search engine results) of scam sites. The outer layers link to the inner layers, eventually linking to a site that redirects the user to the scam site. Another architecture involves hijacked sites that redirect the user to fake copies of Google, so that the visitor appears to still be within Google but is in reality on a Google lookalike that contains only nefarious links."
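
The layered structure in the summary can be pictured as a small directed graph. The following is only a rough sketch under stated assumptions: the site names are placeholders, and the rule that every site in a layer links to every site in the next inner layer is a simplification chosen for illustration, not something the summary states.

```python
from collections import deque


def build_layered_links(layers, redirector):
    """Build a link graph from concentric layers of hijacked sites.

    `layers` is ordered from outermost to innermost and must be non-empty.
    Assumption for illustration: each site links to every site in the next
    inner layer, and the innermost layer links to the redirecting site.
    """
    links = {}
    for outer, inner in zip(layers, layers[1:]):
        for site in outer:
            links[site] = list(inner)
    for site in layers[-1]:
        links[site] = [redirector]
    links[redirector] = []
    return links


def hops_to(links, start, target):
    """Breadth-first search: links followed from start to target, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        site, depth = queue.popleft()
        if site == target:
            return depth
        for nxt in links.get(site, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return None


def inbound_counts(links):
    """Count inbound links per site (a crude stand-in for link popularity)."""
    counts = {site: 0 for site in links}
    for targets in links.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


# Hypothetical two-layer example with placeholder host names.
layers = [
    ["outer1.example", "outer2.example", "outer3.example"],
    ["inner1.example", "inner2.example"],
]
links = build_layered_links(layers, "redirector.example")
print(hops_to(links, "outer1.example", "redirector.example"))  # 2
print(inbound_counts(links))
```

In this toy graph the inner sites collect more inbound links than the outer ones, which is the effect the summary attributes to the layering.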

38 comments

For the paranoid... by Antony-Kyre · 2010-04-24 21:30 · Score: 5, Interesting

Consider doing all your banking, and any other sensitive stuff, on a computer totally separate from your web-surfing computer. Kind of like having a dummy wallet containing only petty cash and your ID when you go out at night versus your credit cards, etc.
1. Re:For the paranoid... by Anonymous Coward · 2010-04-24 21:42 · Score: 0
  
  better is a Live CD
2. Re:For the paranoid... by SlothDead · 2010-04-24 22:31 · Score: 3, Insightful
  
  When a vulnerability is found on your LiveCD you won't be able to patch it.
3. Re:For the paranoid... by asdf7890 · 2010-04-24 23:04 · Score: 3, Informative
  
  No, but you can just download a new version of the CD.
4. Re:For the paranoid... by Darkman,+Walkin+Dude · 2010-04-24 23:05 · Score: 1
  
  This doesn't represent an active threat though, it's just for those who get fooled by the camouflage of scam sites. And if they get pwned on one computer, they can get pwned on another just as easily.
  
  --
  What he can't kill, he has sex on. Trent.
5. Re:For the paranoid... by gzipped_tar · 2010-04-24 23:14 · Score: 2, Interesting
  
  Ever heard of LiveOS persistent storage?
  
  --
  Colorless green Cthulhu waits dreaming furiously.
6. Re:For the paranoid... by oiron · 2010-04-24 23:25 · Score: 2, Interesting
  
  This is 2010; install it onto a VM...
7. Re:For the paranoid... by Runaway1956 · 2010-04-24 23:28 · Score: 2, Insightful
  
  That isn't paranoia - it's good common sense. Statistics tell us that an ungodly number of computers are compromised. Why do your banking and other sensitive online transactions from a potentially compromised machine? Use those LiveCDs, or a virtual machine, or almost ANYTHING other than your Windows browsing and porn watching machine!!
  
  --
  "Windows is like the faint smell of piss in a subway: it's there, and there's nothing you can do about it." - Charlie Br
8. Re:For the paranoid... by Anonymous Coward · 2010-04-24 23:34 · Score: 0
  
  "Statistics tell us that an ungodly number of computers are compromised."
  I'm curious as to what would be a godly number of compromised computers?
9. Re:For the paranoid... by Trepidity · 2010-04-25 00:28 · Score: 2, Funny
  
  exactly three, but that three is, at the same time, only one
  
  --
  10 PRINT CHR$(205.5+RND(1)); : GOTO 10
10. Re:For the paranoid... by gzipped_tar · 2010-04-25 00:34 · Score: 1
  
  It's less than ten but more than one, but it's not nine, eight, seven, six, five, four, three, or two.
  (With apologies to Jorge Luis Borges http://www.christopherculver.com/en/translations/ornithologicum.php)
  
  --
  Colorless green Cthulhu waits dreaming furiously.
11. Re:For the paranoid... by couchslug · 2010-04-25 00:46 · Score: 3, Informative
  
  "When a vulnerability is found on your LiveCD you won't be able to patch it."
  Slashdotters should know better...
  You can boot from a live Linux CD and remaster it, which is very cool.
  http://www.knoppix.net/wiki/Knoppix_Remastering_Howto
  You can also keep a variety of live OS including custom WinPE versions.
  http://www.911cd.net/forums/
  
  --
  "This post is an artistic work of fiction and falsehood. Only a fool would take anything posted here as fact."
12. Re:For the paranoid... by Tom9729 · 2010-04-25 01:09 · Score: 1
  
  But that negates the whole point of using a livecd for this in the first place.
13. Re:For the paranoid... by capo_dei_capi · 2010-04-25 01:30 · Score: 1
  
  Seeing that them hijacking your host OS would give them access to your VMs, using a VM for general surfing purposes seems to be a better choice, especially if it runs an obscure OS, like OpenSolaris.
14. Re:For the paranoid... by Nerdfest · 2010-04-25 01:45 · Score: 1
  
  A VM isn't foolproof. There are exploits that will allow you to 'step out of the box'. A lot of the time the VM runs at a high privilege level as well. It's protection, just not as foolproof as a live CD.
15. Re:For the paranoid... by Hurricane78 · 2010-04-25 04:26 · Score: 1
  
  Protip: It’s called FinTS. With chip card. Look it up. :)
  I have used it since it was still experimental and called HBCI 1.0.
  No browser involved. You have a separate reader with keys (and optionally a display) that you interact with. Unless someone modifies the reader, there is no way anyone else can get your code. In short it’s two-factor authentication on a trusted client. The PC just shoves encrypted packets back and forth between the reader and the bank server.
  I recommend having a reader with a display. That way it can even guarantee you that the amount displayed is the amount you actually agree to be transferred. Additionally it allows you to load stored-value cards, do legally valid digital signatures, and read a lot of other chip cards.
  
  --
  Any sufficiently advanced intelligence is indistinguishable from stupidity.
16. Re:For the paranoid... by couchslug · 2010-04-25 04:44 · Score: 2, Insightful
  
  This is 2010, run your VM off a live CD!
  http://wiki.xensource.com/xenwiki/LiveCD
  
  --
  "This post is an artistic work of fiction and falsehood. Only a fool would take anything posted here as fact."
17. Re:For the paranoid... by Anonymous Coward · 2010-04-25 05:13 · Score: 0
  
  This is 2010.. you still have money to bank with?
18. Re:For the paranoid... by lsatenstein · 2010-04-26 15:25 · Score: 1
  
  I have a live CD and boot that when I want to do my banking. Since I also live near a branch of the bank, my wife goes there to do most of the non-electronic transactions, such as extracting grocery money, etc. Why extract money? Well, I don't want to be a victim of a business whose site gets compromised and find that their site security was or is a copy of the security shortcomings experienced by TJMAXX. I want to own my personal information and not worry about it after it was stolen.
  
  --
  Leslie Satenstein Montreal Quebec Canada
Link Spam? by AndGodSed · 2010-04-24 21:30 · Score: 2, Insightful

I thought that Google had ways of detecting these and down-ranking them?

--
Seven Days with Ubuntu Unity
1. Re:Link Spam? by Anonymous Coward · 2010-04-24 21:50 · Score: 0
  
  Google had ways of making you think they had down-ranked the spam links.
2. Re:Link Spam? by FuckingNickName · 2010-04-24 23:03 · Score: 1, Interesting
  
  Precisely. In fact, with Google for Domains etc., they know well how profitable this link spam is. Hell, 10 people employed 8 hours a day flagging sites would tackle the vast majority of repeated and obvious search engine spammers. But then Google would have to admit that they haven't refined interesting algorithms since the '90s, and might have to give actual work to the 2nd rate PhDs they hire to twiddle their thumbs.
3. Re:Link Spam? by asdf7890 · 2010-04-24 23:11 · Score: 3, Insightful
  
  Every time Google adjust the rankings to account for the current crop of deceptive SEO techniques, people think up new deceptive SEO techniques. It is a moving target and Google can't move too fast without thinking as they risk disrupting unaffected parts of the algorithm resulting in reducing its effectiveness when presented with genuine links.
  Also Google may be the biggest name in town but they are not the only big name by a long shot. An SEO technique is not completely invalidated until such time as all popular engines have a way to discount it.
  And the summary (didn't RTFA, sorry) doesn't state that the techniques were proven to be working, just that this is what people are trying.
4. Re:Link Spam? by Anubis+IV · 2010-04-25 08:16 · Score: 1
  
  ...an SEO technique is not completely invalidated until such time as all popular engines have a way to discount it.
  So, basically...as soon as Google changes their algorithm? Because I'm drawing a blank on any other "popular" engines here...
  
  http://www.netmarketshare.com/search-engine-market-share.aspx?qprid=4
muh dick by Anonymous Coward · 2010-04-24 21:36 · Score: 0

revealed!
That was actually an interesting read by bguiz · 2010-04-24 22:19 · Score: 1

While its assertions are believable, I'd now like to see the methods and data.
1. Re:That was actually an interesting read by bguiz · 2010-04-24 22:32 · Score: 1, Flamebait
  
  Also, I dislike their main tagline
  
  "The web is under attack from hackers. RescueTheWeb.org is working to reduce their chances of success."
  I take issue with their ignorance toward the difference between a hacker and a cracker. (links to Eric Raymond's "The Jargon File")
2. Re:That was actually an interesting read by Anonymous Coward · 2010-04-24 22:49 · Score: 0, Insightful
  
  The rest of us moved on about 20 years ago -don't you think it's time YOU did too?
3. Re:That was actually an interesting read by For+a+Free+Internet · 2010-04-24 23:04 · Score: 0
  
  I found a great web site for all fun people who like vriot7liugiy7z! Get the best vriot7liugiy7z free at my web stite! vriot7liugiy7z vriot7liugiy7z vriot7liugiy7z!!!!!! EXCludsibve! The bewsrt!!!!! vriot7liugiy7z!
  
  --
  UNITE with the Campaign for a Free Internet because today, our future begins with tomorrow!
4. Re:That was actually an interesting read by bguiz · 2010-04-24 23:41 · Score: 0, Troll
  
  Not at all.
  When you say "The rest of us", you should say just yourself.
rtfa? by Anonymous Coward · 2010-04-25 00:01 · Score: 0

umm, i would read the fine article, but afraid to click the link..
Link pyramids by kmike · 2010-04-25 00:38 · Score: 1

Sounds familiar: http://seoblackhat.com/2009/07/10/link-pyramids/
By the way, if blackhat SEOs describe this technique in the open, it's either already well known, or its effectiveness has been diminished to the point where hiding the details isn't worth it.
1. Re:Link pyramids by workie · 2010-04-25 03:19 · Score: 1
  
  The RescueTheWeb article is a high-level discussion of link architectures that currently exist in the wild. The article wasn't trying to show samples, since disclosing which websites are breached is against the privacy policy of RescueTheWeb. These are private websites that have been breached by others and used to create these various structures. Thus, their web addresses would reveal whose websites were breached. I can tell you that an example 'constellation' Google look-alike search engine consists of some 26 domains of this pattern: http://googpill_.com/ where the '_' is a letter from 'a' to 'z'. When you visit these sites directly they say 'Under Construction', but when you visit them from a hacked site you get the Google look-alike. (Not all of the lettered domains appear to be working.) Follow this link to see an example: http://googpillc.com/zgyllgiaahkeiryy_idknxqkbi.py This constellation example is different from the pyramid example from seoblackhat. The goal of this constellation, as an example, is to confuse the user into thinking they are on Google; it is not to increase page rank.
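
For anyone who wants to filter on the domain pattern described above, here is a minimal sketch that expands `googpill_.com` into its 26 candidate names and checks a host name against the pattern. The function names are made up for illustration, and the sketch deliberately does not fetch anything; it only works with strings.

```python
import re
import string


def constellation_domains(pattern="googpill_.com", placeholder="_"):
    """Expand the placeholder into each letter 'a' to 'z' (26 candidates)."""
    return [pattern.replace(placeholder, letter, 1)
            for letter in string.ascii_lowercase]


def in_constellation(host):
    """Return True if the host name matches the googpill<letter>.com pattern."""
    return re.fullmatch(r"googpill[a-z]\.com", host.lower()) is not None


domains = constellation_domains()
print(len(domains))                       # 26
print(domains[:3])                        # ['googpilla.com', 'googpillb.com', 'googpillc.com']
print(in_constellation("googpillc.com"))  # True
print(in_constellation("google.com"))     # False
```

A list like this could feed a local blocklist, keeping in mind the comment's note that not all of the lettered domains appear to be working.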
2. Re:Link pyramids by TaoPhoenix · 2010-04-25 04:26 · Score: 1
  
  I had basically known it, but it's still daunting to face as an actual search customer.
  I like trying out freeware utilities. But sometimes it's tricky to know which are real links (could be some 15 real ones) and which are nastylinks (could be 85) for my 100-result first page of returns.
  
  --
  My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
The problem: low standards in search engines. by Animats · 2010-04-25 05:21 · Score: 1

These guys are doing good work, but really, all they're doing is checking for some specific types of black-hat SEO. This is inherently a losing battle, because there's active opposition. It's a "negative file" approach - making a list of the bad guys. Credit cards once worked that way; merchants were sent daily lists of canceled or stolen credit cards. Back then, getting a credit card was tough; the customer had to be a good customer of the bank. Not until credit card transactions were validated remotely against a "positive file" that checked the actual account could everyone have one. Web search is still in the "negative file" era.
As I point out occasionally, the main search engines have very low standards for business legitimacy. It's an ongoing, and losing, battle to filter out the totally bogus sites. But if you insist on some minimal standard of business legitimacy for a commercial web site, you kick out most of the "bottom feeders" with no business address, and along with them, most of the total phonies. We do this at SiteTruth, which exists to demonstrate that it's possible. SiteTruth tries to find some indication that a domain maps to a real-world business. If it can't, the site is moved down in search engine position. That's enough to move most "bottom feeders" downward, below the legit ones. It's not always successful in finding the business behind the site, but it looks harder than the average user would, looking through the site's "About", "Help", "Contact", etc. pages for a mailing address. If a search engine takes a hard line on this, the junk sites can be kicked out.
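
The "move it down if no business is found" idea can be sketched as a stable re-sort of an existing result list. This is an illustration only, not SiteTruth's actual code: the function name, the `has_business_address` lookup, and the example domains are all assumptions.

```python
def rerank_by_legitimacy(results, has_business_address):
    """Move sites without a known business address below those with one.

    `results` is the original ranking (best first). `has_business_address`
    is a hypothetical mapping from domain to True/False. Python's sort is
    stable, so the original order is preserved within each group.
    """
    return sorted(
        results,
        key=lambda domain: 0 if has_business_address.get(domain, False) else 1,
    )


original = ["junk.example", "shop.example", "spam.example", "store.example"]
verified = {"shop.example": True, "store.example": True}
print(rerank_by_legitimacy(original, verified))
# ['shop.example', 'store.example', 'junk.example', 'spam.example']
```
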
Once you have a business address for a web site, there are extensive resources for finding out more about the business. It's easy to get annual sales and number of employees if you know what database to buy. Corporate registration information and D/B/A name information is available. Business credit rating info is available in bulk for a fee. Crank that info into search engine positioning and you've got hard data driving search. Rating web sites by looking only at the web is a process easy to manipulate. Use info from the real world, and it's much harder.
Phony mailing addresses do show up, but that's usually associated with phishing sites. Not showing a business address is a misdemeanor in some jurisdictions, but common. Using the address of another business is felony fraud and identity theft. That gets law enforcement attention. So only outright criminals try that. To catch that, we fetch the entire PhishTank database every few hours and blacklist the entire domain for a single phishing entry. That's draconian, but if you're running a site that lets users upload entire pages, it's your job to kick the phishers off. Most of the innocent victims there are free hosting services with weak abuse departments. If you're in the free hosting business or the URL redirection business, you need a strong abuse department, or you will be pwned. Right now, "t35.com" is getting hit hard. By now, most free hosting sites with a clue automatically check PhishTank and the APWG list to see if they're on it. "t35.com" is still doing it by hand, and they're losing the battle.
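
The "one phishing entry blacklists the whole domain" policy might look roughly like the sketch below. Assumptions: the input is a plain list of reported phishing URLs (the real PhishTank feed format is not shown here), and the "domain" is naively taken to be the last two labels of the host name, which is wrong for suffixes such as `.co.uk`; a real system would need a public-suffix list.

```python
from urllib.parse import urlsplit


def naive_domain(host):
    """Naive assumption: the registered domain is the last two labels."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def blacklist_from_phishing_urls(phishing_urls):
    """Collect the domain of every reported phishing URL."""
    domains = set()
    for url in phishing_urls:
        host = urlsplit(url).hostname
        if host:
            domains.add(naive_domain(host))
    return domains


def is_blacklisted(url, blacklist):
    """True if the URL's domain appears in the blacklist."""
    host = urlsplit(url).hostname
    return bool(host) and naive_domain(host) in blacklist


# Hypothetical report: a single phishing page on a free-hosting subdomain.
blacklist = blacklist_from_phishing_urls(["http://evil.t35.com/login.html"])
print(blacklist)                                                   # {'t35.com'}
print(is_blacklisted("http://other.t35.com/index.html", blacklist))  # True
print(is_blacklisted("http://example.org/", blacklist))              # False
```

Note how a single report against one user's page takes every other page on the same hosting domain down with it, which is exactly the draconian trade-off described above.
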
So why doesn't Google do this? Google's business model depends on those ad-heavy "bottom feeder" sites. About 36% of Google's "content network" domains are "bottom feeders". When organic search takes you to the right place on the first try, Google doesn't make any money. But if you're led through an ad-heavy site, the Google cash register clicks. Google's business model thus takes them to the dark side. Google would take a big financial hit if they did even some basic legitimacy checking on their advertisers. Search Google for "craigslist auto posting tool", which brings up five Google ads for companies offering to spam Craigs
1. Re:The problem: low standards in search engines. by the_womble · 2010-04-25 16:25 · Score: 1
  
  This only works if someone is searching for a business or product. Most searches are for information. There are LOTS of valuable websites run by individuals. You rank them all low?
  Why on earth do we want rankings to reflect credit ratings? You can trust sources with good credit ratings more? Lots of businesses with good credit ratings one year, have ended up with their CEO in the dock the next (e.g. Enron).
  You need a lot more data coverage than you have: you cannot verify Glaxosmithkline, Vodafone (main corporate site - country sites you do), Freshfields (a major law firm) or Oxfam International (but you can verify Oxfam UK).
  Nice idea, but your current implementation sucks (yes, it is alpha, but it's not very encouraging). It is better than Cuil.
2. Re:The problem: low standards in search engines. by Animats · 2010-04-26 07:18 · Score: 1
  Re SiteTruth complaints: (We have a blog for that.)
  Non-commercial web sites aren't rated at all. However, the presence of an ad link marks a site as "commercial", as does being in ".com". Our "commercial intent" detection is rather simplistic. We really should have a classifier system doing that. Yahoo search R&D, back when they had search R&D, built one of those, but never did much with it. We've been reluctant to use machine learning techniques, though, because they reduce the transparency of the system. At present, SiteTruth doesn't rely on "security by obscurity". Adding a classifier system would change that.
  Credit rating information is useful because, for businesses, you can get business size information. Annual sales and number of employees are worth knowing, and displaying to the user in search results. (We'll be doing something in that area soon.) There's a guy in Brooklyn, NY, who took pictures of camera stores that advertise on line or for mail order. There are companies with giant warehouses and loading docks, and there are, well, "marginal locations". It's very funny. Search engines need info like that.
  As for specific sites:
  
  Glaxosmithkline: We give them a yellow "?", which means we think they're legit, but don't have third-party verification that the domain is tied to the company. In our hard-ass view, that's an OK rating. SSL certs and BBBonline links provide such third-party verification. They did match our database. We weren't able to parse "Registered office: 980 Great West Road, Brentford, Middlesex, TW8 9GS, United Kingdom.", unfortunately; we only recognize multi-line postal addresses, usable on an envelope, at present.
  
  Vodafone: All the country sites have SSL certs, but the main ".com" site does not. It does have the address "Vodafone Group Plc / Vodafone House / The Connection / Newbury / Berkshire / RG14 2FN / England" on multiple lines, which we pulled out of the source HTML as a possible address, but did not parse successfully. Still, they got a yellow "?", and were matched to the UK business database.
  
  Oxfam gets a green checkmark, and the system was able to pull four business addresses from their web site.