Modern Day Search Engine Manipulations
An anonymous reader writes "I fondly recall the days of yore when search engines could be manipulated just by sticking thousands of extraneous filler words in the META tags or hidden at the bottom of the page. Nowadays search engines work by more advanced techniques that generally don't fall prey to these simplistic tactics, but it'd be folly to presume them impervious. Does it still happen?"
There have been many articles all over the place about people spoofing Google by making tons of pages that link back to the real page they want to boost.
Remember, search engines now ask for money and they will make sure your page gets to the top of the list.
It involved selling their soles to a dyslexic, spelling challenged shoe-salesman with horns :-)
in this world...
Yes
Yes, it still happens a lot... there's widespread knowledge of so-called "google bombing". Google ranks some of its search results based on the text between the opening and closing A HREF tags, as you can read about here: Google Time Bomb...
Much like security, I think this is the kind of thing that hackers and tinkerers will always find a way to exploit. The question is who can stay ahead in the race?
http://www.babysmasher.com
http://www.openingbands.com
The new status quo for search engines seems to be to charge for submission, as many of them now require you to go through a third party that charges to add your site to the database. The variation of that (e.g. Yahoo) has 'sponsored' sites in each category that appear at the top of the page. A friend runs a site that uses this 'sponsored' system and I'm told those sponsors bid against each other and whoever has the highest bid appears.. kinda like an eBay for search engines.
-- Greg
Slashdot, would a spell-checker for posting be too much to ask? It's not rocket science!
Nine times out of ten, when using Google, exactly what I am looking for is in one of the first few links.
I had a boss that was asking me "How do we improve our site on google?"
Answer: Provide actual information instead of some glossy marketdroid garbage that is so prevalent in webpages today and you wouldn't have to worry about the search engines, would you?
What's a praying mantids, and what are these hordes of google users doing to stop them!?
when we refer to things that changed only a few years ago as "modern day"??
Not very impressive, is it? Although it does have the word cack in it!
As far as I know, search engines only look at text content. So, if you want your site to be indexed, don't put a ton of relevant text in Flash or in images.
When I did a search on reliable web server through Google I found this link from Microsoft Dot-Com Companies Powered by Windows 2000 near the top. If Microsoft did manipulate Google to get higher placement they sure choose a poor page to boost.
"Good things don't end with eum, they end with mania or teria." - H. Simpson
I think that tactic was mostly used by porn/warez/mp3/etc sites to advertise...people would constantly be flooded with links to the Pam and Tommy Lee videos, for instance. Now that file sharing has become so common, people don't go to those sites anymore. Also, I can't verify this, but it's possible that because of so many people using broadband connections, they can host homegrown ftp sites, as another alternative to the relatively fruitless search for decent illegal sites (though I would imagine pop-up disabling in mozilla would facilitate this).
Don't cross him; don't boss him; he's ridin' and hidin' his pain. Don't fight him; don't spite him; just wait till tomo
why not just do what these guys do and rotate the results so everyone has a chance at the top. Rolist
i met a guy at my lug who runs a consulting company, and all the webpages that they do, there is an "invisible link" put at the bottom that refers back to their site. invisible in the fact that it's the same color as the background... supposedly it's pretty effective as his site.... if only i could remember it... is highly ranked on google.
I write code.
I completely agree with the article that oddball pages on one of the big "free website" sites seem to get premiere billing on Google. Furthermore, the accuracy of Google definitely relates to what's being searched for: I've searched for plenty of things where I've gotten page after page of links that don't even have all of the search terms, much less useful content. I don't expect it to read my mind, but sometimes it isn't completely accurate.
Hummmm,
Google search
In fact, KDENews appears on page one and page two; slashdot appears on page 2 right after kdenews...
I'd rather be sailing...
Here's one I use all the time.. just follow these easy steps:
Now, watch your Google ranking rise to the top! IT'S THAT EASY! And you'll laugh all the way to the bank!
why not just do what these guys do and rotate your results so everyone has a chance on the top. Rolist
Fantomaster is a good site that talks about advanced placement techniques like cloaking (providing alternate content for spiders that is different than what appears on normal browsers), spider IP addresses, etc.
..no but something else "comes up" if you catch my meaning..
Well, yes, but it's on the fourth results page.
It's hard to be religious when certain people are never incinerated by bolts of lightning.
The relevant bit on one of the Britney Spears pages seems to be:
Which, yeah, seems to be a roundabout bit of Google bombing.
The question is -- how does this help Shavlik? Presumably there aren't that many people searching for Britney Spears content who say, "Oooh, a way to push Windows patches through a network! I want that!" You'd think the Google algorithm would weight links according to their relevance to the search criteria.
What I'm listening to now on Pandora...
I know that whenever I search for Squeek's mom I'll find what I'm looking for at the top of the list....but that's probably because I spelled "squeek" wrong and so no one else uses it on their pages, but that's beside the point.
are all over web pages and many search engines find them. I did a search for Fat Beaver (long story) and found a website that had "FAT" (dumb quotes) in black text on a black background. Google's handy toolbar has the highlight function; boy, was I surprised.
From the article:
Adam lobbed the first Google Bomb as a joke, aimed squarely at a friend of his: Andy "Talentless Hack" Pressman. Amazingly, a year later, Adam's Google Bomb demonstrates tremendous staying power, as Andy's website is still the number one search result for "Talentless Hack"
But now, since he was the first one to do google bombing, "talentless hack" turns up one of his pages describing how to do it. The google bomb eventually turned on its maker!
Bad bad people
PS, ignore my ecommerce link above...
From the article:
....
How To Promote Your Own Site
Clearly there is some awareness out there as to how to manipulate the search rankings, and following are a few methods that I think are common:
In no way am I promoting any method that encourages false search rank increases.
Is it just me or is there something just slightly contradictory about these statements?
Sailing over the event horizon
is Blowin' in the wind
Of course it still happens. Just ask some opponents of the Church of Scientology.
There may be some confusion because the Google Toolbar, when viewing a page that hasn't been indexed, tries to "guess" what its PageRank would be based on the site PageRank... but that's not "real".
If you want to know more about Google, the place to go is the Webmaster World Google forum.
Danny.
I have written over 900 book reviews
While searching for a new diaper bag (the cheap ones only seem to last through 1 kid), I was amazed at how many Google search hits pointed back to eBags. You wouldn't always know it from the URLs, though. Some of the URLs were things like ebags-discount.com, bagsdirect.com, handbags.com, etc., making you think that there were several big bag retailers out there. Others were just plain insane; I remember one that was something like "best-basketball-bags-for-women-athletes.com".
Effectively, they circumvented Google's "site grouping" wherein all hits from one site get clustered under a smaller group. I got fed up with it and resolved not to buy anything from eBags.
But I thought to myself, "maybe they're Scientologists..."
I've heard accusations that Google can be "fixed" by creating lots of phony sites that link to your site. Scientology sites are famous for that. I'm sceptical -- thousands of links from sites nobody visits have less impact than one link from a site everybody visits.
That's absurd. Next you'll be telling us that we can raise our /. karma by writing posts that people actually enjoy reading! PUTTING CRAP ON THE INTERNET IS A FUNDAMENTAL RIGHT!!!
But do a google search for crack/serial/warez.
;p
.de spoofed pr0n pages. Someone figured it out.
For instance. Webcam32 Crack
Yes, I OWN webcam32. So there.
The point is, the first THREE PAGES are
All I want to know is, can we get free passes if we help you out?
of using google, it will give the most relevant results if you use a combination of two words or more.
For example, when I searched for God in google a year ago, it returned a list of results, in which the first result is PHP-Nuke (it has fallen to 2nd now)...
So instead of finding religious enlightenment, I found a really kickass PHP based web portal which I still use till now.
PS: I think the main reason for this is that the default admin login for PHP-Nuke used to be God. That has been deleted since version 5.0 (I think).
Welley Corporation - SLM Scammers
Well this is less so when one accounts for Google's limitations. The biggest of these, in my experience (as someone who works for a site whose google rank directly affects sales) is the fact that Google apparently rarely indexes URLs that contain 3 or more CGI parameters after the "?" character.
For example, a search on google for "plaid socks" yields only 1 or 2 sites out of 100 that have 3 or more CGI parameters, when I'm sure there are many sites using very complicated urls (with session IDs, etc). Sure, this is just anecdotal evidence, but as someone whose product catalog was listed by urls that had at least 3 CGI parameters (and sometimes 5 or 6 depending on the referring URL) I can say with 90% confidence that having a "complicated" URL severely hurt us. What I ended up doing recently was using mod_rewrite to change all the listed URLs on our site from site.com/product.cgi?sku=something&section=2&style=4 to site.com/product/2/4/something.html, and lo and behold, the next time googlebot came by, those pages were indexed (I had verified that the problem was not that the pages had a low pagerank, but that they were not even being spidered at all).
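For anyone who wants to see the mapping spelled out, here is a rough Python sketch of the same idea (the poster used mod_rewrite, not Python). The parameter names sku, section and style and the /product/section/style/sku.html layout are read off the example URLs in this post; the function names are made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs


def count_query_params(url):
    """Return how many CGI parameters follow the '?' in a URL."""
    return len(parse_qs(urlsplit(url).query))


def to_static_path(url):
    """Map /product.cgi?sku=X&section=S&style=T to /product/S/T/X.html.

    Hypothetical helper: the path layout is an assumption taken from
    the example URLs in the post, not a general rule.
    """
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    # parse_qs returns a list of values per key; use the first of each.
    sku = params["sku"][0]
    section = params["section"][0]
    style = params["style"][0]
    return f"{parts.scheme}://{parts.netloc}/product/{section}/{style}/{sku}.html"
```

The rewritten URL carries the same information but has no query string at all, which is the property the poster says mattered to the spider.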
What does this have to do with Google's relevance? Sure, they are returning relevant results when you search, but if they are arbitrarily not listing a site because its URL structure is too "complex" then there's a ton of possibly relevant content that they're missing. If you're someone who sells plaid socks for $10 less than your nearest competitor but Google isn't indexing your plaid socks page because of URL structure (exactly what was happening to us, except not for plaid socks) then you're really not getting the most relevant results. Which is not to say that what you DO see isn't relevant, it's just that there's possibly MORE relevant stuff that you won't ever see.
Fortunately Google has something in the works to cover this particular situation, but it doesn't really have anything to do with fixing their URL complexity policy.
rooooar
1) Look at the top scoring pages.
2) Look at the source of those pages.
3) Create your own pages patterned on the above.
4) lather, rinse, repeat...it's never hard to figure out what a search engine is looking for. (the hard part is how to not piss it off)
This is not a dream, not a dream...we are transmitting from the year 1-9-9-9.
Remember this story?
http://www.cnn.com/TECH/computing/9911/15/search.engine.ms.idg/
Here are some more URLs that might be of interest:
having your NAME come top of the list when you type it as a search ;-) See that Scott Porter bloke? That's me, that is! I also get second place, dunno who that imposter is at number 3 though... grrr... ;-)
Code, Hardware, stuff like that.
"fondly"?
"yore"?
"prey"?
"folly"?
"impervious"?
My goodness! I guess Google will not classify this page as English!
While the first story is about having sex with a goat, one must ponder why Google would show it when someone searches for goatse.cx...what's the correlation? Lots of people linked to that site with the text goatse.cx?
It really becomes a question of what kind of market searches you want to show up in.
Random Searches? File searches? product searches?
What is your market? If you do not know what searches you want to show up in, then how can you push yourself higher in google?
"It is a greater offense to steal men's labor, than their clothes"
Such an unbelievable display of ignorance on energising the synergies while leveraging the brand-awareness among the prospect client base shouldn't go unpunished.
Suppose I really hate pokemon and think digimon really kicks the crap out of pokemon. I put up a site that is nothing but praise for digimon, but the metatags are filled with pokemon names and the like. Use the popularity of pokemon to sway engines to view digimon. How fast do you think one would get sued over this?
From the article, I understand this is some software which monitors visited sites, and then ranks sites according to this.
For those running the Google bar with the page rank display enabled, every site you visit is being reported to Google. I would not be surprised if they used this information to help rank sites also.
I.O.U One Sig.
I always check out SearchEngineForums.com for the latest advice. Ranked #4 for Audi S4 and #1, 3, 8-sorta, and 10 for my name ;)
http://www.s4biturbo.com/
I use google almost exclusively. But what am I missing? I have a site that lists a 10 second Chevy Malibu street sleeper (not my car), and while that page has finally entered google's search engine, other pages such as the main page that lists other merchandise for sale such as computer monitors for sale, antiques for sale including antique violins, collectibles, telephone jacks and equipment, a Dodge Ramcharger, a web server or high school, college computer, Racing Champions 1/24 scale dragsters, funny cars and pro stock die cast cars for sale, and other items, have not entered the google search engine for months.
If it takes 3-4 months for one of my web pages or sites to be listed, how many up-to-date pages or sites am I missing because google wants payment for listings? Has anyone done any studies or research on pages or sites missed by google for months because they prefer to get paid for listings for quick inclusion or they leave you hanging? How many people won't see my site offering free text listings to others due to google's demand for payment?
Spike the pigeon food.
We had an interesting situation with Google. Since the company changed names a while back, two domain names point to the same site (although with two different IP addresses).
Links on Google would show up under one site name, but not the other. Apparently Google does something on the back-end to determine that the contents are identical and assign the listing to one of the domain names (in this case the older one).
Only after feeding all visits to the old domain with a 301 and then sending them along to the new domain name did Google's results update to only indicate the new one.
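A minimal sketch of that kind of permanent redirect, using Python's standard http.server instead of whatever web server the poster actually ran. The host name new-name.example and the port are placeholders, and the helper names are invented for this example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder for the new domain; not taken from the post.
NEW_HOST = "new-name.example"


def redirect_target(path, new_host=NEW_HOST):
    """Build the Location value: same path and query, new host."""
    return f"http://{new_host}{path}"


class PermanentRedirect(BaseHTTPRequestHandler):
    """Answer every request on the old domain with 301 Moved Permanently."""

    def do_GET(self):
        self.send_response(301)
        self.send_header("Location", redirect_target(self.path))
        self.end_headers()

    # HEAD requests get the same status line and headers.
    do_HEAD = do_GET


def serve(port=8080):
    """Run the redirector until interrupted (blocks)."""
    HTTPServer(("", port), PermanentRedirect).serve_forever()
```

Calling serve() on the machine answering for the old name would send every visitor, spiders included, to the same path on the new name with a status code that says the move is permanent.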
I *do* actually run porn sites, and stumbled upon getting very good rankings.
It all boils down to everything in moderation.
So you have 'normal' amount of meta-keywords, say about 5-9, and the same effect in the title.
Another one that is debated to work is
http://keyword1.keyword2.com/keyword3
Basically, IMO google tries to limit results to 'real' pages.
What do pigeons like?
Put a META tag containing the following words:
grain, rice, corn, worms, wheat - worked like a charm. You get the idea.
Just a shameless plug here for the Open Directory Project. Leaving aside occasional occurrences of editor-fraud or editor-abuse (which are quickly tracked down by the meta-editors), this is the best way to determine a site's real value.
A human looking at the page to subjectively/objectively determine its value is something that can't be replaced by a spider and an AI program.
URL cloaking, hidden text, keyword tricks, etc... don't matter. =)
-jc
Hire a Linux system administrator, systems engineer,
He says:
I do not have access to Google's page ranking technology, and apart from some partial details on their site, they keep their ranking techniques tight lipped to avoid intentional rank manipulating. As such, everything I say in this article is purely speculative based upon analysis of search results for various terms and phrases
No details? They published the algorithm in 1999! If he looked it up, he would have understood PageRank is a page-to-page relationship (not site-to-site), and avoided the idiotic statement "Is it really a democracy that every page on these megalinked aggregate sites become premiere voices of their topic?".
Apparently, this moron didn't even search Google -- the paper is the third result for a search on "PageRank". Why are we taking search engine advice from some imbecile who doesn't use search engines?
The Google team publishes more inside information than any search engine. There's a whole ODP category for Google research papers.
To put that into perspective, there are some 750 pages dealing with mantids that are linked from Google, and that limit is simply because that's the maximum results that Google will return for a particular search term.
That's not even true. Google will return up to 1000 results in a search. Can this guy even count?
There are a lot of better resources about Google on the Web. Why did Slashdot go with this guy?
Proud to be / Smiley-free / Since Nineteen / Ninety-Three
I wonder why http://www.seas.upenn.edu/~zakharin/Software/zd-entry is the first entry in a Google search that points to my site. It is not actually on ZD-NET nor is it linked heavily from anywhere on my site or outside.
There are a number of techniques, some legitimate, some not so legitimate (search engines will ban you if they find out):
- optimizing urls so that they don't have .cgi or ? in them, as they are considered database generated and search engines like to avoid them (legit, I guess)
- optimizing for keywords in the title bar and in the text near the top of the page (legit)
- optimizing keywords by increasing their font size in the body of web pages (legit)
- detecting the user agent and crafting special pages if it's a search engine hitting you (not legit)
- creating lots of "ghost" sites which have similar content, just different html so that you show up many different times on the search engines (not legit)
- tons of others.
However, as has already probably been said quite a few times already, it is very much moving to pay to play.
G'luck.
Tim Shephard
http://www.storepages.net - build your business website today
He GAVE SPECIFIC EXAMPLES of the "page-to-page" relationship being disproven, but here you are, the second tard bait coming to "set the record straight". Do a search on virtually anything, and a good portion of the results will come from aggregate sites: DO SOME TESTS, MORON. Exactly as he clearly stated: Whether the results are because of the site internally linking (inflating each sub-page), in the case of page-to-page linking, or it's site-to-site, THE RESULTS ARE THE SAME: Some random guy's Ford Transmission site becomes #1.
No details? They published the algorithm [stanford.edu] in 1999!
Gee, it's not like something could have changed in 3 years. Regardless, Google will not tell you your PageRank, but instead will obscure it as a bar graph. Why do you think that is? Many people are still unsure what the effects of domain names are in Google rankings, yet clearly they have a profound effect. PageRank is NOT A PUBLISHED ALGORITHM.
That's not even true. Google will return up to 1000 results in a search. Can this guy even count?
I will concur with that, however it depends on the search phrase. I've had several search phrases where Google seems to limit it to 750, and I presume he encountered the same thing.
While Google gives Shavlik extra bonus points for those looking for Britney Spears,
Google doesn't give points for "looking", it gives rank for "linking". Shavlik has a high PageRank because lots of people have linked to it (or Google thinks they've linked to it).
it seems likely that they probably also apply those bonus points for any other search as well : i.e. If Shavlik puts up a page on monkey mating, they'll start off with a very high score due to their Britney Spears earned bonus.
Yes, a page's PageRank affects its ranking in all relevant searches, but the point is that Google shouldn't be showing Shavlik in a search for "Britney Spears" at all, unless it has indexed some text that associates Shavlik with Britney Spears. Google results are a "two-pass" system: Google analyzes text to find which pages are relevant to a search, then uses PageRank to finish ranking them.
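To make the "two-pass" description concrete, here is a toy Python sketch. It only models what this post says (text match first, PageRank second); the whole-word matching rule, the data layout and the scores are invented for illustration and are not Google's actual scoring.

```python
def rank(query, pages):
    """Toy two-pass ranking.

    pages maps url -> (page_text, pagerank_score). Both the structure
    and the scores are hypothetical.
    """
    terms = query.lower().split()

    # Pass 1: keep only pages whose text contains every query term.
    relevant = []
    for url, (text, _score) in pages.items():
        words = set(text.lower().split())
        if all(term in words for term in terms):
            relevant.append(url)

    # Pass 2: order the surviving pages by PageRank, highest first.
    return sorted(relevant, key=lambda url: pages[url][1], reverse=True)
```

Under this model a page with a huge PageRank never shows up for a query unless pass 1 finds some text tying it to the query, which is the point being made about Shavlik.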
In fact, looking at Google's cache of Shavlik's home page, we can see that Google thinks other sites are linking to Shavlik using the words "Britney Spears". That is why Google is associating the site with Britney Spears.
There is some anecdotal evidence that Google's robot (Googlebot) can get confused when IP numbers are reassigned. Googlebot caches IP lookups longer than normal: If an IP number gets reassigned from one domain to another, Google (temporarily?) thinks both domains are the same site, and mixes up their listings. Given that shavlik.com is being confused with a defunct domain, it may have accidentally inherited the IP address of britneyspearsnow.com.
(I have now used the words "Britney Spears" more in this post than I have in any conversation in my life.)
Proud to be / Smiley-free / Since Nineteen / Ninety-Three
You've clearly got some anger issues. You must be trying to follow this technique:
Give yourself some freebies by using the signature line or link to address on discussion boards to point to your own site. Throw your opinion into every discussion regardless of your experience or lack thereof.
All but one of the "whole ODP category" papers are from 1999 or earlier (before Google really hit the bigtime), and hence are thoroughly useless. Secondly, as the other post mentioned, your claim that the PageRank algorithm is transparent is absolutely laughable: Either you're a naive little boy, or you're painfully ignorant.
Take some anger pills and get over it. I read the paper as a pretty lighthearted rambling regarding search engines, not as a self-professed expert on Google (as you are apparently trying to cast yourself).
I was wondering if there was a Kazaa client for Macs, so I did a Google search for "Kazaa Macintosh". The first two hits were for kazaa.metamule.com/kazaa-macintosh and kazaa.metamule.com/kazaa-for-macintosh. Both of these were gibberish pages ("If you are shopping for kazaa macintosh on the internet, then you had better stop here. Our site is considered among the premier kazaa macintosh locations around...") which immediately refreshed to another site. Obviously these are automatically generated subdomains, pages and text designed to gather search engines. Possibly they have little real effect beyond annoyance, as in this case there are NO real pages on the subject which would displace the bullshit pages.
One form of Google manipulation that recently hit the scene is known as Google bombing--to wit, getting a lot of people to link to a particular site with certain key words. It was done a lot with blogging, as the article indicates: by linking to a certain artist's page using the words "talentless hack," they caused that artist's page to come up first when one typed "talentless hack" into the search engine.
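A rough Python sketch of why that works, assuming (as posters in this thread describe) that the engine associates a link's anchor text with the page being linked to. The data layout, function names and example URLs are hypothetical.

```python
from collections import Counter, defaultdict


def anchor_index(links):
    """Count, for each anchor phrase, how many links point at each target.

    links is an iterable of (source_url, target_url, anchor_text).
    """
    index = defaultdict(Counter)
    for _source, target, anchor in links:
        index[anchor.lower()][target] += 1
    return index


def top_target(index, phrase):
    """Return the page most often linked with this phrase, or None."""
    counts = index.get(phrase.lower())
    return counts.most_common(1)[0][0] if counts else None
```

If enough blogs use the same phrase to link to the same page, that page wins for the phrase even if the phrase never appears in the page's own text.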
Editor Emeritus and Senior Writer, TeleRead.org
Slashdot definitely needs a Google icon.
Da Blog
Funny + Insightful, because it's probably true :)
Here's a model of motherboard I own: MS5129. I was searching for a PDF manual for it (not much luck though).
Check the results. Are there _any_ relevant ones?
Pretty much nope.
If you could be told what you can see or read, then it follows that you could be told what to say or think - BoC
There is some anecdotal evidence that Google's robot (Googlebot) can get confused when IP numbers are reassigned.
Oh yes, this was a real problem for me. I run poetrycontestonline.com and I also registered psychicweb.net in an insane fit, thinking I could capitalize on the Ms. Cleo and John Edward syndrome. I let psychicweb.net expire after pointing it to the same IP address as poetrycontestonline.com as a virtual host. For months after psychicweb.net expired, google thought that poetrycontestonline.com was psychicweb.net. A search for poetrycontestonline.com would yield cache links that had psychicweb.net as the domain name. Also, searches on things like "Free Poetry Contest" would yield links to psychicweb.net and not poetrycontestonline.com, which means that after the domain expired, I was effectively removed from google for almost a year.
I hope they got it fixed now, because this behavior was very annoying. Had my site been more of a real business site, I would have been pretty pissed off about it.
I've had enough abrasive sigs. Kittens are cute and fuzzy.
PageRank is not the only criterion for ranking pages.
Funny, but I read an article that was talking about the ranking of pages on Google, not on the PageRank algorithm. You seem a tad fixated, and you seem to be refuting yourself. The guy was talking about "why does Google order results the way it does" (which is a mystery to even self-proclaimed experts as yourself), not "How does this relate to the PageRank algorithm".
That's not empirical evidence; that's two anecdotes and a flawed premise
Empirical: Relying on or derived from observation or experiment.
Sounds like empirical evidence to me: He noticed a preponderance of Geocities sites and became curious as to why they're there.
If you use the Google toolbar to check the mantid search results, you'll actually see lower-PR sites in the results ahead of the page this guy is fixating on. That is empirical evidence that PageRank isn't everything, and that high-ranking results aren't necessarily the highest-PageRank pages, especially for obscurer search terms.
You seem to be arguing with yourself. This whole discussion is about the obscurity of Google's search result ordering, and how people are taking advantage of it (such as in the URL), and here you come coming to save the day and explain how the PageRank algorithm explains it all away. Oh, but wait, no it doesn't...it's only one of the many mysterious machinations at Google...Pick an argument and stick with it.
especially with oddball searches like "ford transmission"
You'd call that an "oddball search"? Jesus you're an apologist. Here's another "oddball" search : siamese kitten: The second link is to a members.aol.com (5 links in the world to it, 1 being the open directory submission) site, and the third is to a geocities.com site. How about Commander Keen: Of the first page (littered with geocities, angelfire, and hometown.aol.com links), #3 is this page. I suppose I should do more normal searches like...Britney Spears?
Hop off your expert soapbox and quit spamming Slashdot with your link : You're just making a fool of yourself.
1) Make a website
2) ???
3) #1 on Google!
Over time, more and more people will figure out how to game the system. Even though it may be a for-profit company, Google is now an important public service that many people rely on. I hope they keep up the trust the public has put in them. Specifically, they shouldn't get sidetracked into all sorts of side projects, and should make sure they care about one thing: hosted search.
If they ever start selling applications to the end user, we'll know they're dead...
Do something Slashdot-worthy, like getting the DMCA overturned, and people will find your site.
My 7-year-old son Andrew has top placement for his name and first and third placement for 'funniest stories', not to mention a Googlewhack for Google horklump
How did he do that? Here's the explanation - far shorter and clearer than that article.
You guys must have already read that 'still happen?' article.
I have certainly seen some people taking the article's advice here on slashdot: "* Give yourself some freebies by using the signature line or link to address on discussion boards to point to your own site. Throw your opinion into every discussion regardless of your experience or lack thereof."
FoundNews.com - get paid to blog.
Google seem to be struggling with the amount of crap they have ended up indexing.
:(
So much so, that google places A LOT of weight on having an Open Directory (http://www.dmoz.org) entry for a site.
Google used to be great, but I'm starting to find even the top results can be pretty much bunk, and have to look a few pages into the results to find something really relevant.
Another problem is their AdWords service.
AdWords NEED to be more expensive so that only real companies can afford them, and not some sad act operating a "25 Trillions Hits for $1" service.
I think.
And I have been for years for keywords like
"database administration", "ERP administration",
"sql scripts".
I write content and stuff it in simple crusty
HTML with a Nielsen-Norman rip-off format.
The secret is to just say something real and
forget the snakey tricks.
OK I give up, what the hell is a Britney Spear? I did a search for britney spear weaponry and got this Why is lemon juice made with artificial flavor, and dishwashing ... I think someone is pulling our collective chain here?
He gave two bogus examples. Two searches on oddball terms where AOL and Geocities pages rank well aren't proof that pages get high ranking from their hosts. I could come up with thousands of instances where AOL and Geocities pages don't appear in the top 10.
Do a search on virtually anything, and a good portion of the results will come from aggregate sites: DO SOME TESTS, MORON.
I've probably done a great deal more tests than you or the guy who wrote that article.
Random Test #1: 24,200 results from AOL for the word "beer", but the top ten results for "beer" include no AOL listings.
Not-As-Random Test #2: 22,100 results from AOL for the word "Ford", but the top 10 results for "Ford" include no AOL listings.
Even-Less-Random Test #3: 3,020 results from AOL for "Britney Spears", but not a single result from AOL in the top 10 results for "Britney Spears" that the original article fixated on.
Are any of you Anonymous Cowards getting the point? AOL has so many members that there's probably an AOL page for almost any subject you can think of. If Google ranked pages high just for being on AOL, AOL would be topping every search. It doesn't. His theory is disproven.
This guy built a theory on a few examples, and didn't even bother to try to disprove his hypothesis. He didn't even spot the clue in his "Britney Spears" search. He's a nitwit. He doesn't know how to collect evidence and he doesn't understand two key facts about Google:
1) Google does text analysis, too, and it does it before taking PageRank into account. That page that tops the "Ford transmission" search really is entirely about Ford transmissions. It probably scores well in text analyses.
2) Not all searches are equal. If you use combinations ("ford transmission") or obscurer terms ("mantid"), it's easier for a small page to place high, because there's less competition overall.
Gee, it's not like something could have changed in 3 years.
I didn't say nothing has changed in three years. I said there's enough in the published papers to show that pages don't get their PageRank from being on AOL.
Many people are still unsure what the effects of domain names are in Google rankings, yet clearly they have a profound effect.
Hmm, let's see: If I wanted to know the effects of domain names in Google, where would I look? How about the very paper cited in the WebmasterWorld discussion you linked to? It says, among other things:
So, again, the Google team said back in 1998 that keywords in the URLs (and thus, keywords in domain names) are more significant than keywords in plain text. It's amazing what you can learn when you do the background reading.
PageRank is NOT A PUBLISHED ALGORITHM [webmasterworld.com].
You know, typing in all-caps doesn't actually make you smarter. There are actually two papers that have the basic algorithm published in them. It's been tweaked since then (probably, mostly by tweaking the damping factor on pages flagged as problematic by other algorithms), but it's still the same basic equation.
(While we're at it, if you actually read Webmasterworld on a regular basis (instead of pulling that link out of Google, like I'm sure you did), you would know that not a single sane, professional webmaster in the world believes that AOL sites get the gigantic boost that the original article author claimed. There's been a gigantic amount of research done on this subject by a considerable number of professional search engine optimizers, and every single one disagrees with this nitwit.)
Proud to be / Smiley-free / Since Nineteen / Ninety-Three
Here's how the original article author described Google's ranking:
The Google ranking technique, in a nutshell, is that every link provided to a site is a vote for the site, with the weighting of the vote being determined by the number of votes that the voting site itself has received
Compare that to Google's definition of PageRank:
They're the same explanation. Whether he called it that or not, the article author was writing entirely about PageRank. His theory is all about links (and the incorrect assumption that being on the same domain is a link). His article is about PageRank, and to pretend otherwise is deceptive.
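For anyone who wants to see the "links are weighted votes" idea in action, here is a minimal sketch of the basic equation from the 1998 paper, PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over the pages T that link to A, where C(T) is the number of links going out of T. The page names and the toy link graph are made up for illustration, and nothing here reflects whatever tweaks Google has made since.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate the basic published PageRank equation.

    links maps each page to the list of pages it links to.
    d is the damping factor (the paper suggests about 0.85).
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    rank = {page: 1.0 for page in pages}
    for _ in range(iterations):
        new_rank = {}
        for page in pages:
            # Each page linking here passes on its rank, split over its out-links.
            incoming = sum(
                rank[src] / len(links[src])
                for src in pages
                if page in links.get(src, ())
            )
            new_rank[page] = (1 - d) + d * incoming
        rank = new_rank
    return rank


# Hypothetical toy graph: "hub" collects three votes, "loner" collects none.
toy_links = {
    "fan1": ["hub"],
    "fan2": ["hub"],
    "fan3": ["hub"],
    "hub": ["endorsed"],
    "loner": ["other"],
}
ranks = pagerank(toy_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this toy graph "endorsed" and "other" each have exactly one inbound link, but "endorsed" comes out higher because its single vote comes from a page that has votes of its own. Note that only actual links count here; sharing a domain with a popular page contributes nothing.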
This whole discussion is about the obscurity of Google's search result ordering, and how people are taking advantage of it (such as in the URL), and here you come to save the day and explain how the PageRank algorithm explains it all away. Oh, but wait, no it doesn't...it's only one of the many mysterious machinations at Google...Pick an argument and stick with it.
I have one and only one argument. You're just not bright enough to understand it. The argument, again, is: The links to a site are not the only criterion determining a page's placement in search results. They're not even the first criterion! (Go read this paper, and you'll see that Google does text ranking, then factors in PageRank.) That guy's entire theory is based on a flawed assumption.
quit spamming Slashdot with your link
I didn't include a single link in the post you're replying to.
Proud to be / Smiley-free / Since Nineteen / Ninety-Three
I believe you missed these steps:
...
Step 5.
Step 6. Profit!
[insert witty comment here]
<body>
<div style="display: none">
<h1>Stuff for google</h1>
Unless I'm mistaken google doesn't handle style sheets very intelligently.
Just use something like this, or maybe an external style sheet to fool it.
All the stuff in the div tag should be indexed by google but not displayed
in a browser.
</div>
Then put your visible content here.
Maybe they've fixed this by now? Would be a chore to handle external style sheets correctly though...
</body>
</html>
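For what it's worth, catching the inline version of this trick doesn't look hard. Below is a rough sketch of how an indexer might separate text inside a `display: none` element from the visible text. It is only an illustration under some loud assumptions: the class name is made up, the markup is assumed to be reasonably well-formed, and it only looks at inline `style` attributes, so the external style sheet variant would sail right past it. Nobody outside Google knows what their crawler actually does.

```python
from html.parser import HTMLParser


class HiddenTextFinder(HTMLParser):
    """Split page text into visible text and text hidden with inline CSS."""

    # Elements that never get a closing tag, so they must not be stacked.
    VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

    def __init__(self):
        super().__init__()
        self.hidden_stack = []   # one flag per open element: is it hidden?
        self.hidden_text = []
        self.visible_text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID_TAGS:
            return
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        parent_hidden = bool(self.hidden_stack) and self.hidden_stack[-1]
        # A child of a hidden element is hidden too.
        self.hidden_stack.append(parent_hidden or "display:none" in style)

    def handle_endtag(self, tag):
        if tag not in self.VOID_TAGS and self.hidden_stack:
            self.hidden_stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.hidden_stack and self.hidden_stack[-1]:
            self.hidden_text.append(text)
        else:
            self.visible_text.append(text)


page = (
    '<body><div style="display: none"><h1>Stuff for google</h1>'
    "keyword keyword</div><p>Visible content</p></body>"
)
finder = HiddenTextFinder()
finder.feed(page)
print("hidden: ", finder.hidden_text)
print("visible:", finder.visible_text)
```

Handling external style sheets would mean fetching and parsing the CSS and matching selectors against the page, which is the chore mentioned above.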
Software patents delenda est.
That's exactly what he said he ended up with as a solution, using the mod_rewrite module.
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
We all know that pigeons are WAY too smart to fall for anything we can throw at them.
-- Wibble
The most interesting part of this article was their idea that you should comment about everything talked about on discussion sites, whether you know something about the subject or not.
"...extraneous filler words in the META tags or hidden at the bottom of the page."
I noticed the other day that self-styled "King of Usability" and Benny Hill lookalike Jakob Nielsen uses this technique on his very own page.
Cheers
PD
Party Time: Excellent
Obscurity. Obscurity. That is one of the problems of google. If I search for my name, it should come up with my homepage. The outcome varies each week.
-It disappeared.
-It ranks no. 1.
-It ranks no. 7.
Why? It's very fuzzy. Should I create a backup homepage? That it is not always no. 1 is understandable, since someone else with my name exists (probably dead, and causing more of a stir that way than a living me!)
What do I want? I want people who cannot remember my email to be able to find my homepage and email. (free provider yabaa yabaa)
I heard some sites used to fill up their backgrounds with words in a font the same colour as the background.
... I shall mention etoy, since I remember them to be at least one of the first to manipulate search engine rankings in an amusing manner. Does anyone else remember the "hijackings" they did? You entered a conventional search term, e.g. "cooking chicken", and the n-th link of the results, if you clicked on it, took you to a big ol' page saying "YOU HAVE BEEN HIJACKED" yadda yadda. Entertaining, albeit politically incorrect, especially nowadays.
... and hey presto, it was. Neat. The engine guys hated them, of course ... as did etoys.com - the altercation etoy.com had with *them* is probably where most people know them from.
The neat thing was that they had their rankings down to a fine art - they could say "we want our page to be no. 2 on yahoo, no. 5 on altavista, and no. 1 on webcrawler"
There's an article on wired, but I haven't been able to come up with anything on their internet hijacks.
Oh well. I'm feeling old in Internet years right now.
yes, we have no bananas
One of the Google ranking algorithms is based on how many people click on each offered link. By slashdotting a demo that discusses the 7th returned link on "mantids," enough people may click on the link to change its position. You can't interact with Google without changing it!
A very efficient way of boosting your google ranking is to be listed on the DMOZ Open Directory Project. Google gives so much importance to listings there that it is a good idea to try your best to be listed there.
Also, since Google works by checking the links to you, it's a good idea to go through your web server logs to find which pages are linking to you. Submit those to as many search engines as you can afford to. Besides Google seeing more links to you, more people would be visiting the page that links to you, and thus more people are likely to click on your link. Hence more hits!
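If you don't fancy reading the logs by eye, something like the following would do the digging. It is a minimal sketch that assumes your server writes the Apache "combined" log format (the one with the referrer and user-agent fields in quotes at the end); the function name, the host names and the sample lines are all made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Matches the tail of a combined-format line:
# "request" status bytes "referrer" "user-agent"
LOG_TAIL = re.compile(
    r'"(?:GET|POST|HEAD) [^"]*" \d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"'
)


def external_referrers(lines, own_host):
    """Count referring pages, skipping empty referrers and our own site."""
    counts = Counter()
    for line in lines:
        match = LOG_TAIL.search(line)
        if not match:
            continue  # not a combined-format line
        referrer = match.group("referrer")
        host = urlparse(referrer).netloc
        if not host or host == own_host:
            continue  # "-" (no referrer) or an internal click
        counts[referrer] += 1
    return counts


sample_log = [
    '192.0.2.1 - - [10/Oct/2002:13:55:36 -0700] "GET / HTTP/1.0" 200 2326 '
    '"http://links.example.org/irc.html" "Mozilla/4.08"',
    '192.0.2.2 - - [10/Oct/2002:13:56:01 -0700] "GET /faq.html HTTP/1.0" 200 512 '
    '"http://links.example.org/irc.html" "Mozilla/4.08"',
    '192.0.2.3 - - [10/Oct/2002:13:57:12 -0700] "GET /faq.html HTTP/1.0" 200 512 '
    '"http://www.example.com/" "Mozilla/4.08"',
    '192.0.2.4 - - [10/Oct/2002:13:58:40 -0700] "GET / HTTP/1.0" 200 2326 '
    '"-" "Mozilla/4.08"',
]

for page, hits in external_referrers(sample_log, "www.example.com").most_common():
    print(hits, page)
```

To run it on a real log, pass an open file instead of the sample list, since a file object yields lines just the same.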
IRC Resource / mirc.net
So, how does a link to google search results affect pageranks...? I wonder if it's possible to get google to return a link to the very same page it's displaying!
But what is www.osdn.com/about.shtml doing at number 4 on page 1!
--
Reverse outsourcing: it's the future
If, indeed, Google is ranking links by IP address then it's a brutally flawed concept to begin with: There are thousands of IPs that each host many (sometimes thousands) of websites via host headers and HTTP 1.1. If Google is confused into thinking that Shavlik is another page because it inherited its IP, then Google can't be considered valid for any search.
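To make the host header point concrete, here is a toy sketch of name-based virtual hosting and of what goes wrong if an index is keyed on IP address alone. Everything in it is hypothetical: the address is from the documentation-only range, the site names are invented, and the "broken" index is a strawman for illustration, not a claim about how Google stores anything.

```python
# One IP address, several sites, told apart only by the HTTP/1.1 Host header.
VIRTUAL_HOSTS = {
    "192.0.2.10": {
        "www.site-a.example": "Site A home page",
        "www.site-b.example": "Site B home page",
    }
}


def serve(ip, host_header):
    """Pick the site to serve from the Host header, as a virtual host does."""
    sites = VIRTUAL_HOSTS[ip]
    return sites.get(host_header, "404 Not Found")


def index_by_ip(crawled):
    """Strawman index keyed on IP only: sites sharing an IP overwrite each other."""
    index = {}
    for ip, host, content in crawled:
        index[ip] = (host, content)
    return index


def index_by_host(crawled):
    """Index keyed on (IP, host name): every virtual host keeps its own entry."""
    index = {}
    for ip, host, content in crawled:
        index[(ip, host)] = content
    return index


crawled = [
    ("192.0.2.10", host, serve("192.0.2.10", host))
    for host in ("www.site-a.example", "www.site-b.example")
]
print(len(index_by_ip(crawled)), "entry when keyed by IP")
print(len(index_by_host(crawled)), "entries when keyed by IP and host")
```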
I read an article in Phrack about how people could start setting up webpages with links that are exploits. So say you made a web page with a link to www.blah.com/blah.asp?HHHHHHHHHHHHHH.... something like the Code Red exploit. Then when a web indexer etc. comes around it will not only perform the exploit for you, it may end up indexing this exploit for others to find in search results. Although I don't think that google will archive something with a ?HHHHH.. on the end, many bots will probably follow any link they come across. That would be a search engine manipulation if you ask me, although quite different than say googlebombing.
Google uses a multi-part ranking schema:
Google does part of their ranking based on the number of links to a site from other sites, BUT they also weight the links based on the overall ranking of the site the links are from: a link to a malaria site from the CDC or WHO carries more weight than 10 links from pages like "My Malaria Facts".
One of their absolute killer ranking techniques, one that is easy for anyone to exploit, evaluates the site content based on HTML structure. If you write properly structured HTML, and if your headings include words being searched for, you score higher than a site with the same appearance but elaborate use of FONT tags and fancy layout tricks. (Yes, I monitor this, and no, I won't tell you where the monitor pages are.)
Good concise pages, full of tightly focused content, with plain HTML links, still work best. (CGI scripting and fancy authentication schemes lower your rankings.)
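Here is a toy illustration of the structure point: count a search term's hits, but weight a hit by the most important tag it sits inside. The weights and the class name are invented purely for the example, since the real weighting isn't public; the only thing the sketch is meant to show is why a real heading tag can beat a big FONT tag that looks the same in a browser.

```python
import re
from html.parser import HTMLParser

# Made-up weights: structural tags count for more than anonymous text.
TAG_WEIGHTS = {"title": 5, "h1": 4, "h2": 3, "h3": 2}


class StructureScorer(HTMLParser):
    """Score a page for one search term, favouring hits inside headings."""

    def __init__(self, term):
        super().__init__()
        self.term = term.lower()
        self.open_tags = []
        self.score = 0

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Drop everything opened since the matching start tag.
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        hits = re.findall(r"[a-z0-9]+", data.lower()).count(self.term)
        weight = max((TAG_WEIGHTS.get(t, 1) for t in self.open_tags), default=1)
        self.score += hits * weight


def score(html, term):
    scorer = StructureScorer(term)
    scorer.feed(html)
    return scorer.score


structured = "<h1>Malaria facts</h1><p>About malaria.</p>"
font_soup = '<font size="6">Malaria facts</font><p>About malaria.</p>'
print("structured:", score(structured, "malaria"))
print("font soup: ", score(font_soup, "malaria"))
```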
Britney Spears Videos and MP3's
Britney Spears Nude
Britney Spears Naked Breasts
Britney Spears Porn
You can help in this experiment by adding these links to Angelfire, Geocities and AOL pages.
If, indeed, Google is ranking links by IP address then it's a brutally flawed concept to begin with
I didn't say Google is "ranking links by IP", I said it's confusing 2 sites that have occupied the same IP address. In fact, Google is doing all its normal ranking procedures (text analysis and link analysis), but screwing up the very last step by associating the rank with the wrong URL. Yes, this is a big error, but it's easy to spot: If you click on a Google link, and the site you find is about what Google says it's about, this problem hasn't affected your search.
I actually doubt the problem is entirely Google's fault. Very few people have reported this problem, and whenever I've tried to help them, they turned out to be "strictly end-user" types who couldn't tell me anything useful about their server configuration. Therefore, I haven't been able to exclude the possibility that this Google "error" is prompted by misconfigured web servers.
You would be shocked at some of the silly misconfigurations enacted by commercial web hosting companies. For example:
Apparently a bunch of hosting companies have decided "404 Not Found" errors are obsolete, and started returning "403 Forbidden" responses when browsers/robots requested nonexistent files from their customers' sites. Unfortunately (for their customers) Googlebot interpreted those responses differently when it came to robots.txt. "404" meant "no restrictions, come on in", while "403" meant "stay the hell out". So a bunch of customers who didn't know anything about robots.txt (and shouldn't need to) suddenly got their sites kicked from Google, because their hosting company confused Googlebot.
(Google has, in fact, recently changed their policy on 403 errors because of mistakes like this.)
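In code, the old behaviour described above boils down to something like this. It is only a sketch of the convention as I've described it, with a made-up function name and made-up return labels, and it does not model whatever Google does with 403s since the policy change.

```python
def robots_policy(status_code):
    """Decide how a crawler treats a site from the status of GET /robots.txt."""
    if status_code == 200:
        return "parse_rules"    # a robots.txt exists: obey what it says
    if status_code == 404:
        return "allow_all"      # no robots.txt: no restrictions
    if status_code == 403:
        return "disallow_all"   # access forbidden: stay out of the whole site
    return "undecided"          # anything else is outside this sketch


# A host that answers 403 for every missing file locks robots out of
# customers who never wrote a robots.txt in the first place.
for status in (200, 404, 403):
    print(status, "->", robots_policy(status))
```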
Proud to be / Smiley-free / Since Nineteen / Ninety-Three
Juiz de Fora IRC Fotos
So I created a web page for my dad's small helicopter service business: www.dynamicaviationhelicopters.com
That was about two weeks ago. Google still doesn't know that it exists. How long does it usually take for Google to find a web page? What if the web page isn't linked to by anybody else, will it ever be found?