Mr Anti-Google
MrNovember writes "Salon is running a story on some guy named Daniel Brandt who they call "Mr. Anti-Google." Mr. Brandt runs a sort of anti-establishment database of citations called NameBase as well as Google Watch. He claims that Google's PageRank system is undemocratic primarily because it doesn't rank his NameBase information very highly. He also points out that Google maintains a log of all you've ever searched for associated with a long-term cookie. Google's system seems to work the best if you ask me but, on the other hand, link popularity may not provide the most intelligent top rankings."
I don't accept cookies from Google.
I have discovered a truly marvelous sig, unfortunately the sig limit is too small to contain i
Isn't the system that Google uses better than the pay system Yahoo uses? Yahoo searches have been coming up with some really whacked results that are totally wrong (i.e., whoever paid more...). Just my $.02.
"an eye for an eye only makes the whole world blind"
Here's the actual Salon story: http://www.salon.com/tech/feature/2002/08/29/google_watch/print.html
Don't know why that was posted without a LINK TO THE FREAKIN' article, but...
http://www.salon.com/tech/feature/2002/08/29/google_watch/index.html
Think outside the... Hey, where'd the friggin' box go?
IF people decide they don't like Google they'll search somewhere else. We vote by our searches...
I'm lazy, but a direct link to the story would have been nice.
Thanks, I already know how to get to Google and Salon. What I don't know is how to find the Salon article, especially after it scrolls off of Salon's front page.
Here is the link to the story:
Slon Article
And I think the Us Monetary system is unfair because I dont have enough of it!!!
No I didnt spell check this post...
He also points out that Google maintains a log of all you've ever searched for associated with a long-term cookie.
Good thing I search for p0rn with cookies, Java and JavaScript turned off! I also wipe my disk cache between sessions.
"I'm The Bounty Bear. I will find him anywhere. I'm searching."
The only link that actually does anything in this story is the google.com url... there isn't even a link to the story he was referencing.
http://www.salon.com/tech/feature/2002/08/29/google_watch/index.html?x
- what is the definition of simultanagnosia?! I've been meaning to look it up!
Oh, the irony.
Fine with me if he wants to complain, Google still remains my number one search engine, due to its highly relevant results. You can whine all you want, but that doesn't change reality. ~geogeek
a few links will fix namebase's spot on the list
So he feels let down that no one is interested in linking to his site? Is that what this is all about?
It isn't like Google is going out of their way to bump him down the list.
I have been pwned because my
It's not just link popularity.. where those links come from is also very important.
If a popular site links to yours, that has more weight than some one-off site that links to yours.. google takes this into account.
The guy can argue all he wants.. google does not purport to have the best stuff at the top all the time.. but if this guy's site was so good, then more people would link to it, and if more people linked to it, it would be more popular.
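To make the point about weighted links concrete, here is a minimal sketch of the power-iteration idea from the published PageRank paper. It is not Google's production ranking, and the link graph, the site names, and the 0.85 damping factor are all assumptions picked for illustration. Both "popular" and "obscure" have exactly one inbound link; the only difference is who is linking to them.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy power-iteration PageRank over a dict of page -> list of outbound links."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random surfer" jump).
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over everyone.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical link graph: "hub" is linked to by several sites, "one_off" by nobody.
links = {
    "fan1": ["hub"],
    "fan2": ["hub"],
    "fan3": ["hub"],
    "hub": ["popular"],
    "popular": ["hub"],
    "one_off": ["obscure"],
    "obscure": [],
}

ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{page:8s} {score:.3f}")
```

In this toy graph the single link from the well-linked "hub" is worth more than the single link from "one_off", which is the effect described above.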
...he'd increase his page ranking on google if he removed the little tin foil hats from his servers.
The older your site is (and the better it is), the more likely that it will be linked to, and linked to well. If your site is new, small, or bad, very few people will link to you.
Compared to the other search engines, Google is great, and that's what matters. Is it possible that someone could make a better search engine? Maybe. Please, try. Competition is good for everyone.
Go To http://www.namebase.org/ -> "The document contains no data"
It's not the best way to attain the highest positions on Google (nor on other engines, for that matter) if you ask me...
Ciao,
Foggy
yeah I hate google because:
1) When I type in my name IT DOESNT SHOW IT!
2) My websites are not listed #1 NO MATTER WHAT YOU TYPE IN!
3) Their image search doesn't have PHOTOS OF ME!
4) I hate all other search engines for THE SAME REASONS!
wa wa wa......
Ave Molech Setting
Yeah, I only seem to get this when I'm using mozilla, maybe they're trying something out?
His site isn't loading for me. Guess I'll have to go Google's cache to - oh, wait a minute... it's not in there! How rare!
Come to the University of Mars! Classes starting soon!
I just tried visiting google-watch.org, but it seems to be down ("document contains no data"). So I google for it.
Caching has been disabled for the site.
HOWTO get better dates on slashdot
I use nothing but... :)
Get your own free personal location tracker
As for the point made that this guy thinks that Google is "undemocratic," give me a break! Google is not a government - it is a search site! They exist to make a profit. They will make money by providing a quality search result, thereby attracting users. They are not in the business of being the arbiter of democratic principles on the web.
Laws affecting technology will always be bad until enough techies become lawyers.
Google is a very good search engine. And I don't know what the hell this Mr. Anti-Google is talking about with "undemocratic": everyone knows Google is powered by pigeon clusters, millions of pigeons voting on the relevant sites.
I've thought for a while that, although Google is undoubtedly a fine search engine, it does make it difficult to get on it in the first place.
Since you need to have links to your site from other sites to get rated highly in Google, it is almost impossible to get them, as people who may be interested in linking to your site won't find it on Google.
Vicious circle, anyone?
Goblin
It's all fun and games until a 200' robot dinosaur shows up and trashes Neo-Tokyo... Again
although strangely enough, apparently so will links from subdomain sites like geocities, etc.
so now he merely has to complain about his monthly bandwidth allotment getting used up, and his server crashing due to /.
He can't win
"It is a greater offense to steal men's labor, than their clothes"
It's funny how Salon would focus on a total non-issue like this (for Christ's sake, just turn off cookies) but completely ignore things like Yahoo resetting everyone's mail options to opt-in. I guess there wasn't some crank they could quote for that article.
Fear not, they'll soon be gone...
I am not a number! I am a man! And don't you
-brian
Hmm... I use Mozilla's Cookie Manager to completely protect myself from cookies. I let one or two through... the cookie from my company's website, slashdot's login cookie, etc...
In Mozilla -> Tools -> Cookie Manager -> Block Cookies from this site...
The next Slashdot story will be ready soon, but subscribers can beat the rush and slashdot the links early!
To his agenda perhaps.
However Google isn't used by most folks as a directory - it's a search engine. It simply pulls up entries according to a formula (see pigeonrank for the inside scoop) and gives those back. No bias beyond what smart webmasters can impart, no artificial clustering, etc.
If Google were to start doing as Brandt wants it would quickly run into endless battles, lose its searching edge, become just another pay(or agenda)-for-play roadkill.
No thanks.
I don't read ACs: If a post isn't worth so much as a nom de plume to its author then I won't bother either.
How popular can http://www.namebase.org/ be if it goes down before 30 comments have been posted?
For crying out loud, my PERSONAL web site can handle more traffic than that.
What's he hosting it on, a dialup?
"Live Free or Die." Don't like it? Then keep out of the USA
I don't mind someone having their point of view, in fact I applaud Mr. Brandt for furthering what he believes. However, search engine popularity is so flighty, if I think another engine is better than Google, I'll use it. Honestly, I have no ties to any search engine and feel I never will. However, Google has been able to stay at the top of the list (at least my list) for quite some time and has also managed to put the least amount of advertising (or harassment) in my face. I used to use Yahoo, until the pop-ups and ads overwhelmed me. I think much of Google's success came from the fact they never went public. This and the text based ads are incredible decisions when every other search engine was greedily grabbing web based advertising revenue. I like Google, I'll continue to use it, but I'm not going to fight for it either. Just my two cents.
Bill, can you factor this prime number for me?
I don't mind the Google cookies, I'm thinking that they use them to help focus your results over many different queries. For example, I tend to query a lot of linux related topics. Not sure, but I think that some of my searches on a "fresh" browser (i.e. no Google cookies yet) come up with slightly differently ranked results - usually worse than the browser with Google cookies.
I have read in Slashdot comments on another story that Google was once backed by the CIA, like a sponsorship. Yeah, I have read it.
I am really wondering if it's true or not...
Hrm, both namebase and googlewatch seem to be down. Is this just an innocent slashdotting?
Or have the Google gods turned their clusters towards more sinister deeds, silencing their critics?
We may never know.
autopr0n is like, down and stuff.
link popularity may not provide the most intelligent top rankings.
Opinion popularity seems to work well within /., so why can't link popularity work well for a search engine?
First Red Hat, now Google. I guess when you're on top you need to prepare for unsubstantiated criticism.
PageRank works. If your page is linked to by a large number of well trafficked sites, then you get ranked higher. If you're some crackpot whose site no one cares about, you don't get a high rank...
In other words, Brandt recognizes that there has to be some order to Google's results, and that some sites might deserve to come up before others. He just disagrees with the way Google does it. In Brandt's ideal world, if you searched for "United Airlines," you would see untied.com -- a site critical of United -- before you see United's page. And if you searched for Rumsfeld, you'd see NameBase's dossier on him before the Defense Department's site on the "The Honorable Donald Rumsfeld."
Don't blame Google for equating accuracy and usefulness with popularity. It's either that or resort to subjective measures.
Upon reading the article, you find that Mr. Brandt's main complaint about Google is that he believes that when you type in, say, "Richard M. Nixon" into Google, the material he has compiled on Nixon should be ranked #1.
Okay, so I did a search on Nixon on Brandt's site. Here are the first couple of results:
(1) How the Vatican conspired to hide Nazi war criminals.
(2) How various activists were persecuted by the CIA and FBI.
Nowhere did I even SEE Nixon's name in these abstracts. The only relevance is that Nixon was alive at the time, or maybe president when some of them took place, but hardly the man personally responsible for all of them.
When I type "Nixon" into Google, I expect to see biographical material, both good and bad, not totally unrelated rantings. Google is doing its job, in my opinion. It is giving low rankings to Brandt's irrelevant materials. His complaints are pure self-centered sour grapes.
This guy's just whining because Google doesn't rank pages according to his crackheaded counterculture views? And this is news?
Google must be doing pretty well if this is the worst criticism they can find about them.
"Google ranks my muckraking site rather low with regard to searches on individuals, so the algorithms they are using must be EVIL! EVIL!"
Test your net with Netalyzr
I've never seen where Google has put a cookie that does more than save my search settings. In fact I've never seen where it saves all my search terms. What is doing the search term saving is Internet Explorer doing the auto form filling bit.
This chap seems to be little more than someone who is holding a grudge against google because his website isn't as high on the list as he wants.
Well Tough @#$%, life sucks doesn't it.
What this guy needs to learn is that what helps your score on Google isn't just the content, but how many people link to your site for that information. Thus having a page on Rumsfeld isn't as helpful as having a page on Rumsfeld that 50 sites link to.
If this guy wants a higher ranking then he has to build relationships with other websites to get his rankings up. It's not that hard, as most webmasters know this and link sharing helps them as much as it would him.
He's just a whiny person who happened to catch the attention of some person who needed to fill out today's news space on Salon.
Ignore him and hopefully he'll go away.
Phoenix
-- Wiccan Army, 13th Airborne Division "We will not fly silently into the night"
The last time I checked, Google wasn't a democracy. If it was, I wouldn't have voted for that name. Since it isn't - oh, well.
There's one in every crowd.
To celebrate the occasion of my 1000th post, I will post no more forever on Slashdot. Goodbye.
Last year we were able to use Namebase to identify a rogue investor as having trained at the knee of Robert Vesco. Remember Vesco? The most successful international swindler of all time, and friend of the Whitehouse plumbers? Same guy. Ordinary due diligence did not turn up this information. Brandt may be offkey on Google, but he gets my vote of thanks.
Maybe I'm missing something here, but how is this a violation of your privacy? I mean, the whole thing is that you are using their service for free and willfully sending them the data that you choose. Everyone gets to choose what they search for in a search engine. This isn't private information in any real way. Google is providing you the free service of looking up words that you have intentionally provided. You don't like them being associated with a cookie? Refuse the damn cookie! Really paranoid? Go wander the web on your own without a search engine!
At what point were you guaranteed the free and anonymous use of a search engine? You're not being forced to use it. The world doesn't discriminate against people who choose not to efficiently search the web.
People like this are blurring the privacy issue and focusing attention away from legitimate privacy issues.
Nonperiodic Central Trajectory
"I am some poor guy who runs a second-grade website and since I can't get google to list me high, I will elicit some news media to get my site slashdotted"
hope you like your servers toasty, bud.
My life in the land of the rising sun.
If letting Google rank the pages is undemocratic, what about a system in which, when you go to a page from a Google search, Google adds a frame at the top of your page that lets you vote on how useful this page was on a scale of 1-10?
Then, the most popular hits for a given set of search words would have their Google ranking rise. Now that's democracy.
#define sig "Every social system runs on the people's belief in it."
The site was brought down by the C [carrier lost]
[ok]
Fight Spammers!
Just watching the comments here, Google seems "untouchable" because they run on Linux Beowulf etc.
Grow up a bit, would ya? We are speaking about a www site storing your search terms in a cookie! Oh, turn off the cookies, yea right, average www people knows how to do it.
Google maintains a log of all you've ever searched for associated with a long-term cookie.
Google has not kept this secret. And personally, I think it's a good idea. Rather than sort search results by relevance to the keywords queried, they can use this information to sort the results by relevance to the types of sites I clicked through on previous searches. For example, say I searched for "cookies": they would rate web development sites higher for me and baking sites higher for someone who frequents recipe sites.
I don't know if that's really what they do with the info but I think it would be a good idea. (how about we leave the personal privacy debate for another thread though...)
Ascalante: Your bride is over 3,000 years old.
Kull: She told me she was 19!
What this guy wants, by abolishing PageRank, is a return to the free-for-all of early search engines, where the loudest voice rules. If one page has more keywords, it's ranked higher -- whether or not those keywords appear in the context of relevant content.
Here's his real problem: he thinks that linking to "Donald Rumsfeld" should bring his site's page to the top, despite the fact that he has no actual content -- just a list of links to other pages with content.
He calls this a failing of PageRank. I call it whining. If he wants more links from Google, he should get the word out about his site (preferably without manipulating Salon.com into doing it for him) and add some actual information about the people he's archiving by hand, instead of just building a big hotlist about them.
Basically, he wants to be the tyrant he imagines Google to be. Well, let him want all he likes. Google's popular because it's good and it's relevant; the fact that a tiny tiny minority think it's not isn't a good reason to overthrow the whole system.
People who live in glass houses shouldn't throw stones. He should start by making changes on his own site, not insisting Google make changes on theirs.
Google responds by stating that now all of their pigeons will go through an "introduction to democracy" short course, and all "bird seed" websites are now ranked by humans instead of the patented "pigeon rank" system.
My life in the land of the rising sun.
You mean this T-shirt?
People can theorize about PageRank all they want and come up with 100 theories of why it's not correct and won't give you good results.. but guess what, that's all in theory, and in reality, Google gives amazing results. PageRank will probably fall by the wayside in the years to come as more sophisticated algorithms come along, but for now, it is ludicrous to suggest that it doesn't work, when you just have to search for anything on Google to see its usefulness.
Also, this guy claims that Google keeps a record of what everyone searches for.. what proof does he have of this? That Google sends a cookie? That cookie is more likely than anything just used for tracking how often most people use the site, so they can create aggregate numbers of unique users, etc. Sure they could be tracking every search term, but why would they, think how much storage space that'd waste for no return. If the FBI ever wants to find out what this guy searches for, they'll just contact his ISP and have him monitored that way.
sig:
See the "..for smart people" banners Wired runs here? Look elsewhere guys.
If his main gripe about Google is that he gets ranked lower on the list than he wants... pay them for a sponsored link. It will get right up top in pretty colors :)
In the meantime, anyone who would like to cover their tracks can use my cookie:
.google.com TRUE / FALSE 2147368045 PREF ID=111439b95052c72a:TM=1030056425:LM=1030056425:S= v7T9QSFKEkI
;)
Of course, if it turns out that Google is planning to give a prize to the most active user, or they have some kind of search engine green stamps, you're screwed.
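For anyone wondering what that cookie line actually says, here is a rough sketch that reads it as a Netscape-style cookies.txt entry (domain, subdomain flag, path, secure flag, expiry, name, value). Treat the interpretation as guesswork: the space inside the value is assumed to be a posting artifact, and reading the value as colon-separated KEY=VALUE pairs, with TM looking like a Unix timestamp, is an assumption rather than anything Google documents.

```python
from datetime import datetime, timezone

# The cookie line as posted above (whitespace-separated rather than tab-separated).
COOKIE_LINE = (
    ".google.com TRUE / FALSE 2147368045 PREF "
    "ID=111439b95052c72a:TM=1030056425:LM=1030056425:S= v7T9QSFKEkI"
)


def parse_cookie_line(line):
    """Split one Netscape-style cookies.txt line into its seven fields."""
    domain, subdomains, path, secure, expires, name, value = line.split(None, 6)
    # Assumption: the space inside the value was inserted when the comment was posted.
    value = value.replace(" ", "")
    # Assumption: the value is a colon-separated list of KEY=VALUE pairs.
    fields = dict(part.split("=", 1) for part in value.split(":"))
    return {
        "domain": domain,
        "include_subdomains": subdomains == "TRUE",
        "path": path,
        "secure_only": secure == "TRUE",
        # The expiry field is seconds since the Unix epoch.
        "expires": datetime.fromtimestamp(int(expires), tz=timezone.utc),
        "name": name,
        "fields": fields,
    }


cookie = parse_cookie_line(COOKIE_LINE)
print(cookie["name"], "expires in", cookie["expires"].year)
print("ID field:", cookie["fields"]["ID"])
```

The expiry lands in 2038, which is what makes it a "long-term" cookie; the ID field is the part that would let searches be tied together across sessions.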
Proud member of the Weirdo-American community.
So it seems that this guy's real problem isn't with how Google ranks his site, but rather that Google isn't pushing his product to every searcher who hits their site. So he talks about the "undemocracy" of Google, but when it comes down to it, his main issue is that Google isn't helping his business, or rather, that Google's ranking algorithm isn't compatible with his business plan.
Too often, when people say something is undemocratic, it's just because they aren't getting their own way.
First, a link to the article: http://www.salon.com/tech/feature/2002/08/29/google_watch/index.html
(might be a space inserted in the URL by the browser submission, apologies)
Second, a quote from the article:
"Brandt sees this as Google's major flaw. "I'm not saying there aren't some sites that are more important than others, but in Google the sites that do well are the spammy sites, sites which have Google psyched out, and a lot of big sites, corporate headquarters' sites -- they show up before sites that criticize those companies."
In other words, Brandt recognizes that there has to be some order to Google's results, and that some sites might deserve to come up before others. He just disagrees with the way Google does it. In Brandt's ideal world, if you searched for "United Airlines," you would see untied.com -- a site critical of United -- before you see United's page. And if you searched for Rumsfeld, you'd see NameBase's dossier on him before the Defense Department's site on the "The Honorable Donald Rumsfeld."
I must disagree with the ideal expressed here as Mr. Brandt's. If I was searching for material on the Web about Donald Rumsfeld, I would rarely search for information critical of him *first*. If I was ego surfing on myself, I'd want to see my own material about me returned by Google, ahead of negative reviews and sites. I don't think that's an unfair way for Google to operate. While some of the issues Mr. Brandt raises might be valid, I do not feel that Google is required to promote or support Mr. Brandt's agenda over the agenda of the people and organizations Mr. Brandt chooses to focus on.
Google works best in my opinion... in almost every search the first result is what you're looking for... how many times does that happen on other search engines... who cares how they do it... if it works, use it... if this guy chooses to use another search engine and go through page after page of results before finding what he wants, I say go for it... but don't complain...
Web Design Tips
Is it just me, or does this guy sound like yet another internet kook? Get "untied.com" ranked first when searching for "united airlines"? That makes no sense.
Google is a system -- a system that works a certain way. His complaints about PageRank are like complaining about an automobile for the way its wheels go 'round and 'round.
I'm surprised salon dedicated any article space to this.
--
bachiatari na torisetsu o yome!
This is especially popular with protesters, using the PageRank algorithm to rank higher than the company / organisation that they are protesting about. It's called Google bombing. If he hates Google so much then Google bomb his site.
Really? Wow! That is just so tragic. That is just so sad.
But wait a minute, didn't he die last week...and the week before...and the week before that...and the week before that...etc, etc.
This is probably gonna cost me some karma but screw it. I have to ask the question.
Why do you exist? Have you nothing better to do than to post the same bogus piece of news over and over again? Hardly an article on /. goes by without a Stephen King is dead post, or the story by that loser that can't get laid and probably (thankfully) never will because he's a whiny little putz...or whatever.
Goddess preserve me from these random pieces of wasted, self-replicating genetic material
-- Wiccan Army, 13th Airborne Division "We will not fly silently into the night"
I plead ignorance. To me, WAP stands for 1) Wireless Application Protocol or 2) Wide Area Protocol, or somesuch. Everything2 yields 1), links to W3C, bafflegab and some lame jokes. Googling it gives the expected man-month of work to grok the results. So can someone save me (us) some effort and please elaborate on what a WAP Page is?
I'm sorry, but has anyone looked at this site? The main page looks horrible, as the red block on the bottom for some stupid reason is made of something like 15 images. I went to their about us page, their nutshell page and their services page and I still didn't get what the hell they do. Not until I went to the success stories section did I find out that these guys take a look at the product you're trying to sell, see who's in the same market, and come up with a catchy name for it. This is found in their naming/branding section; however, if you don't know anything else about the company, that section doesn't make any sense. Maybe they should stop and think that the reason why their site isn't linked to is that it's poorly constructed and the content is difficult to understand. I would like to know which of their clients they got through their webpage and which ones were acquired through word of mouth.
Brandt is not a disinterested party; the dispute between Daniel Brandt and Google is personal. He has spent thousands of hours building a Web site that he believes is both useful....
Thousands of hours building his website? Let's see: every thousand hours spent on his website, at 24 hours a day, makes about 42 days.
If he worked on that website for even 12 hours a day and it took him 1000 hours, that would be 83 days, without even counting that he probably has a job too. This guy is a joke!
*MWWWWWWAHHHHHHHHHH*
:oX
There ya go, sweetie.
He has over 100,000 names on his web site. His complaint boils down to the fact that if you search for one of them (such as Donald Rumsfeld in the example), they don't show up in Google's search results. The reason they don't show up is because Google ranks search results by the number of sites that link to them, and of course most of his 100,000 pages aren't going to be linked to very often.
So basically this guy wants free indexing and advertising of his content from Google.
So he's crying cause his page isn't ranked high enough? Who friggin cares? Get a life man! Google pwns j00.
Seriously, this is a silly thing to fret over. Does he think this would improve his site somehow?
-Valiss
Um? If you don't like Google's ranking system, don't use it. Nobody's forcing you to. What's the issue?
slashdot!=valid HTML
I do not use Google to "browse" for information or find the one site that has it all and looks pretty. Normally I know exactly what I am looking for, or at least something very specific. I just need to find it. I start near the top of the Google results and work my way down until I get my answer or enough information to solve my quest. If I can't find it, I try different keywords. A recent search I had was for an example of a fetchmailrc using preauth. Sure, there may be a few top notch fetchmail sites (and thousands of copies of the man page) out there but I'd be wasting my time viewing them if they did not have the specific example I am looking for. If I want general information on a subject, I make my searches simpler or use the Google categories. If this guy's page truly is as good as he believes, his creation will eventually make its way up the ladder.
Bad boys rape our young girls but Violet gives willingly.
Why should I listen
to a kook whose site cannot
take some slashdotting?
Google is simply the best search engine on the Internet... Often, when something has loads of success like Google does (I guess), some stupid people try to find or simply invent bad things about it. I think that's what's going on with this stupid story...
As thin as the anti-google positions are, you'd think the article would have suggested more of the possibility that it was a media pro -- a hired gun -- brought in to spread FUD about Google.
This idea was invented by Shampoo.
For example, if you were to search for "ebook"--you wouldn't find Project Gutenberg on the first, second, third, or fifth pages of results (sad, since they've been at it, what? 32 years now).
There are a lot of companies playing the linking-to-self game, mostly in a bid to force PG off and try to convince the uninitiated would-be ebook consumer that they need to pay inflated prices for etext versions of titles in the public domain.
I believe, in google's defense, they did somehow deny the worst offender--but if you were to search for a title or author that PG has, you'll almost never get it on the first page of google's rankings.
Of course, if I think PG might have the book, I'll search for it by, you know, "War+of+the+Worlds+Gutenberg," but for the newbie seeking information...
I know PG is a special case, and the distributed nature of that group effort doesn't really help, but, they ought to be just about first for ebook, library, etc.... since they have been... all along.
My $.02
Interconnected Enemies
Yet another example of the Slashdot community's mindless following of the latest/greatest Open Source project, in this case Google! In my experience with doing a LOT of searching, I find that nearly half of Google's links returned as supposedly "most popular" get you a 404-Not Found when you try to go to the page. And this is what they rank according to popularity? The most popular dead links? ANY search engine service that deliberately ignores META information deserves to be boycotted. ANY search engine that refuses to accept "suggested" links also deserves to be boycotted.
... because my page is unpopular.
Lars T.
To the guy who modded me down from perfect to terrible Karma - Apple haters still suck
I have one bit of advice for this guy; "Make something better and we'll use yours."
I have one factoid for this guy; "Cached copy."
He can start there.
My
Limekiller
You know, some people will whine about anything, especially if things don't go their way. They are typically referred to as having 'sore loser' syndrome.
Care for some cheese with that whine?
#jlk
That if you don't like Google... then you shouldn't use Google. Duh. Why the holy crusade? If you think Altavista or hell, Netscape Search meets your needs, then use it. Why do people find it necessary to attack everything instead of being constructive?
I think, to be quite blunt, that this is a crock of shit.
One of the most important things in a civil society is the checks and balances criticism offers on any service, any government, any individual, indeed, any endeavor undertaken. These checks and balances, and the importance of public criticism, become of vastly greater importance when the perceptions and lives of many people are impacted.
This is true whether one is criticizing GNU, Linux, Richard Stallman, our corporate masters in the form of George Bush, Enron, WorldCom, Microsoft, Apple, Sun Microsystems, Red Hat, or whomever else happens to be in the hotseat at any given time.
If Google really were stacking their search results, criticism and a 'holy crusade' as you so snidely put it, would be a very important counterbalance in offsetting the corruption and distortion inherent in such a thing, particularly given how trusted Google is.
I disagree with the guy's criticism, for what it is worth, and am an ardent user of Google. But I agree wholeheartedly with the need for such criticism to keep the likes of Google honest, and to call them on the carpet when they do something shady or wrong (like they did when they caved to the Cult of Scientology's pressure to censor the search results revealing critics of that particular organization).
This "if you don't have something nice to say, don't say anything at all" is a fine creed for slaves or submissive corporate drones, but it has no place at all in the marketplace of intellectual thought or debate.
Now, on the other hand, if you'd like to argue for civil discourse instead of flame fests and random insults, I will be the first to add my voice to yours, but lest we forget, civil discourse can and must include criticism, sometimes vehement criticism. Indeed, such can often be the most important civil discourse being conducted.
The Future of Human Evolution: Autonomy
Thank you for verifying my theory... heh
Search: Netherlands history
Result: Teen Catholic barely-legal sluts from Holland!
Search: Mali Timbuktu empire
Result: Malian Cum-Slurping Sluts! Timbuktu Kama Sutra Style Mature Singles Waiting For You!
Thank goodness Google is here, even if it's not 100% perfect.
There are a huge number of yeast infections in this county. Probably because we're downriver from the bread factory.
Whenever you try to include a feedback mechanism into an affinity model (or AI) you run into the problem where the returned ranking itself influences the choice.
Basically, higher ranked items (items that appear first in the list) have a tendency to be picked simply because they are first. In return, the picked item will be ranked higher the next time around not because it was more relevant but because it was closer to the top the first time it was listed.
To get around this feedback caused by the system itself, I've seen systems introduce a small amount of randomness to the results. In statistical terms, this would correspond to an uncertainty or error factor in the relevancy rating. This number might also correspond psychologically to the probability that someone will choose the higher of two items solely by listing order.
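For what it's worth, here is a minimal Python sketch of the "small amount of randomness" idea above. It is only an illustration, not how any real search engine does it: the function name `rank_with_jitter`, the example scores, and the Gaussian noise model are all assumptions made up for this example.

```python
import random

def rank_with_jitter(scores, noise=0.05, rng=None):
    """Order items by relevance score plus a small Gaussian jitter.

    scores: dict mapping item -> estimated relevance (hypothetical values).
    noise:  standard deviation of the jitter; 0 gives the plain ranking.
    """
    if rng is None:
        rng = random.Random()
    jittered = {item: s + rng.gauss(0.0, noise) for item, s in scores.items()}
    return sorted(jittered, key=jittered.get, reverse=True)

# "a" and "b" are nearly tied, "c" is clearly less relevant (made-up numbers).
scores = {"a": 0.90, "b": 0.89, "c": 0.40}

rng = random.Random(0)  # fixed seed so the experiment is repeatable
top_counts = {item: 0 for item in scores}
for _ in range(1000):
    top_counts[rank_with_jitter(scores, noise=0.05, rng=rng)[0]] += 1
```

With the noise set near the size of the uncertainty in the scores, near-ties like "a" and "b" trade the top slot, so neither gets locked in just by being listed first, while a clearly weaker item like "c" stays down.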
He is right, in a way. Google is biased towards older sites. This is a good thing, IMHO. I'd rather get my info from a well established source than some unknown site set up yesterday. If the new site is good enough, it'll rise in the rankings quickly. Quicker than if you whine about it.
in Google the sites that do well are the spammy sites, sites which have Google psyched out, and a lot of big sites, corporate headquarters' sites -- they show up before sites that criticize those companies."
In other words, Brandt recognizes that there has to be some order to Google's results, and that some sites might deserve to come up before others. He just disagrees with the way Google does it. In Brandt's ideal world, if you searched for "United Airlines," you would see untied.com -- a site critical of United -- before you see United's page. And if you searched for Rumsfeld, you'd see NameBase's dossier on him before the Defense Department's site on the "The Honorable Donald Rumsfeld."
He wants google to be a political action site that favors his views. He's a whiny little baby.
Sites that criticize corporations should appear before the corporation's main site? Why? Did you search for the company or for criticism? If the company/group in question was something he agreed with, perhaps some environmental organization or the Democratic National Committee, would he want criticism of them to come up first too?
A quick stop at google shows that if you search for "United Airlines" you get their site first, and the site he thinks should be first shortly thereafter. If you search for "United Airlines criticism" you get the site he recommends first. Looks like google is doing its job correctly to me.
Why is salon publishing the crap?
Now wonder he doesn't get ranked highly on google...they're afraid of breaking him.
I am apparently unable to form coherent sentences while trying to type around the chili I spilled on my keyboard.
Please disregard the horrible mangling of the English language which took place in the parent comment.
The cookie is named 'PREF' and has some cryptic data stored in it. It took me a while, but this is what it says:
All your search terms are belong to us(writing this in Mozilla...)
One reason to use N7 instead of Mozilla:
"you want to have a built-in AIM client."
(okay, maybe it's not a good reason...)
... categorization is the future, unless you're looking for the official Britney Spears site, that is. If I'm going to search for some guy called Michael Jordan who is not the basketball star, there is no way I will ever find him with Google, especially since they never give you more than 2000 results (or is it 1000; try it, it's lame). However, with the best categorization technology out there (http://www.vivisimo.com), I might find him because he would be in some, say, Computer Science category assuming that he is a computer scientist. Sure, you can add "computer" to the list of keywords in Google but it's not the panacea when you're doing serious research.
Soon to be announced: Google for Wackos! With a clean-cut, cookie-less interface free of CIA influence, Google for Wackos will return search results based not on the listed sites' popularity, but on the wackiness of the conspiracy theories they present. Most popular search terms include Zapruder, tin foil, UFOs, and of course sex (but only the dirty illegal kind that politicians have.)
Genocide Man -- Life is funny. Death is funnier. Mass murder can be hilarious.
You can find a cache of the site at the wayback machine. :)
I know the guy will be gutted about that
no sig.
Now I understand that Google is untouchable, whatever it does, because it runs on Linux, so I switched my default search engine to alltheweb in the Opera browser.
Thank you very much; now I'm starting to believe that "whining" guy.
is a very cool next step in search engines.
A web search engine favours established and reputable sites?!?!?!?!!! Why was I not informed! All this time I have been using Google to search for unreputable, poorly constructed websites that nobody likes! I've been misled!
Really, can someone go and LART this guy? He sounds like he was given a computer for Christmas, and found this thar Worldy Widey Web thingy, but now he's pissed because it doesn't quite work as he expected.
Google is rough on short searches, but the more terms you add, it finds what I'm looking for pretty quickly.
Sleep is for the Weak
The answer is in the article. Six years ago, everyone used Yahoo. Then Yahoo went all portal on us, so the smart geeks started using Altavista. Then Altavista started selling #1 listings, so we all decamped to Google. Now everyone uses Google.
Brandt's complaint appears to be that he has a database of citations, but when you search for Donald Rumsfeld his site is more than 10 pages down, where nobody ever looks. And that's fine with me. That's what I expect from Google. He obviously expects something else (like untied.com appearing higher than United Airlines' real site), and being the kind of person he apparently is, he expects Google to change to become how he expects them to be, rather than realigning his expectations with reality.
--
E_NOSIG
Ironically, when www.namebase.org was /.ed, I found a google cached snapshot of the site...
"If you and I always agreed, then one of us would be unnecessary."
Now that he's being slashdotted (I get DNS not found) I'm sure that he won't be getting ranked by *anybody* for at least a few hours... heh
-Steve
If you get the google toolbar, it lets you vote for a page or against it. It's not a 1-10 scale, more like a -1/+1 scale, but over large numbers of users and rating submissions, both systems will average out to the same result.
Personally, though, I think any system like this is a complete waste of time (both for the user and the system developer). Google's system is much better. Its ranking system is judgement-free and value-free, since PageRank actually boils down to rankings which are equivalent to a frequency of hits with users who are randomly surfing the web.
Forget looking for a democratic system; Google's ranking system mirrors the **physical reality** of the web. What could be better than that?
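To make the "randomly surfing users" picture concrete, here is a rough Python sketch that simulates one random surfer on a toy web and counts how often each page gets visited. This is the textbook random-surfer model, not Google's actual production code; the four-page graph, the function name `random_surfer`, and the 0.85 damping value are assumptions for illustration only.

```python
import random

def random_surfer(links, steps=100_000, damping=0.85, seed=0):
    """Estimate visit frequencies for a surfer who follows a random outgoing
    link with probability `damping` and otherwise jumps to a random page."""
    rng = random.Random(seed)
    pages = sorted(links)
    visits = {p: 0 for p in pages}
    page = pages[0]
    for _ in range(steps):
        visits[page] += 1
        outgoing = links[page]
        if outgoing and rng.random() < damping:
            page = rng.choice(outgoing)   # click a link on the current page
        else:
            page = rng.choice(pages)      # get bored, jump somewhere at random
    return {p: visits[p] / steps for p in pages}

# Hypothetical toy web: several pages link to "hub", nobody links to "b" or "new".
links = {
    "hub": ["a"],
    "a": ["hub"],
    "b": ["hub", "a"],
    "new": ["hub"],
}
freq = random_surfer(links)
```

In this toy graph "hub" and "a" soak up most of the visits, while "b" and "new" are only reached by the occasional random jump. A page nobody links to still gets found, just rarely.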
You never enter any data into Google that allows them to associate the cookie with your name, or whatever, so what's the problem?
They don't carry adverts on their web site, so they're not trying to market anything to you, so what's the problem?
I think people have got to realise that not ALL cookies are bad. I'm not making this point because I like Google, I'm making the point because people associate cookies with evil intentions regardless.
--> PageRank is the "opposite of affirmative action," he has written, meaning that the system discriminates against new Web sites and favors established sites.
I put up a website about two weeks ago, a small fan site for a video game I am a fan of. At this point in time, there are several searches that come up with my site as the first site such as "gamename video download".
Although I thought the same thing he did, my "new" site gets the majority of its hits from google now..and it's less than 2 weeks old!
hahah, both google-watch.org and namebase are down. hahah
From the experience I can say that small Webmasters do not have any problem getting a high rank on Google if the content of the page is relevant and there are some links posted to the site.
My highest trafficked pages on my personal site, C++ interview questions, and Java interview questions achieved a top 5 ranking for the terms above without me really trying. Actually, the only way I found out about high rankings on Google is when my tracking system showed up 200 hits coming from google.com.
If a site can achieve reasonably high rankings with absolutely no effort, I don't really see Google being tyrannical or discriminatory in any way.
I have one response to this whiny dork: Operation Clambake. Operation Clambake is a criticism of Scientology. It is also ranked very highly by Google in searches for Scientology. Why? Because lots of other sites consider it important and related to Scientology. His pages are not ranked highly in relation to the political figures he tracks. Why? Because no one gives a damn about what he's doing.
Google is doing exactly what it should. The criticism sites that are respected get ranked highly, the cranks get modded down. The only problem here is that we have a whiny crank who conned a Salon writer into writing a story for him.
I never understand why people make such a big deal about cookies. If you don't want to be tracked (like me, like most of us here at slashdot) there are countless ways of protecting yourself via browser settings, CookieCop, Proxomitron, etc. Anyone who really cares about privacy probably already knows how to disable cookies. And anyone who doesn't know probably doesn't care about privacy (my grandmother, etc.). It seems like people just enjoy complaining about a standard web technique even though it is easily circumventable.
Second, why the hell is slashdot even posting this article? I've skimmed plenty of the below comments and they all seem to agree that this anti-google guy is a goofball. Just because Salon ran an article on him doesn't mean that this fruitcake's complaints have any merit. Considering how many stories get rejected from slashdot on a daily basis, why was this chosen? Is it just me, or did anyone enjoy/learn from that article?
GMD
watch this
Actually, most search engines exist to make a profit by selling off the results to the highest bidder.
Google does this as well, except it clearly splits the paid results from the pure link popularity results, placing the "Sponsored Links" in a separate div that's set to float:right.
Will I retire or break 10K?
Dude, you don't deserve to be in Google. Your server couldn't even handle a good /.'ing and you expect Google to rate your pages at the top? How about you just go back to serving french fries to the masses, o'tay?
I'm surprised to see thinking people dismayed by this. Personally I think this guy could do with some support and titling the article "Mr Anti-Google" does not help, but then /. has become commercial even though people haven't quite realised it yet.
Why do people believe those companies who we rely upon are perfect in every way?
It's worth emphasising that there is also big money here at stake.
Also, interestingly, http://www.google-watch.org/ is down at the moment.
Dang XML didn't make it through in the post...
<MSIEPrivacy>
<MSIEPrivacySettings formatVersion="6">
<p3pCookiePolicy zone="internet">
<firstParty noPolicyDefault="forceSession" noRuleDefault="forceSession" alwaysAllowSession="no">
</firstParty>
<thirdParty noPolicyDefault="reject" noRuleDefault="reject" alwaysAllowSession="no">
</thirdParty>
</p3pCookiePolicy>
<flushCookies/>
<flushSiteList/>
</MSIEPrivacySettings>
</MSIEPrivacy>
Import that into the privacy settings for IE6.
I like having people like this around. They are the checks and balances that makes things better for all of us.
Keep up the good work, Daniel Brant.
Ok, starting from the top.
As best I can tell, the only information Google collects from me is my preference settings and the text of my searches. In addition, it collects "implicit" information--I am user number 71f612455 something or other according to the google.com "PREF" cookie. Apparently they can then discern that I, as a solitary user, search for stuff like "chainless bicycles" and "rochester bands." They could probably figure out who I am based on what I search for.
In my opinion, this does not violate my privacy. It's akin to a taxicab driver noting where I want to go. In some weird alternate reality, they might give me a numbered card which I could show them when I got in the cab and they'd remember how warm I liked it in the cab or change the radio station. They could also keep track of everywhere I go. I think the analogy is pretty good, and I probably wouldn't mind the side effects all that much. Heck, for that matter, I don't have much concern over supermarket savings clubs--and they have my name and address as well, which does make me a little more concerned.
On the second point, I don't think it's too hard to get ranked high on Google. Make a page that people link to--or one that has unique information--and keep it around for a long time. That worked for me. Searching for purrs rochester band or rochester band weekly events and my JayceLand site comes up pretty high on the list. Even before my URL appeared on Slashdot from posting stuff, I was doing pretty good. Maybe if he did some research as to why Google hates him he'd find some stuff to correct.
Third, Google is acting like a benevolent monopoly. In fact, I don't think they can do otherwise. As soon as they sell off prominent keywords to porn sites, people will hate them, stop using them, and some other site will take its place. I guess if they continuously spidered competing search engines to kill them off, that would be wrong but transparent to the users ... then we'd need some kind of government action. Other than that they're doing okay by me.
--- Jason Olshefsky
Karma: Poser (mostly affected by adding this line long after everyone else did)
Are they down or is our network blocking them...?
Google could put long term cookies to good use if they do it responsibly. They can't tell which search result you click due to direct linking (although they could see which search results you pull from their cache) - but they could definitely bias your query results based on your previous queries. Example:
UserA has a pattern of computer-related queries
UserB has a pattern of photography-related queries
Both search for "lens effects".
UserA gets some information on computer graphics lens effects in software and algorithms ranked higher.
UserB gets some info on how to use real camera lenses to achieve neat lens effects in photography ranked higher.
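The UserA/UserB example could be sketched in Python roughly like this. Everything here is hypothetical: the topic vocabularies, the result tuples, the `boost` weight and both function names are invented for the example, and a real engine would need something far more careful than keyword matching.

```python
from collections import Counter

def topic_profile(past_queries, topic_words):
    """Count how often each topic's vocabulary shows up in a user's past queries."""
    profile = Counter()
    for query in past_queries:
        for word in query.lower().split():
            for topic, words in topic_words.items():
                if word in words:
                    profile[topic] += 1
    return profile

def personalize(results, profile, boost=0.1):
    """Re-rank (title, topic, base_score) results, nudging up topics the user favours."""
    total = sum(profile.values()) or 1  # avoid dividing by zero for new users

    def score(result):
        _title, topic, base = result
        return base + boost * profile[topic] / total

    return sorted(results, key=score, reverse=True)

# Made-up vocabularies and a made-up result list for the query "lens effects".
topic_words = {
    "computing": {"opengl", "shader", "linux"},
    "photography": {"aperture", "tripod", "35mm"},
}
results = [
    ("Lens flare shaders", "computing", 0.50),
    ("Lens filters for portraits", "photography", 0.50),
]

user_a = topic_profile(["opengl shader tutorial", "linux drivers"], topic_words)
user_b = topic_profile(["tripod reviews", "35mm aperture chart"], topic_words)

top_for_a = personalize(results, user_a)[0][0]
top_for_b = personalize(results, user_b)[0][0]
```

Both users typed the same query and the base scores are tied, so the only thing separating the two orderings is the history tied to the cookie. That is exactly the part the privacy folks are worried about.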
11*43+456^2
Sounds like what this guy is really interested in is 3rd party comment servers.
*does a search for com2kid*
*notice his site comes up on top*
*is happy*
*does a search for low polygon count inorganic*
*notice his site comes up on top*
*is happy*
*does a search for low polygon hire*
*notice his site is listed third*
*is happy*
All of $0 on advertising, but I am on Google and Yahoo.
Who says the little man cannot get ahead?
Need help treating your acne? Come here!
Short answer: It's a bug/quirk/feature of IE that, somehow, the page came across screwed up and got cached that way and, despite anything and everything you may have told IE about "check for a new page every time I visit...", it still checks this screwed up cache version first. The solution is to delete your temporary Internet files (Tools->Internet Options->"Delete Files" in "Temporary Internet Files")
Long answer:I had this problem with a site I frequent quite a bit. Since I know the author personally I told her about it. When I would actually save and view the page as prompted I would see all the HTML like I was supposed to but tons and tons of gibberish right before it. I told her to republish her blog but that didn't do it. No one else on her forum was having these problems and I figured since I told IE to check for a new version of the page every time that that couldn't be it. However, after clearing my cache out that did it.
Slightly More Elegant Solution: Instead of setting your homepage to Google, get the Google Toolbar. This way you can set your homepage to whatever and use the Toolbar to do whatever Google searching you want. With all the options it's got, it's easily the most useful thing I've ever used. Be sure to check the experimental options as well.
Schnapple
When I type "Nixon" into Google, I expect to see biographical material, both good and bad, not totally unrelated rantings. Google is doing its job, in my opinion. It is giving low rankings to Brandt's irrelevant materials. His complaints are pure self-centered sour grapes.
Perhaps he can start his own search engine just for trolls.
I run a modest webhosting business. Someone else runs a large, popular webhosting business whose name is very close to mine (two letters in the name are reversed). I have never run any other webhosting business, while the other company has been doing this for years (under other names). This company registered their name 6 months after I registered mine, so they can be considered a copy-cat outfit. (I won't give my name, because that would be in bad taste, but as an example, assume I registered abchosting.com while the other guy registered acbhosting.com).
Anyways, a search for abchosting on google would give you acbhosting as the first result because the admin "accidentally" misspelled acbhosting on several pages on his site. He also linked to that page from several different domains that he owns and therefore was ranked higher than my site.
PageRank sucks for this reason, and that's what the guy is complaining about.
Google ranks sites in terms of who links to them. Sounds pretty democratic to me.
I don't understand how 'poor,' 'bad,' 'less relevant,' 'negative adjective here' ranks can be described as 'undemocratic.' Democracy is what brings us schoolyard cliques and incompetent government. By definition, the 'best' isn't judged the winner - it's simply a popularity contest.
Some sites report that more than half or 75 percent of their referrals come from Google -- those are scary numbers.
Yeah that is scary! We have an effective search engine at our disposal! Who'd have thunk?
Ascalante: Your bride is over 3,000 years old.
Kull: She told me she was 19!
Oh, so now we see this conspiracy in action.
Man (Brandt) challenges God (Google).
God laughs at man.
God manipulates Slashdot to kill the web server of said Man.
God laughs at his almighty invention. The Slashdot effect.
Slashdot, God's tool, remains all powerful, while the Man's tool will nigh be linked to again.
Success!
-S
We Apprentice Developers and Designers
Well, he now has it. How much more advertising can he get than Slashdot? Hell it's free! Come to think of it, MrNovember sounds suspiciously like MrAntiGoogle. I wonder if these are the same people.
grep >= ! == $your
This guy is an absolute moron. Lower in the Salon article it says he would rank sites critical of a company before that company if you search on the company name. His example is untied.com should come up before united's corporate site on a search for united airlines. Let me tell you something. If I was looking for united's site for schedules and I had to wade through satire sites and unitedsucks sites before finding the link I wanted I'd find another search engine. It'll be a miracle if this malcontent isn't on the news with a deer rifle on top of a clock tower in a clown suit. What a jerk-off.
Can't believe they hired this Manjoo person if this is what he churns out -- whiney complaints from interested individuals that other companies don't favor his own. Salon must be going down the tubes if this is the best talent they can attract. Or pay for.
Even shorter answer: IE sucks.
From Mr. Anti-Google to Mr. Anti-(Google&Slashdot). Google: for not finding his page. Slashdot: for killing his page!
After reading about this jackass and his ranting, I'm really happy his crap isn't at the top of google's page ranking. I would say, "If only Google would let me filter out his website", (I won't post the URL because that might actually help his ranking.) But he seems to have done a good job of this on his own.
For example if I'm searching for United Airlines, I want UAL.com, I'm not interested in untied.com. If I were interested in "How UAL treats its Own" I would type that into my search engine.
If I search for "Mickey Mouse" I want a site about the rat, not one about how Disney is abusing trademark, copyright laws, or the DMCA.
BTW I tried searching Google for Donald Rumsfeld at www.namebase.org and I got the following result:
Your search - Donald Rumsfeld site:www.namebase.org - did not match any documents.
I think Google's page rank for his site is dead on.
If someone is passing you on the right, you are an asshole for driving in the wrong lane.
This isn't news, it's just a leftist with a personal axe to grind. So what else is new? Google doesn't rank his site high enough, therefore their page ranking methods are faulty, and they're part of a government/military/industrial conspiracy as well.
This isn't news, it's just life in our brave new socialized world.
A search for "United airlines" should come up with sites criticizing united before the actual company site? Why the fuck should google dictate political views in their searches?
Salon is easy to troll.
Anyone who's read USENET for any length of time can see that this guy is just a net.kook. Of course, on USENET, he'd just be mocked by the Kibologists.
Google is not perfect; you can inflate your own web site's results just by increasing the number of links to your site. In effect this sets the barrier higher for getting content indexed and ranked vs some of the earlier search algorithms.
So there are some cases where the better content will be obscured by the well placed content. But this is how our society works also. If you get published in the New York Times, more people will read your stuff than if you're in the local newspaper, but Google doesn't seem to set the barrier too high. You can still write good stuff and if you get linked from enough reputable places then you can eventually become a highly ranked site. Eventually in this case means weeks and months and not years.
The only problem with Google is that everyone is using them, so it is becoming a single point of failure and/or corruption. It would be best if they had a healthier competitor in their class.
I'm completely serious: Who posts links anymore? I have a couple of "neat site" links on my personal page, but I'm the exception: Very few people put up "sites I like" link sites, as was the case in the early days of the net where the PageRank system made sense. Indeed, the opposite is true and people intentionally don't link anymore, lest they lose eyeballs.
People no longer need index sites like Yahoo or "The Best Places To Buy Curry Beans in Toronto" because they have google...but google relies on links to do its rankings....you can see the paradox here: With every passing scan by the Google spider, Google's usefulness declines.
As a webmaster who started my site back in April 2002, I can relate to a bit of what Brandt says. It's a struggle to get indexed and to get a decent PageRank. You have to spend a lot of time trading links with related (or not) sites.
On the other hand as a user of Google I do think that PageRank gives a fairly good measure of relevance. I would NOT want the "affirmative action" that Brandt talks about where small new websites would be given preferential treatment. There are just too many of them.
How ironic is it that when I go to read the comments on this story, that there's a big as ad for MS Visual Studio .NET?
My sig of choice is Marlboro
You mean if someone comes up with a search engine formula that ranks pages according to viewership popularity, like Google has done, it's not democratic?
You mean our votes, cast by going to our favorite sites like slashdot.org, don't count?
This guy belongs in a program that keeps him far away from congressmen like Coble.. we got enough Coble problems already!
Don't Tread on OpenSource
Of course they're not perfect, but Google is doing some very cool things. Of late I've been finding myself using their news.google.com more and more. I just wish they archived more and you could search back further.
An interesting concept, though, to have an algo 'create' a 'top news' page rather than a human editor (ala Yahoo News).
In fact, w/out a human editor it may eventually become more 'Fair and Balanced' (unlike some of the spin stations...)
Is interesting sometimes to see Google News show headlines for the same story from different news organizations. Sometimes you can clearly see their 'slant' from the wording of the headline.
Keep up the good work Google.
J-Log: Journalism News, Media Views
Brandt mentioned how all your search terms are saved in google's cookie for 36 years. I primarily use Google Toolbar, and I regularly do a Clear Search History. Is that good enough to wipe out the searches? I looked at the cookie, and there's nothing indicating my search terms are in there.
I realize that a few people posted the link to the story before I did. And what happens then is, the first guy gets +5 Informative and everyone after that gets -1 Redundant.
Think about this. When I posted this link, there was only one comment. By the time I actually previewed and hit submit, there were about 20. I can't be sure that one of those people is going to post a link, so I just post and hope.
Maybe next time your moderation points would be better spent on a Troll or someone who is actually making a redundant comment. If I had been left at +1 (my default), my comment would have been drowned out after about 30-40 posts anyway.
My two cents... and probably another 2 points for being redundant again.
link popularity may not provide the most intelligent top rankings
Uh, dude... you do know that PigeonRank was an April Fools joke.
Please tell me you know that.
...we could make him a violin small enough.
"In Brandt's ideal world, if you searched for "United Airlines," you would see untied.com -- a site critical of United -- before you see United's page. And if you searched for Rumsfeld, you'd see NameBase's dossier on him before the Defense Department's site on the "The Honorable Donald Rumsfeld.""
untied.com is the first search result if you type "hate united airlines"
if you want negative press, use negative language and google gives you negative-minded search results.
google works the way he wants except he's too lazy to type 4 more letters.
(Sing to tune of "traffic lights" Monty Python Skit)
I like google search
I like google search
I like google search
I like google search
Especially cause its free!
Brandt fears that law enforcement officials could muscle Google into divulging all the terms you've ever searched for. Those terms could be "a window into your state of mind,"
Damn, so that's why I keep spam with keywords:
"Teen, Spank, Rodentia"
Have to remember to delete my cookies EVERY time.
* Making waffles just so I have something to Twitter *
Google doesn't simply count the number of links to a page to sort the results. A link from yahoo is weighted much more than a link from mymotherssocks.com. It is diffusive in the sense that a bunch (thousands) of unpopular sites cannot become popular by just linking to each other. You need some very popular sites to link to these websites (like a source of energy).
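Here is a small Python sketch of the "a link from a popular page is worth more" point, using the published textbook PageRank iteration (each page splits its own rank among the pages it links to). It is an illustration under assumptions, not Google's real system: the seven-page graph, the page names and the 0.85 damping factor are invented for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power iteration over a dict mapping page -> pages it links to.
    Every link target must also appear as a key in `links`."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page hands out its own rank, split evenly over its links.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # Dangling page with no links: spread its rank over everyone.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical web: "portal" is linked by three fans, "socks" by nobody.
# Each of them links to exactly one site: portal -> x, socks -> y.
links = {
    "portal": ["x"],
    "fan1": ["portal"],
    "fan2": ["portal"],
    "fan3": ["portal"],
    "socks": ["y"],
    "x": [],
    "y": [],
}
ranks = pagerank(links)
```

Both "x" and "y" have exactly one inbound link, but "x" ends up ranked higher because its single link comes from the well-linked "portal". Note this toy only shows the weighting; it doesn't prove that clusters of sites linking to each other gain nothing.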
Google has been a strong advocate of including all sides in the ranking. They very clearly came down on the side of xenu.org when the Scientologists came knocking. Moreover, their policy to ensure that pro-Scientology sites appear before their detractors makes perfect sense, just as I would fully expect to see the history, culture and art of African Americans to appear before the Aryan Nations when I search on "african american."
For a quality search engine, the balance should always be tilted in favor of the individuals directly referenced in the search so that those parties can address grievances and possibly make amends without having to simultaneously battle for position in the ranks to be heard. If I wanted the contrary view, I would have explicitly asked for it. Since google sends me the contrary view anyway, I appreciate that they sort it after the material that I requested.
Lastly, I find it interesting that someone who starts out with "this is a crock of shit" should summarize with "if you'd like to argue for civil discourse instead of flame fests and random insults, I will be the first to add my voice to yours." I would not have even replied to this, except that it was modded +3, which frankly, I find rather sad.
-Hope
Instead of getting the toolbar I just went here and installed it for IE. Now whenever I hit CTRL-E I get a quick little google search off to the side and I only lose real estate on my window when I need to.
He's so worried about privacy and such, why did my search at Namebase.org get truncated because "No one at has donated money to namebase"? The site noted and probably logged my IP, looked up my organization in its database, and found that it had not donated to his cause.
Fsck him.
I'm sorry, I missed the "The most democratic search engine in the world" quote on the Google web site. Can someone post that link for me?
Google's page rank isn't democratic, and thank God for that. Otherwise I'd have to wade through a bunch of crap that I generally don't want to wade through.
Different search engines are better at searching for different things, but Google is my first choice almost every time. It is, by far, the most effective search engine I've seen. If it wasn't, I don't think it would be the most popular.
Someone explain to me why anyone pays attention to this guy.
A better system than page rank would be for google to purchase a large university. Get rid of the undergrads, but keep all the profs, grad students, TA's, and researchers. Have them devote all their time to ranking pages based on quality. If one university isn't enough, buy more.
Could be a bit expensive, but aside from that would allow for more intelligent assessment and ranking.
Hi all. I'm the evil Daniel Brandt who has the gall to criticize your beloved Google. Sorry the site is down. We're being synflooded, apparently by one or more slashdotters, since it started with the slashdot post. It's probably one of those who posted here, saying that if we can't keep our site going, then we don't belong in Google. We have our own router, so we hope to be able to clear things up shortly.
A few points missed in the Salon piece:
I specifically pointed out to the author of the piece when he interviewed me, that I felt my site did okay in Google, and that I was speaking for the public interest. The so-called "royal we" that Mr. Manjoo, the interviewer and author, refers to sarcastically, is used because I'm speaking for a tax-exempt, nonprofit public charity, Public Information Research, Inc. We do not sell widgets. Some of the comments in Slashdot have me mixed up with another person who is selling ads based on PageRank. But then, who expects Slashdotters to actually read the article?
My main site in Google is www.pir.org and it has a PageRank of 7. The www.namebase.org, with a PR of 6, is a streamlined CGI version of the main site, without all the essays and cartoons. NameBase began in the early 1980s and has been on the Internet since early 1995.
The other problem I have with the author's spin is that a good half of the interview was about Google's cookie. Most of the work I put into www.google-watch.org has to do with the cookie. In the article, the cookie is briefly mentioned, and most of the article is about how selfish and silly I am to think that Google should rank me higher.
My complaint about Google is not that PIR got the short end of the stick from Google, but that Google's stick should be longer.
My essay about PageRank is below.
_____________________
PageRank: Google's Original Sin
by Daniel Brandt
By 1998, the dot-com gold rush was in full swing. Web search engines had been around since 1995, and had been immediately touted by high-tech pundits (and Forbes magazine) as one more element in the magical mix that would make us all rich. Such innovations meant nothing less than the end of the business cycle.
But the truth of the matter, as these same pundits conceded after the crash, was that the false promise of easy riches put bottom-line pressures on companies that should have known better. One of the most successful of the earliest search engines was AltaVista, then owned by Digital Equipment Corporation. By 1998 it began to lose its way. All the pundits were talking "portals," so AltaVista tried to become a portal, and forgot to work on improving their search ranking algorithms.
Even by 1998, it was clear that too many results were being returned by the average search engine for the one or two keywords that were entered by the searcher. AltaVista offered numerous ways to zero in on specific combinations of keywords, but paid much less attention to the "ranking" problem. Ranking, or the ordering of returned results according to some criteria, was where the action should have been. Users don't want to figure out Boolean logic, and they will not be looking at more than the first twenty matches out of the thousands that might be produced by a search engine. What really matters is how useful the first page of results appears on search engine A, as opposed to the results produced by the same terms entered into engine B. AltaVista was too busy trying to be a portal to notice that this was important.
Enter Google
By early 1998, Stanford University grad students Larry Page and Sergey Brin had been playing around with a particular ranking algorithm. They presented a paper titled "The Anatomy of a Large-Scale Hypertextual Web Search Engine" at a World Wide Web conference. With Stanford as the assignee and Larry Page as the inventor, a patent was filed on January 9, 1998. By the time it was finally granted on September 4, 2001 (Patent No. 6,285,999), the algorithm was known as "PageRank," and Google was handling 150 million search queries per day. AltaVista continued to fade; even two changes of ownership didn't make a difference.
Google hyped PageRank, because it was a convenient buzzword that satisfied those who wondered why Google's engine did, in fact, provide better results. Even today, Google is proud of their advantage. The hype approaches the point where bloggers sometimes have to specify what they mean by "PR" -- do they mean PageRank, the algorithm, or do they mean the Public Relations that Google does so well:
PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important."
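To make the "weighted vote" idea in that quote concrete, here is a minimal sketch of the textbook PageRank iteration. It is not Google's production code, whose details are trade secrets; the damping value of 0.85, the iteration count, the handling of pages without outgoing links, and the toy page names are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power iteration. `links` maps a page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its own rank evenly among the pages it votes for.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumption: a page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy web: "x" and "y" each receive exactly one link,
# but "x" is linked from a well-linked page and "y" from an unlinked one.
links = {
    "f1": ["big"], "f2": ["big"], "f3": ["big"],
    "big": ["x"],
    "small": ["y"],
    "x": [], "y": [],
}
ranks = pagerank(links)
print(ranks["x"] > ranks["y"])
```

In this toy graph "x" and "y" have the same number of incoming links, yet "x" ends up with the higher score because its single vote comes from a page that is itself heavily linked.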
Google goes on to admit that other variables are also used, in addition to PageRank, in determining the relevance of a page. While the broad outlines of these additional variables are easily discerned by webmasters who study how to improve the ranking of their websites, the actual details of all algorithms are considered trade secrets by Google, Inc. It's in Google's interest to make it as difficult as possible for webmasters to cheat on their rankings.
It's all in the ranking
Beyond any doubt, search engines have become increasingly important on the web. E-commerce is very attuned to the ranking issue, because higher ranking translates directly into more sales. Various methods have been designed by various engines to monetize the ranking situation, such as paid placement, pay per click, and pay for inclusion. On June 27, 2002, the U.S. Federal Trade Commission issued guidelines that recommended that any ranking results influenced by payment, rather than by impartial and objective relevance criteria, ought to be clearly labeled as such in the interests of consumer protection. It appears, then, that any algorithm such as PageRank, that can reasonably pretend to be objective, will remain an important aspect of web searching for the foreseeable future.
Not only have engines improved their ranking methods, but the web has grown so huge that most surfers use search engines several times a day. All portals have built-in search functions, and most of them have to rely on one of a handful of established search engines to provide results. That's because only a few engines have the capacity to "crawl" or "spider" more than two billion web pages frequently enough to keep their database current. Google is perhaps the only engine that is known for consistent, predictable crawling, and that's only been true for less than two years. It takes almost a week to cover the available web, and another week to calculate PageRank for every page. Google's main update cycle is about 28 days, which is a bit too slow for news-hungry surfers. In August, 2001 they also began a second "mini-crawl" for news sites, which are now checked every day. Results from each crawl are mingled together, giving the searcher an impression of freshness.
For the average webmaster, the mechanics of running a successful site have changed dramatically from 1996 to 2002. This is due almost entirely to the increased importance of search engines. Even though much of the dot-com hype collapsed in 2000 and 2001 (a welcome relief to noncommercial webmasters who remembered the pre-hype days), the fact remains that by now, search engines are the fundamental consideration for almost every aspect of web design and linking. It's close to a wag-the-dog situation. That's why the algorithms that search engines consider to be consistent with the FTC's idea of impartial and objective ranking criteria deserve closer scrutiny.
What objective criteria are available?
Ranking criteria fall into three broad categories. The first is link popularity, which is used by a number of search engines to some extent. Google's PageRank is the original form of "link pop," and remains its purest expression. The next category is on-page characteristics. These include font size, title, headings, anchor text, word frequency, word proximity, file name, directory name, and domain name. The last is content analysis. This generally takes the form of on-the-fly clustering of produced results into two or more categories, which allows the searcher to "drill down" into the data in a more specific manner. Each method has its place. Search engines use some combination of the first two, or they use on-page characteristics alone, or perhaps even all three methods.
Content analysis is very difficult, but also very enticing. When it works, it allows for the sort of graphical visualization of results that can give a search engine an overnight reputation for innovation and excellence. But many times it doesn't work well, because computers are not very good at natural language processing. They cannot understand the nuances within a large stack of prose from disparate sources. Also, most top engines work with dozens of languages, which makes content analysis more difficult, since each language has its own nuances. There are several search engines that have made interesting advances in content analysis and even visualization, but Google is not one of them. The most promising aspect of content analysis is that it can be used in conjunction with link pop, to rank sites within their own areas of specialization. This provides an extra dimension that addresses some of the problems of pure link popularity.
Link popularity, which is "PageRank" to Google, is by far the most significant portion of Google's ranking cocktail. While in some cases the on-page characteristics of one page can trump the superior PageRank of a competing page, it's much more common for a low PageRank to completely bury a page that has perfect on-page relevance by every conceivable measure. To put it another way, it's frequently the case that a page with both search terms in the title, and in a heading, and in numerous internal anchors, will get buried in the rankings because the sponsoring site isn't sufficiently popular, and is unable to pass sufficient PageRank to this otherwise perfectly relevant page. In December 2000, Google came out with a downloadable toolbar attachment that made it possible to see the relative PageRank of any page on the web. Even the dumbed-down resolution of this toolbar, in conjunction with studying the ranking of a page against its competition, allows for considerable insight into the role of PageRank.
Moreover, PageRank drives Google's monthly crawl, such that sites with higher PageRank get crawled earlier, faster, and deeper than sites with low PageRank. For a large site with an average-to-low PageRank, this is a major obstacle. If your pages don't get crawled, they won't get indexed. If they don't get indexed in Google, people won't know about them. If people don't know about them, then there's no point in maintaining a website. Google starts over again on every site for every 28-day cycle, so the missing pages stand an excellent chance of getting missed on the next cycle also. In short, PageRank is the soul and essence of Google, on both the all-important crawl and the all-important rankings. By 2002 Google was universally recognized as the world's most popular search engine.
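The crawl behavior described in this paragraph can be modeled roughly as follows. This is only a toy model under stated assumptions: a fixed per-cycle page budget, one score per site, and a crawl that restarts from scratch each cycle. The function name, site names, and numbers are hypothetical, and nothing here reflects how Google's crawler is actually implemented.

```python
import heapq


def crawl_cycle(site_pages, site_rank, budget):
    """Fetch up to `budget` pages in one cycle, highest-ranked sites first."""
    # heapq is a min-heap, so ranks are negated to pop the highest rank first.
    queue = [(-site_rank[site], site) for site in site_pages]
    heapq.heapify(queue)
    fetched = []
    while queue and len(fetched) < budget:
        _, site = heapq.heappop(queue)
        for url in site_pages[site]:
            if len(fetched) >= budget:
                break
            fetched.append(url)
    return fetched


# Hypothetical sites: a popular one with few pages, a low-ranked one with more.
site_pages = {
    "popular.example": ["popular/p1", "popular/p2", "popular/p3"],
    "small.example": ["small/p1", "small/p2", "small/p3", "small/p4", "small/p5"],
}
site_rank = {"popular.example": 7, "small.example": 3}

first = crawl_cycle(site_pages, site_rank, budget=5)
second = crawl_cycle(site_pages, site_rank, budget=5)
print(first)
print(first == second)
```

Because every cycle starts over with the same ordering and the same budget, the pages that were missed in the first cycle are missed again in the second.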
How does PageRank measure up?
In the first place, Google's claim that "PageRank relies on the uniquely democratic nature of the web" must be seen for what it is, which is pure hype. In a democracy, every person has one vote. In PageRank, rich people get more votes than poor people, or, in web terms, pages with higher PageRank have their votes weighted more than the votes from lower pages. As Google explains, "Votes cast by pages that are themselves 'important' weigh more heavily and help to make other pages 'important.'" In other words, the rich get richer, and the poor hardly count at all. This is not "uniquely democratic," but rather it's uniquely tyrannical. It's corporate America's dream machine, a search engine where big business can crush the little guy. This alone makes PageRank more closely related to the "pay for placement" schemes frowned on by the Federal Trade Commission, than it is related to those "impartial and objective ranking criteria" that the FTC exempts from labeling.
Secondly, only big guys can have big databases. If your site has an average PageRank, don't even bother making your database available to Google's crawlers, because they most likely won't crawl all of it. This is important for any site that has more than a few thousand pages, and a home page of about five or less on the toolbar's crude scale.
Thirdly, in order for Google to access the links to crawl a deep site of thousands of pages, a hierarchical system of doorway pages is needed so that the crawler can start at the top and work its way down. A single site with thousands of pages typically has all external links coming into the home page, and few or none coming into deep pages. The home page PageRank therefore gets distributed to the deep pages by virtue of the hierarchical internal linking structure. But by the time the crawler gets to the real "meat" at the bottom of the tree, these pages frequently end up with a PageRank of zero. This zero is devastating for the ranking of that page, even assuming that Google's crawler gets to it, and it ends up in the index, and it has excellent on-page characteristics. The bottom line is that only big, popular sites can put their databases on the web and expect Google to cover their data adequately. And that's true even for websites that had their data on the web long before Google started up in 1999.
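The dilution described here can be illustrated with a back-of-the-envelope calculation. The sketch below assumes the textbook rule that a page passes a damped share of its score, split evenly, to each page it links to, and it ignores the small baseline every page receives as well as any links pointing back up the tree. The damping value and the fan-out numbers are hypothetical.

```python
def share_reaching_leaf(fanouts, damping=0.85):
    """Fraction of the home page's score that reaches one page at the bottom.

    `fanouts` lists how many links each level of doorway pages carries,
    starting from the home page.
    """
    share = 1.0
    for fanout in fanouts:
        # Each hop keeps only a damped, evenly divided slice of the score.
        share *= damping / fanout
    return share


# Hypothetical site: the home page links to 10 doorway pages,
# and each doorway page links to 100 data pages.
print(share_reaching_leaf([10, 100]))
```

With two levels and those fan-outs, each data page receives well under a thousandth of the home page's score, which is consistent with the complaint that deep pages end up near the bottom of the scale.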
What about non-database sites?
There are other areas where PageRank has a negative effect, even for sites without a lot of data. The nature of PageRank is so discriminatory, that it's rather like the exact opposite of affirmative action. While many see affirmative action as reverse discrimination, no one would claim (apart from economists who advocate more tax cuts for the rich) that the opposite, which would be deliberate discrimination in favor of the already-privileged, is a solution for anything. Yet this is essentially what Google claims.
Those who launch new websites in 2002 have a much more difficult time getting traffic to their sites than they did before Google became dominant. The first step for a new site is to get listed in the Open Directory Project. This is used by Google to seed the crawl every month. But even after a year of trying to coax links to your new site from other established sites, the new webmaster can expect fewer than 30 visitors per day. Sites with a respectable PageRank, on the other hand, get tens of thousands of visitors per day. That's the scale of things on the web -- a scale that is best expressed by the fact that Google's zero-to-ten toolbar is a logarithmic scale, perhaps with a base of six. To go from an old PageRank of four to a new rank of five requires several times more incoming links. This is not easy to achieve. The cure for cancer might already be on the web somewhere, but if it's on a new site, you won't find it.
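The arithmetic behind the logarithmic toolbar scale can be sketched like this. The base of six is the essay's own guess ("perhaps with a base of six"), and the idea of a single underlying "raw score" is an assumption made purely for illustration; Google has not published how the toolbar value is derived.

```python
def raw_score_needed(toolbar_level, base=6):
    """Smallest raw score that reaches a toolbar level on a log scale."""
    return base ** toolbar_level


def toolbar_level(raw_score, base=6, top=10):
    """Toolbar value (0 to `top`) for a raw score, using integer thresholds."""
    level = 0
    threshold = base
    while level < top and raw_score >= threshold:
        level += 1
        threshold *= base
    return level


# Moving up one notch multiplies the required raw score by the base.
print(raw_score_needed(5) // raw_score_needed(4))
print(toolbar_level(7775), toolbar_level(7776))
```

Under these assumptions, going from four to five takes six times the raw score, and a score just below the next threshold still displays the lower number.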
PageRank also encourages webmasters to change their linking patterns. On search engine optimization forums, webmasters even discuss charging for little ads with links, according to the PageRank they've achieved for their site. This would benefit those sites with a lower PageRank that pay for such ads. Sometimes these PageRank achievements are the result of link farms or other shady practices, which Google tries to detect and then penalizes with a PageRank of zero. At other times professional optimizers get away with spammy techniques. Mirror sites and duplicate pages on other domains are now forbidden by Google and swiftly punished, even when there are good reasons for maintaining such sites. Overall, linking patterns have changed significantly because of Google. Many webmasters are stingy about giving out links (which can dilute your transference of PageRank to a given site), at the same time that they're desperate for more links from others.
What should Google do?
We feel that PageRank has run its course. Google doesn't have to abandon it entirely, but they should de-emphasize it. The first step is to stop reporting PageRank on the toolbar. This would mute the awareness of PageRank among optimizers and webmasters, and remove some of the bizarre effects that such awareness has engendered. The next step would be to replace all mention of PageRank in their own public relations documentation, in favor of general phrases about how link popularity is one factor among many in their ranking algorithms. And Google should adjust the balance between their various algorithms so that excellent on-page characteristics are not completely cancelled by low link popularity.
PageRank must be streamlined so that the "tyranny of the rich" characteristics are scaled down in favor of a more egalitarian approach to link popularity. This would greatly simplify the complex and recursive calculations that are now required to rank two billion web pages, which must be very expensive for Google. The crawl must not be PageRank driven. There should be a way for Google to arrange the crawl so that if a site cannot be fully covered in one cycle, Google's crawlers can pick up where they left off on the next cycle.
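One way to read the "pick up where they left off" suggestion is a per-site cursor that survives between cycles. The sketch below is a hypothetical illustration of that idea only; the function name, the fixed per-cycle budget, and the wrap-around behavior are assumptions, not a description of any real crawler.

```python
def resumable_cycle(pages, cursor, budget):
    """Fetch up to `budget` pages starting at `cursor`.

    Returns the pages fetched and the cursor to use on the next cycle.
    """
    if not pages:
        return [], 0
    count = min(budget, len(pages))
    # Wrap around so coverage continues from the start once the end is reached.
    fetched = [pages[(cursor + i) % len(pages)] for i in range(count)]
    return fetched, (cursor + count) % len(pages)


# Hypothetical site with five pages and a budget of two pages per cycle.
pages = ["p1", "p2", "p3", "p4", "p5"]
seen = set()
cursor = 0
for cycle in range(3):
    fetched, cursor = resumable_cycle(pages, cursor, budget=2)
    seen.update(fetched)
    print(cycle + 1, fetched)
```

With the cursor carried over, three cycles are enough to cover all five pages, whereas restarting at the top each time would fetch the same two pages on every cycle.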
Google is so important to the web these days, that it probably ought to be a public utility. Regulatory interest from agencies such as the FTC is entirely appropriate, but we feel that the FTC addressed only the most blatant abuses among search engines. Google, which only recently began using sponsored links and ad boxes, was not even an object of concern to the Ralph Nader group, Commercial Alert, that complained to the FTC.
This was a mistake, because Commercial Alert failed to look closely enough at PageRank. Some aspects of PageRank, as presently implemented by Google, are nearly as pernicious as pay for placement. There is no question that the FTC should regulate advertising agencies that parade as search engines, in the interests of protecting consumers. Google is still a search engine, but not by much. They can remain a search engine only by fixing PageRank's worst features.
*
[Daniel Brandt is founder and president of Public Information Research, Inc., a tax-exempt public charity that sponsors NameBase. He began compiling NameBase in 1982, from material that he started collecting in 1974, and is now the programmer and webmaster for PIR's several sites. He participates in various forums where webmasters share observations about the often-secretive algorithms, bugs, and behavior of various search engines. Brandt has been watching Google's interaction with NameBase ever since Google, in October, 2000, became the first search engine to go "deep" on PIR's main site by crawling thousands of dynamic pages.]
I'm very much an advocate of Google and have been for at least 3-4 years - I've turned many people on to it. Once you've tried Google, you'll never go back to Altavista (my favorite 5+ years ago, but it quickly started to suck).
But I have a major bone to pick with Google. I don't know if it is still the case, since the company where we experienced the following is no longer in business, but:
My old company registered a domain name 6 years ago. We built a web-site with contact info and proceeded to get all kinds of e-mails for products that weren't ours. It turns out that someone else had previously held the domain name and had gone tits-up, leaving a lot of unhappy customers out there. We were both in electronics but in radically different areas.
We tried ourselves to locate the other company so that we could point people to them, but only came up with a never answered phone number in Illinois. So we developed a stock polite response to these people wishing them luck and put a disclaimer on the web-site.
For the (300+) sites erroneously linking to our site where there was contact info, we sent polite e-mails asking them to remove the link, and got one of three results:
1) OK, will do - about 20%
2) No response - 70%.
3) F- you. - about 10% You wouldn't believe the arrogance of some people who felt that they had no duty to update their bad links. The most arrogant of all was a "webbastard" for a UK university site. He first claimed to have no such link and told us not to bother him - he had more important things to do (yes, he really said this). We sent him the page. He then told us that we should know better than to use a previously registered domain name. He betrayed tremendous ignorance of the living state of the web. We tried to explain to him (politely, in spite of his arrogance) that a domain is like a house - people move in, people move out. He finally changed the link but told us to f-off in the process (to which we sent him a polite thank you note).
The worst traffic sender was Yahoo directories (or whatever it's called) and it was impossible to get any Yahoo to do anything about it. There was no place to report this and we couldn't cram an explanation into a comment form that only took something stupid like 128 characters. We spent a lot of time trying to get Yahoo's attention, but they were an impenetrable monolith. We gave up on that.
Web half-life being what it is, things quieted down over time. But then a curious thing started happening. The e-mails from people looking for the old company started climbing again. We would ask these people how they got to our site and most were coming from Google. I was already aware of Google and using it by that time and pondered why, since Google seemed pretty "on-the-ball". Then I read about how their page-ranking works and it hit me. Google does not (or did not then) verify the links that they use for weighting. They simply weighted any link that they could find against a site. So the old links to the previous company's site, of which all pages but index.html were named differently than our pages (so, lots of 404's), were now being used by Google to point erroneously to our site. We sent mails to Google, but got nothing more than an acknowledgement. This seems like a major flaw in Google's logic. I don't know whether it was ever fixed or not.
As I say, I like Google and use it almost exclusively. I do use Alltheweb as a backup (their advanced search is better than Google's and allows more terms). I'm glad to see that Google got rid of the "XYZ" is a common word crap as well - when I do a literal search, words like "a, the, by" matter to me.
Google is quite good but Google is most definitely not perfect.
Until Oliver Stone gets a bug up his ass about it, that is...
Hokey statistics and ancient misconceptions are no match for a good thought in your head, kid!
Does anyone seriously believe that Google can somehow fit 36 years worth of search terms in a single cookie? Since cookies can contain, at most, about 4K of data, this would mean that they have some incredible compression technology. Perhaps what they meant was that the expiration date on Google's cookie is set to 36 years. As far as "forcing" Google to change their policy, I have just one thing to say:
You toucha my Google, I breaka you face!
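For anyone who wants the cookie arithmetic spelled out, here is a quick sketch. It takes the roughly 4K per-cookie figure and the 36-year span from the comment above at face value; the constant names are made up for the example.

```python
COOKIE_LIMIT_BYTES = 4096  # the "about 4K" figure cited in the comment
YEARS = 36                 # the span mentioned in the comment

days = YEARS * 365         # ignoring leap days for a rough estimate
bytes_per_day = COOKIE_LIMIT_BYTES / days

print(days)
print(round(bytes_per_day, 2))
```

That works out to less than one byte per day, so a search history could not live in the cookie itself; the cookie would only need to carry an identifier, with any log kept on the server side.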
Google has been known to have issues due to how it ranks pages already. Say your name is ChickenButt, and people link it as that, it won't get many hits, but if you link it as two separate words, it associates those words with your site, so then you'll get hits on the site. I think they'll have more stuff in the future for ranking, but I think it works decently.
GeekWares - Buy and Download Today!
This guy sounds like an inflammatory id10t (long-term cookie my ass), but perhaps he has a subtle point.
When searching for "namebase" on Teoma, his site shows up in the top ten. Perhaps this isn't an intentional bias on Google's part, but merely a limitation of their search algorithm.
Where do these people come from? Is there an agency out there that reads the Net and says "Oops, not enough morons on this group," and then assigns some slack jawed, inbred, grit-eatin' stooge to gum up the works?
- Jim Cowling
"Understand you're having a little Jimmy Page trouble."
Don't bother with Salon's commentary on it, read Brandt's actual article:
---
Date: Tue, 27 Aug 2002 02:14:43 -0100
From: "nettime's_roving_reporter" nettime [at] bbs.thing.net
Subject: googlewatch: PageRank -- Google's Original Sin
http://www.google-watch.org/pagerank.html
PageRank: Google's Original Sin
by Daniel Brandt
August 2002
By 1998, the dot-com gold rush was in full swing. Web search engines had been around since 1995, and had been immediately touted by high-tech pundits (and Forbes magazine) as one more element in the magical mix that would make us all rich. Such innovations meant nothing less than the end of the business cycle.
But the truth of the matter, as these same pundits conceded after the crash, was that the false promise of easy riches put bottom-line pressures on companies that should have known better. One of the most successful of the earliest search engines was AltaVista, then owned by Digital Equipment Corporation. By 1998 it began to lose its way. All the pundits were talking "portals," so AltaVista tried to become a portal, and forgot to work on improving their search ranking algorithms.
Even by 1998, it was clear that too many results were being returned by the average search engine for the one or two keywords that were entered by the searcher. AltaVista offered numerous ways to zero in on specific combinations of keywords, but paid much less attention to the "ranking" problem. Ranking, or the ordering of returned results according to some criteria, was where the action should have been. Users don't want to figure out Boolean logic, and they will not be looking at more than the first twenty matches out of the thousands that might be produced by a search engine. What really matters is how useful the first page of results appears on search engine A, as opposed to the results produced by the same terms entered into engine B. AltaVista was too busy trying to be a portal to notice that this was important.
Enter Google
By early 1998, Stanford University grad students Larry Page and Sergey Brin had been playing around with a particular ranking algorithm. They presented a paper titled The Anatomy of a Large-Scale Hypertextual Web Search Engine at a World Wide Web conference. With Stanford as the assignee and Larry Page as the inventor, a patent was filed on January 9, 1998. By the time it was finally granted on September 4, 2001 (Patent No. 6,285,999), the algorithm was known as "PageRank," and Google was handling 150 million search queries per day. AltaVista continued to fade; even two changes of ownership didn't make a difference.
Google hyped PageRank, because it was a convenient buzzword that satisfied those who wondered why Google's engine did, in fact, provide better results. Even today, Google is proud of their advantage. The hype approaches the point where bloggers sometimes have to specify what they mean by "PR" -- do they mean PageRank, the algorithm, or do they mean the Public Relations that Google does so well:
PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves "important" weigh more heavily and help to make other pages "important."
Google goes on to admit that other variables are also used, in addition to PageRank, in determining the relevance of a page. While the broad outlines of these additional variables are easily discerned by webmasters who study how to improve the ranking of their websites, the actual details of all algorithms are considered trade secrets by Google, Inc. It's in Google's interest to make it as difficult as possible for webmasters to cheat on their rankings.
It's all in the ranking
Beyond any doubt, search engines have become increasingly important on the web. E-commerce is very attuned to the ranking issue, because higher ranking translates directly into more sales. Various methods have been designed by various engines to monetize the ranking situation, such as paid placement, pay per click, and pay for inclusion. On June 27, 2002, the U.S. Federal Trade Commission issued guidelines that recommended that any ranking results influenced by payment, rather than by impartial and objective relevance criteria, ought to be clearly labeled as such in the interests of consumer protection. It appears, then, that any algorithm such as PageRank, that can reasonably pretend to be objective, will remain an important aspect of web searching for the foreseeable future.
Not only have engines improved their ranking methods, but the web has grown so huge that most surfers use search engines several times a day. All portals have built-in search functions, and most of them have to rely on one of a handful of established search engines to provide results. That's because only a few engines have the capacity to "crawl" or "spider" more than two billion web pages frequently enough to keep their database current. Google is perhaps the only engine that is known for consistent, predictable crawling, and that's only been true for less than two years. It takes almost a week to cover the available web, and another week to calculate PageRank for every page. Google's main update cycle is about 28 days, which is a bit too slow for news-hungry surfers. In August, 2001 they also began a second "mini-crawl" for news sites, which are now checked every day. Results from each crawl are mingled together, giving the searcher an impression of freshness.
For the average webmaster, the mechanics of running a successful site have changed dramatically from 1996 to 2002. This is due almost entirely to the increased importance of search engines. Even though much of the dot-com hype collapsed in 2000 and 2001 (a welcome relief to noncommercial webmasters who remembered the pre-hype days), the fact remains that by now, search engines are the fundamental consideration for almost every aspect of web design and linking. It's close to a wag-the-dog situation. That's why the algorithms that search engines consider to be consistent with the FTC's idea of impartial and objective ranking criteria deserve closer scrutiny.
What objective criteria are available?
Ranking criteria fall into three broad categories. The first is link popularity, which is used by a number of search engines to some extent. Google's PageRank is the original form of "link pop," and remains its purest expression. The next category is on-page characteristics. These include font size, title, headings, anchor text, word frequency, word proximity, file name, directory name, and domain name. The last is content analysis. This generally takes the form of on-the-fly clustering of produced results into two or more categories, which allows the searcher to "drill down" into the data in a more specific manner. Each method has its place. Search engines use some combination of the first two, or they use on-page characteristics alone, or perhaps even all three methods.
Content analysis is very difficult, but also very enticing. When it works, it allows for the sort of graphical visualization of results that can give a search engine an overnight reputation for innovation and excellence. But many times it doesn't work well, because computers are not very good at natural language processing. They cannot understand the nuances within a large stack of prose from disparate sources. Also, most top engines work with dozens of languages, which makes content analysis more difficult, since each language has its own nuances. There are several search engines that have made interesting advances in content analysis and even visualization, but Google is not one of them. The most promising aspect of content analysis is that it can be used in conjunction with link pop, to rank sites within their own areas of specialization. This provides an extra dimension that addresses some of the problems of pure link popularity.
Link popularity, which is "PageRank" to Google, is by far the most significant portion of Google's ranking cocktail. While in some cases the on-page characteristics of one page can trump the superior PageRank of a competing page, it's much more common for a low PageRank to completely bury a page that has perfect on-page relevance by every conceivable measure. To put it another way, it's frequently the case that a page with both search terms in the title, and in a heading, and in numerous internal anchors, will get buried in the rankings because the sponsoring site isn't sufficiently popular, and is unable to pass sufficient PageRank to this otherwise perfectly relevant page. In December 2000, Google came out with a downloadable toolbar attachment that made it possible to see the relative PageRank of any page on the web. Even the dumbed-down resolution of this toolbar, in conjunction with studying the ranking of a page against its competition, allows for considerable insight into the role of PageRank.
Moreover, PageRank drives Google's monthly crawl, such that sites with higher PageRank get crawled earlier, faster, and deeper than sites with low PageRank. For a large site with an average-to-low PageRank, this is a major obstacle. If your pages don't get crawled, they won't get indexed. If they don't get indexed in Google, people won't know about them. If people don't know about them, then there's no point in maintaining a website. Google starts over again on every site for every 28-day cycle, so the missing pages stand an excellent chance of getting missed on the next cycle also. In short, PageRank is the soul and essence of Google, on both the all-important crawl and the all-important rankings. By 2002 Google was universally recognized as the world's most popular search engine.
How does PageRank measure up?
In the first place, Google's claim that "PageRank relies on the uniquely democratic nature of the web" must be seen for what it is, which is pure hype. In a democracy, every person has one vote. In PageRank, rich people get more votes than poor people, or, in web terms, pages with higher PageRank have their votes weighted more than the votes from lower pages. As Google explains, "Votes cast by pages that are themselves 'important' weigh more heavily and help to make other pages 'important.'" In other words, the rich get richer, and the poor hardly count at all. This is not "uniquely democratic," but rather it's uniquely tyrannical. It's corporate America's dream machine, a search engine where big business can crush the little guy. This alone makes PageRank more closely related to the "pay for placement" schemes frowned on by the Federal Trade Commission, than it is related to those "impartial and objective ranking criteria" that the FTC exempts from labeling.
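The weighted-vote mechanism described above can be sketched with the textbook PageRank iteration. This is only an illustration: the damping factor of 0.85 is the commonly published value, the seven-page link graph and all page names are invented, and Google's production ranking is not public.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power iteration; `links` maps each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # Each outlink carries an equal slice of the linking page's own
                # rank, so a vote from a high-ranked page is worth more.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its vote evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical graph: "favored" and "neglected" each receive exactly one link,
# but "favored" is linked from a popular hub and "neglected" from an obscure page.
links = {
    "fan1": ["hub"],
    "fan2": ["hub"],
    "fan3": ["hub"],
    "hub": ["favored"],
    "obscure": ["neglected"],
    "favored": [],
    "neglected": [],
}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{page:10s} {ranks[page]:.4f}")
```

In this toy graph both target pages get one "vote" each, yet the page linked from the hub ends up with the higher score, which is the effect the essay objects to.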
Secondly, only big guys can have big databases. If your site has an average PageRank, don't even bother making your database available to Google's crawlers, because they most likely won't crawl all of it. This is important for any site that has more than a few thousand pages, and a home page PageRank of about five or less on the toolbar's crude scale.
Thirdly, in order for Google to access the links to crawl a deep site of thousands of pages, a hierarchical system of doorway pages is needed so that the crawler can start at the top and work its way down. A single site with thousands of pages typically has all external links coming into the home page, and few or none coming into deep pages. The home page PageRank therefore gets distributed to the deep pages by virtue of the hierarchical internal linking structure. But by the time the crawler gets to the real "meat" at the bottom of the tree, these pages frequently end up with a PageRank of zero. This zero is devastating for the ranking of that page, even assuming that Google's crawler gets to it, and it ends up in the index, and it has excellent on-page characteristics. The bottom line is that only big, popular sites can put their databases on the web and expect Google to cover their data adequately. And that's true even for websites that had their data on the web long before Google started up in 1999.
What about non-database sites?
There are other areas where PageRank has a negative effect, even for sites without a lot of data. The nature of PageRank is so discriminatory, that it's rather like the exact opposite of affirmative action. While many see affirmative action as reverse discrimination, no one would claim (apart from economists who advocate more tax cuts for the rich) that the opposite, which would be deliberate discrimination in favor of the already-privileged, is a solution for anything. Yet this is essentially what Google claims.
Those who launch new websites in 2002 have a much more difficult time getting traffic to their sites than they did before Google became dominant. The first step for a new site is to get listed in the Open Directory Project. This is used by Google to seed the crawl every month. But even after a year of trying to coax links to your new site from other established sites, the new webmaster can expect fewer than 30 visitors per day. Sites with a respectable PageRank, on the other hand, get tens of thousands of visitors per day. That's the scale of things on the web -- a scale that is best expressed by the fact that Google's zero-to-ten toolbar is a logarithmic scale, perhaps with a base of six. To go from an old PageRank of four to a new rank of five requires several times more incoming links. This is not easy to achieve. The cure for cancer might already be on the web somewhere, but if it's on a new site, you won't find it.
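The arithmetic behind the logarithmic toolbar scale can be sketched as follows. The base of six is the essay's own guess, not a published figure, and the "raw score" and both function names are hypothetical stand-ins for whatever quantity the toolbar actually compresses.

```python
ASSUMED_BASE = 6  # the essay's guess ("perhaps with a base of six")

def toolbar_value(raw_score, base=ASSUMED_BASE, top=10):
    """Compress a hypothetical raw score onto the zero-to-ten toolbar scale."""
    value = 0
    threshold = base
    # Integer arithmetic avoids floating-point surprises at exact powers.
    while value < top and raw_score >= threshold:
        value += 1
        threshold *= base
    return value

def raw_score_needed(value, base=ASSUMED_BASE):
    """Smallest raw score that reaches a given toolbar value."""
    return base ** value

for step in range(4, 7):
    print(step, raw_score_needed(step))
```

Under that assumed base, a page needs a raw score of 1,296 to show a four and 7,776 to show a five, so each single step on the toolbar costs six times as much as the one before it.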
PageRank also encourages webmasters to change their linking patterns. On search engine optimization forums, webmasters even discuss charging for little ads with links, according to the PageRank they've achieved for their site. This would benefit those sites with a lower PageRank that pay for such ads. Sometimes these PageRank achievements are the result of link farms or other shady practices, which Google tries to detect and then penalizes with a PageRank of zero. At other times professional optimizers get away with spammy techniques. Mirror sites and duplicate pages on other domains are now forbidden by Google and swiftly punished, even when there are good reasons for maintaining such sites. Overall, linking patterns have changed significantly because of Google. Many webmasters are stingy about giving out links (which can dilute your transference of PageRank to a given site), at the same time that they're desperate for more links from others.
What should Google do?
We feel that PageRank has run its course. Google doesn't have to abandon it entirely, but they should de-emphasize it. The first step is to stop reporting PageRank on the toolbar. This would mute the awareness of PageRank among optimizers and webmasters, and remove some of the bizarre effects that such awareness has engendered. The next step would be to replace all mention of PageRank in their own public relations documentation, in favor of general phrases about how link popularity is one factor among many in their ranking algorithms. And Google should adjust the balance between their various algorithms so that excellent on-page characteristics are not completely cancelled by low link popularity.
PageRank must be streamlined so that the "tyranny of the rich" characteristics are scaled down in favor of a more egalitarian approach to link popularity. This would greatly simplify the complex and recursive calculations that are now required to rank two billion web pages, which must be very expensive for Google. The crawl must not be PageRank driven. There should be a way for Google to arrange the crawl so that if a site cannot be fully covered in one cycle, Google's crawlers can pick up where they left off on the next cycle.
Google is so important to the web these days, that it probably ought to be a public utility. Regulatory interest from agencies such as the FTC is entirely appropriate, but we feel that the FTC addressed only the most blatant abuses among search engines. Google, which only recently began using sponsored links and ad boxes, was not even an object of concern to the Ralph Nader group, Commercial Alert, that complained to the FTC.
This was a mistake, because Commercial Alert failed to look closely enough at PageRank. Some aspects of PageRank, as presently implemented by Google, are nearly as pernicious as pay for placement. There is no question that the FTC should regulate advertising agencies that parade as search engines, in the interests of protecting consumers. Google is still a search engine, but not by much. They can remain a search engine only by fixing PageRank's worst features.
_________________
Daniel Brandt is founder and president of Public Information Research, Inc., a tax-exempt public charity that sponsors NameBase. He began compiling NameBase in 1982, from material that he started collecting in 1974, and is now the programmer and webmaster for PIR's several sites. He participates in various forums where webmasters share observations about the often-secretive algorithms, bugs, and behavior of various search engines. Brandt has been watching Google's interaction with NameBase ever since Google, in October, 2000, became the first search engine to go "deep" on PIR's main site by crawling thousands of dynamic pages.
Google Watch
distributed via nettime: no commercial use without permission
nettime is a moderated mailing list for net criticism,
collaborative text filtering and cultural politics of the nets
more info: majordomo [at] bbs.thing.net and "info nettime-l" in the msg
body
archive: http://www.nettime.org
contact: nettime [at] bbs.thing.net
but, why is this news? how is this news for nerds? I don't care that some guy is mad about his site ranking.. I mean who really cares, why doesn't he advertise with the adwords select program and get people that way.. that is why google has that.. sheesh.. this is probably one of those guys who started flaming everyone from his AOL account within a few minutes after AOL unleashed their legions of morons on usenet, back when you could actually find a conversation on there..
anime+manga together at last.. in real time.
A fine illustration of the nature of contemporary journalism, at least how it's practiced far too often. Make it flash, mention two-headed calves, and put in a picture of Elvis so people will buy it.
Salon: What happens when USAToday and The National Enquirer have a child.
Perhaps the more that writing like this gets attention, the more power Weblogs will have. After all, they're democratically-propagated, aren't they?
http://www.google.com/technology/pigeonrank.
Sean.OutaHere()
Google's privacy page says:
"Google notes and saves information such as time of day, browser type, browser language, and IP address with each query."
They also save search terms with each query.
Some of you seem to think that it all has to be saved in the cookie. No, all you need in the cookie is a unique ID number. Then all this information is saved on Google's end under your ID number.
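A minimal sketch of that arrangement, with everything in it invented for illustration: the class, method names, and sample queries are hypothetical and say nothing about how Google actually stores its logs. The recorded fields follow the privacy-page quote above (time and IP address) plus the search terms.

```python
import uuid

class SearchLog:
    """Toy server-side store: the browser's cookie holds only an opaque ID."""

    def __init__(self):
        self._history = {}

    def handle_query(self, cookie_id, terms, ip, timestamp):
        # First visit: mint an ID that the browser would send back on later requests.
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex
        record = {"terms": terms, "ip": ip, "time": timestamp}
        self._history.setdefault(cookie_id, []).append(record)
        return cookie_id

    def history_for(self, cookie_id):
        # Everything interesting lives here, on the server, keyed by the ID.
        return list(self._history.get(cookie_id, []))

log = SearchLog()
visitor = log.handle_query(None, "squirrel gestation period", "192.0.2.10", "2002-08-30T09:00")
log.handle_query(visitor, "free books", "192.0.2.77", "2002-08-31T21:15")
history = log.history_for(visitor)
print(visitor, [entry["terms"] for entry in history])
```

The cookie never contains a search term, yet both queries end up tied together on the server even though the second one came from a different IP address.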
I don't know how many times I've heard someone say, "But cookies are harmless! All they have is this little number!" Simply amazing.
The top rankings on Google are determined by popularity. The more popular something is for a given search term, the higher up it appears. Since this guy is more or less a whiny crackpot, his crap doesn't get ranked very high. He's just bitchy because everyone else thinks his stuff is worthless, therefore he doesn't get a good rank.
If you search for free books on Google you'll find Project Gutenberg right at the TOP. You don't even have to put free books in quotes.
According to google the site only has about 28 occurrences of ebook and 11 of ebooks.
So ebook is wrong term to use to look for the Gutenberg site, according to the Gutenberg site itself.
And it seems increasingly that ebooks aren't the same as free books.
Forget about removing the tinfoil hats; chances are his PageRank is going up a bunch right now with all this press coverage he's been getting. What's the PageRank at /.? Salon? Ahh, irony.
...now he's got links from Salon and /.
I wonder if he'll keep the ideological opposition up if he were to get into the top 10?
Google and Slashdot: ranked and spanked.
CEE5210S The signal SIGHUP was received.
I went to www.pir.org. I don't get it at all.
WTF is it about? WTF _IS_ it? You don't have a FAQ, so I had to ask.
Vortran out
Knowledge is like ignorance.. too much can be just as bad as not enough.
I bought a new PC game very soon after its release. As I'm not whoring for more pagerank or traffic, let's call it "back to fortress frankenstein" (BtFF). BtFF contains secrets areas, and I was surprised to discover that no website existed (in the google universe) that covered this topic. So, as I played through the game, I kept an ongoing "secrets of BtFF" page. Once it was reasonably mature, I mailed its URL to the few websites that were relevant (about five) Some linked to it, some didn't. I've done _zero_ "marketing" of the page since.
Only a few days after this the page started to show up as the top ranked site for google query "BtFF secrets". It stayed that way for months - even now (nine months later) it's still number two. So, for this very specific topic, it beats out all but one of the BtFF specific "portal" sites, fansites, etc.
So, I think this shows that google's pagerank is _supremely_ democratic, providing:
So, Salon's whiny guy is complaining that his page isn't ranked relevant, but then he breaks at least two of the simple rules, above, namely that his site is diffuse rather than focussed, and its structure is deep too. As previous posters said, boo hoo.
## W.Finlay McWalter ## http://www.mcwalter.org ##
The man is simply a kook. There's nothing else that needs to be said. I don't think Salon really needed to give him even a hint of legitimacy by doing a story about him, and I think Slashdot could have done a lot better than featuring the story.
Frankly, I'm surprised there hasn't been any Slashdot posting of another "article" featured on Salon's tech page: bOing bOing co-editor Cory Doctorow's 0wnz0red short story. It's a wonderful little gem in kind of a Stephensonian vein, sprinkled with the kind of terms and jargon that a Slashdot code-head could appreciate. Seems like it'd be a much better use of time than checking out Mr. Anti-Google.
Editor Emeritus and Senior Writer, TeleRead.org
To be anti-google in this day and age is like being anti-rice-crispy-treats. Google is tasty, easy and fun for the whole family. Just yesterday I saw two squirrels f***ing outside my window. I wasn't sure it was the right time of year for that sort of activity, so naturally my cube-mate asked Google about the gestation period for squirrels. Of course, Google knew. Google knows everything. Google has surpassed its creators' intentions and has become the most intelligent lifeform in the universe. No one dares to unplug it for fear of waking its wrath. Fortunately, it appears to be benevolent.
It's really a shame that this guy has gotten enough attention to become the official anti-google person. Since you'd obviously have to be a total ego-centric nutcase to think you know better than Google, couldn't we at least have one of the really humorous cranks for this job?
And since when is the existence of whackos on the Internet news?
... "Give me a woman who loves beer and I will conquer the w
-
It's a search engine. You find info by typing names into a form. There are no obvious links to the content. How's that supposed to get spidered?
-
His search engine is overloaded right now and just returns error messages. Maybe that's what Google sees.
-
The good data is by subscription only:
"And ask your library or student government to subscribe to NameBase ($200 for two years of unrestricted access from any campus computer) so that we can continue to add names, and you can continue to find them."
-
<meta NAME="GOOGLEBOT" CONTENT="NOARCHIVE"> can't be helping.
-
This guy is very picky about who gets to spider him. Here's his "robots.txt" file:
-
He uses one-pixel GIFs to trap spiders. He also uses cookies and web bugs, providing a long-winded explanation of why what he does is OK, but what Google does is evil.
In conclusion, this guy created his own problem.
I run three web sites. Each is at the top of the Google rankings for its obvious keywords, and I've done nothing whatsoever to make that happen. I just have useful content that people like.
This whiner and his lameass site nobody wants to see gets slashdotted, while my little 200-visitor-per-day site doesn't.
Maybe I can get Salon to do an expose on why slashdot won't slashdot me!
At least my traffic counts go up in May when everybody searches for "E3" and winds up at "Dopey Smurf's Guide to E3 Booth Babes"
Mostly, I think, he is trying to call for a revision to Google to develop more 'pagerank' innovations that trace and find the smaller, less popularized but still just as valuable sites (how many good books sit unread in a book store because few know of them while the 'name' brands are always being read?), instead of basing the rank primarily on link-to-link votes. Which, I honestly think, is one of the smartest ideas, and one that really is necessary to make the web more cohesive and pertinent: it would make the web more 'useful' by defining what is valuable with more precise depth than simply a catalogued popularity index (there are more factors, but that IS how google 'looks' to operate if you use it).
But I can't believe any human being on here who understands computers (even remotely) and understands authority (even 4th party) could stand for the google cookies that last for 50 and 60 more years! What kind of unnecessary bullshit is that!?!
(let me also apologize for the disorderly and poor linguistic quality of this post, there are numerous reasons and the day is already too long.)
I've skimmed plenty of the below comments and they all seem to agree that this anti-google guy is a goofball.
If you read the Salon article with a critical eye, you'll see an article slamming someone who actually made a fairly logical and reasonably thought-out complaint about the PageRank system, carefully interspersing comments about his counterculture past with his simple belief that the "democratic" nature of PageRank isn't democratic at all. Wink-of-the-eye comments like "(using the royal "we")" make it very clear what the bias of the author is: Disparage this guy no matter what. They went so far as to make claims on behalf of him (which I can't see in his article), such as "In Brandt's ideal world, if you searched for "United Airlines," you would see untied.com -- a site critical of United -- before you see United's page. And if you searched for Rumsfeld, you'd see NameBase's dossier on him before the Defense Department's site on the "The Honorable Donald Rumsfeld."" Funny, but I don't see that in his paper; instead that appears to be Salon making some rhetorical exaggerations to push his opinion to extremes.
This whole bizarre Salon article and the followup Slashdot postings seem like a horrible, reprehensible character assassination because someone said something that someone else didn't like (is it too late, and Google has gotten too powerful :-}).
The bizarrest thing is how quickly everyone hopped on the bandwagon to slam this "kook", all based upon the carefully manipulative wording of a Salon article. It is especially disconcerting given that this is the type of guy (questioning "the establishment") that the Slashdot crowd usually hoists on their shoulders and casts as their hero. This Salon article is DESPICABLE, and the methods that the author uses to villainize this guy are a study in evasive techniques (Google's cookie and search tracking doesn't matter, you see, because there are sites that are worse).
Everything you wrote is a crock. I was going to quote the ridiculous parts but I would have ended up including your whole post.
You are confusing criticism with perfect information in the marketplace. Criticism is of an opinion, information is just information. When consumers have perfect information, they make market decisions based on correct assessments of cost and benefit. Criticism does not necessarily provide correct information and can cause consumers to incorrectly calculate costs and/or benefits to using a service.
If Honda starts making a car that almost everyone buys, should I criticize it in order to keep Honda honest, not "abuse" the consumers, and promote a "civil" society? IF there is something wrong with the car that is not known, bringing that info to light is beneficial to the consumer. But once consumers learn the new info, and then continue to buy the car, what, should we continue to bash Honda until their sales go down?
Google can do whatever they want within legal means. They can't make you do anything, and you can't make them do anything. What "corruption" could Google be doing? Do you not realize that every site you visit could track you? Do you not realize that your personal information is bought and sold all the time? Yet you keep consuming goods and services from companies that you know might be doing this. Why the assumption that Google and other corporations must be kept honest by criticism?
This is how the free market and freedom of speech work. You can always choose from whom you are going to purchase goods and services, and you can choose who you are going to listen to. But "criticism" is not necessary for a free market, or for keeping companies honest.
I'll bet you never took an economics class, or if you did, I'll bet you didn't pass it.
I tried Opera years ago (version 3.x). Cute, I thought, but awkward and not much to recommend it.
Yesterday, I installed Opera 6.05. KAZOWEE! OPERA GOOD, OPERA VERY VERY GOOD. Good riddance Internet Exploder. Farewell Netscape. I've found my browser!
>your site is good enough, people will link to
>it, regardless if it's listed on Google or not.
I want a SEARCH ENGINE,..not a "Let's hope this site is popular engine".
Try to remember why people use search engines in the first place.
z
Meanwhile, entering "Michael Jordan" in vivisimo gets me: sports, NBA, Posters, Pictures, Bulls-Chicago, Air, Tribute, Space Jam, Shoes... the list goes on and on, but no computer science. Even using "NOT basketball" still brings up basketball references exclusively.
Putting "computer scientists" in vivisimo, I get: research, engineers, interest, American, mathematics, study, issue, history, memory, life, and "more." None of these give me any indication that they hold information about anyone named Michael Jordan. Even clicking a few levels deeper on the directories didn't do it.
So the score is: Google getting it on the first try vs. vivisimo never finding it. We should all be "doomed" like that.
"Hardly used" will not fetch you a better price for your brain.
"In Brandt's ideal world, if you searched for "United Airlines," you would see untied.com -- a site critical of United -- before you see United's page. And if you searched for Rumsfeld, you'd see NameBase's dossier on him before the Defense Department's site on the "The Honorable Donald Rumsfeld.""
To me, that doesn't seem like 'democratic'. It seems more like a 'whatever fights the establishment!' engine.
I like the fact that Google doesn't do that - if I'm searching for "United Airlines", there's a damn good chance I wanted to find United's website. If I search for "United Airlines Bad Experiences", then I'd want untied.com.
Brandt just wants a search engine that everyone uses, but censors its results according to his political philosophy. He's just as much of a fascist as he's trying to say they are.
-T
Google Rep responds
http://www.webmasterworld.com/forum3/5120.htm
Looks like Google has had *lots* of dialogue with the guy...
This article, linked from the RKBA site shows that Google chooses not to do business with the "gun culture."
http://www.bowmansbrigade.com/google1.htm
Not that I am a member of the "gun culture."
I just believe that I have the right to keep and BEAR my arms on my person without the permission of the municipal corporations that masquerade as governments.
Google refuses to contract with gun stores to be "featured" sites. This limits the liberty of all Americans.
Liberty is not a concept... Liberty is a way of life!!!
slashdotted already, no wonder google doesn't list him.
Gates' Law: Every 18 months, the speed of software halves.
I've been very enthusiastic about Google but I've found that I miss the real good and special links.
Can anyone please recommend me some better search engine where one can find the really valuable and unique links!
Regards,
Per
I don't get it - you go on at great lengths about what google should do; about how bad Pagerank is, and how it should be fixed. But you don't say why you're not doing it yourself.
Contrary to your essay, I (and I think many /.'ers) think that Pagerank works, and works very well. If you believe otherwise, why don't you simply go ahead and prove it?
Google became what it is because it saw an unfilled niche, and filled it. They "built a better mousetrap", and the world did indeed beat a path to their door. There is nothing stopping you from doing the same. If you're half as smart as you seem to think you are, you should have no problem implementing a search engine, and becoming as successful as Google is now.
Google is NOT a public utility, nor is it any form of monopoly. It needs to be regulated just as much as YOUR site does.
Unlike so many other companies, Google got where it is today solely on the merits of its technology. It didn't succeed by pumping millions of dollars into marketing, it didn't succeed by using underhanded business tactics to squash its competitors. All it did was make the best product.
Excuse my cynicism, but consider this: If you want a high Google ranking you, 1) Start a web site called google-watch and publicize how bad Google is. 2) Get the media (Salon, /., etc) to write web articles about your "crusade" with lots of links to your sites.
3) Get lots of hits from those links, and lots of Google searches from people who want to see for themselves.
4) Get a real high Google rank.
Yes, I see now! There is a conspiracy here.
Honestly, I never even noticed the ranking
system until it was just mentioned here.
I don't pay attention to them anyway since
I decide for myself which websites are the
best for me.
Though this guy is clearly a case of very very sour grapes, he does have a point. I'm not saying that his crappy page should be ranked higher on Google. As people who've been to his page and searched for Richard Nixon have noted, his page doesn't even display relevant information about Richard Nixon.
Some interesting points, however, are that Google's page ranking system will discriminate against newer websites, and will favor commercial websites over non-commercial ones. Regarding discrimination against new sites, this partially makes sense. Why should a site just created yesterday have equal footing with ones that have been around for years? Sites should have to be proven.
However, there is a rather unfortunate catch-22 here, in that in order for a site to be proven as an "important site" to Google, it must be seen by searchers and linked to. The odds of that are slim if a site is ranked lowly. This means that if a new site comes up which is superior to /. for technical news, it will probably take a long time for it to be ranked above /. on Google, even though it's superior. Why? Because people can't link to what people can't see.
As a solution for this catch-22, I propose that Google have two additional "shaded boxes" underneath the "sponsors" boxes: one for random sites, and another one for "up and coming sites". This allows sites which are up and coming to climb to their rightful place, and gives sites a chance to be recognized.
Furthermore, I suggest that Google's ranking system be revised. Ranking pages partially by link-to's is a good idea. But ultimately, the best thing is to rank based on user opinion, which means sites ranked higher by Google's users would show up earlier in searches.
As for commercial sites being ranked higher than non-commercial ones, I think that's where Google's link-based ranking results are flawed. Corporate sites will have more links to them by "more important" sites than Non-Corporate sites, even if they aren't as good. This is a problem that needs to be solved.
As for this particular whiner, it's obvious he just has a case of sour grapes. He wants his website on Rumpsfeld to come up before the official government website? Please. Earth to whiner, earth to whiner: your site isn't that good.
social sciences can never use experience to verify their statemen
No, comments aren't really the major part of slashdot.
Did you read Rob Malda's comments about the subscription system? Most readers of this site never even read comments, let alone make them.
Nuf' said...
Google wraps all links with a javascript function and a redirect script so they know exactly which links you are clicking, how long you spend on a results page, etc.
http://www.google.com/url?sa=U&start=1&q=http://www.junkbusters.com/&e=42
Is an example of the target they insert in every result page link.
This regex:
s/<a.*href.*http:\/\/www\.google\.ca\/.*\&q=(.*)\&e=.*>/$1/ig
in privoxy will strip them from the page. I prefer to not have my browsing so closely scrutinized, although I do not strip their text advertising (which I find useful and non-offensive).
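For anyone not running privoxy, the same unwrapping can be sketched in Python with the standard library's URL parser. This is an alternative to the regex above, not the poster's filter. It assumes the example link reads http://www.google.com/url?sa=U&start=1&q=http://www.junkbusters.com/&e=42 and that the destination always travels in the `q` parameter; the function name is made up.

```python
from urllib.parse import urlsplit, parse_qs

def unwrap_redirect(href):
    """Return the destination carried in the `q` parameter, or href unchanged."""
    parts = urlsplit(href)
    if parts.path != "/url":
        # Not a redirect-style link, so leave it alone.
        return href
    targets = parse_qs(parts.query).get("q")
    return targets[0] if targets else href

wrapped = "http://www.google.com/url?sa=U&start=1&q=http://www.junkbusters.com/&e=42"
direct = unwrap_redirect(wrapped)
print(direct)
```

Parsing the query string rather than pattern-matching the whole anchor tag avoids depending on the order of the parameters or on the rest of the HTML.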
--
Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
You can store cookies anywhere you like, it's not needed to store them client side.
Sure, computer (IP) specific stuff maybe... but the interesting part of the data is probably stored serverside (database for example).
Google is about indexing and analyzing data. The way and content of people's searches are relevant of course, and therefore probably stored at some point. Logical.
The text files stored on your computer aren't that important.
...a fact which for the sake of a quiet life most people tend to ignore ~H2G2
this guy's site is down so i like google better than his site
namebase.org is now slashdotted. I would be able to view it via the Google cache, but some brilliant webmaster specified that the Googlebot should not archive the site.
Thanks, Daniel Brandt! You've prevented me from reading your own site!
- SMJ - (It's not just a name: it's a bad aftertaste.)
I do ok competing against the book pages of the New York Times, etc. (though I'm currently behind the Boston Globe on a search for "book reviews").
Danny.
I have written over 900 book reviews
It had to be said - this guy's just a pathetic whiner.
And I agree the parent shoulda been rated "funny", not "informative"...
Installed the Bubblemon yet?
That way they can determine exactly which links I have followed. In combination with the cookies this is quite a lot of personal information.
Does Google have any privacy policy published?
They're no longer unblocked . .
hawk
Since when is this a right wing behavior? Does the left really use it any less than the right?
For that matter, those of us on the classic liberal up use it as well (especially the extremists known as "libertarian") . .
For that matter, it's tough to find anyone who used it as much as the largely defunct down (Bolsheviks, etc.) . .
hawk
>preferences such as their search language,
>SafeSearch settings, the number of results per
>page, etc.
Yes, they *could*. Realizing that, I let theirs through. With or without proxies, on netscape and mozilla, I have yet to see a preference saved . .
hawk