Traversing the "Googlearchy"

Self-reenforcing cycle? by lkypnk · 2006-08-17 10:40 · Score: 3, Insightful

I've got to say no to this. Yes, when you search for something, you get the most popular results. But not everyone uses the same search terms, and even if you only go for the first three pages of results, you've still got 20 - 30 different sources of information, each different but similar query returning a slightly different set.

Re:Self-reenforcing cycle? by DerekLyons · 2006-08-17 11:16 · Score: 4, Insightful

I've got to say no to this.

And your evidence and study showing the researchers are wrong is... what?

Yes, when you search for something, you get the most popular results. But not everyone uses the same search terms,

Actually, if you've ever watched those 'live search' services (i.e., ones showing in real time the search terms users are entering), you'll see the same terms pop up again and again. Equally, for most search items - there simply are not that many (properly spelled) variants. (E.g., for the Seattle Mariners - there's pretty much only one way to type that.)

and even if you only go for the first three pages of results, you've still got 20 - 30 different sources of information, each different but similar query returning a slightly different set.

Many studies have found that the first page is what it's all about - what's on page 4 might as well not even exist. (There's a reason why SEOs exist, you know.)

In essence - your claim that the researchers in TFA are wrong is based on smoke and mirrors.
Re:Self-reenforcing cycle? by 70Bang · 2006-08-17 12:42 · Score: 4, Informative

It has nothing to do with what the search engines do or provide per se. To a certain extent, search engines aren't always needed any more, particularly when it comes to popular sites, specific URIs, etc. The reason (IMO)?

Word of "mouth". Actually, email messages[1] are sending names of services or specific URIs for a particular site (e.g., something particularly funny on YouTube) and people are pointing their browser in that direction, then exploring what else is there. If there are URIs to other locations on the web, people follow those. One of the local affiliates in Indy played a considerable portion of this last night and made sure everyone knew there was a link on their web site. Lots of people likely pointed their browsers and YouTube had a lot of extra traffic[2]. On the YouTube page is Explore other videos. Lots of information conveyed, but no search engine activity in the process.
The web has enough toys^w services which people regularly visit (e.g., blogs, YouTube) that they don't necessarily need search engines unless something isn't found via the normal means. And normal now includes the various discussion forums where people provide the advice from the voice of context. IMDb.com has a professional side (reasonably priced paid service) where people who are in the biz can post things they're looking for or are available for. A couple of nights ago, someone was asking about the best software for scriptwriting on a small budget. About eight people chimed in with what they knew about different packages, including a couple of free ones, a commercial one for $25, a template which can be downloaded for MS Word, and some of the pros & cons about the ones they'd used. Where will you find ad hoc information in that context on demand in a search engine?
__________________________________

[1] Unless you're in the media and use "emails" as a noun.

[2] Several years ago, I had a client who helped small to medium newspapers get online. Someone built a web site for them (taking six months, #include files nested six deep, every call to the server required 20,000 lines of code to be processed, regardless of the function involved). Once more than twenty people hit a site, the server showed you its impression of the La Brea tar pits. One site for a reasonably small city, perhaps a few thousand people, had a sheriff's deputy arrested for pedophilia, a ten-car pileup on the nearby interstate, and news that the workforce of the largest employer (a substantial percentage of the citizenry) was going to be dismissed. All of this hit CNN with a reference to their newspaper's web site. That's about the time Chernobyl and Three Mile Island happened at the same time. Fortunately smarter people are starting to anticipate resource issues a little better than they used to.
Re:Self-reenforcing cycle? by cheese-cube · 2006-08-17 12:45 · Score: 1

not everyone uses the same search terms
No, no they don't.
Re:Self-reenforcing cycle? by Anonymous Coward · 2006-08-17 18:10 · Score: 0

But on the other hand, even if said googlearchy does exist, this means that the tried-and-tested sites are going to get high results, which really is the idea...separate the useful from the useless, which, by the way, is a large chunk of the internet.

Which means we should all be using wikis so that any new howtos, etc, which end up superseding tried-and-tested ones are just there at the existing URL, as opposed to being a new search result.
Re:Self-reenforcing cycle? by UnHolier+than+ever · 2006-08-17 20:12 · Score: 2, Interesting

for the Seattle Mariners - there's pretty much only one way to type that.

Yeah, right, Ciatel cannot be written any other way. Like every other word of the english language, Siahtel is the perfect example of a uniquely constructed word. Whether you live in See-attel or do not live in Sea-atle, the correct spelling of Seateul is obvious.
Re:Self-reenforcing cycle? by Bob+Uhl · 2006-08-18 02:34 · Score: 1

...for the Seattle Mariners - there's pretty much only one way to type that.

You grossly over-estimate the intelligence of the average sports fan. No doubt that's the least common spelling. I daresay the most common is something along the lines of 'Seeaddle Mayrnurs.'
Re:Self-reenforcing cycle? by fyngyrz · 2006-08-18 03:23 · Score: 2, Informative

This survey's results make a great deal of sense to me. We write and market an extensive graphics package that in many ways surpasses Photoshop in capabilities. What we don't do is "market" it in the traditional sense (though you can find our zero-dollar "footprints", for instance my sig here, all over the web one way or another). Adobe spends an amazing amount of money talking about Photoshop, compared to our zero-dollar approach. We used to market like they did, back when we wrote Amiga graphics software. Sure, we sold a lot of product, but we spent so much on marketing that the actual gains after accounting for all the thrashing around with ads, reviewers, shows, distributors, dealers, packaging, brochures, mailings and so on weren't all that impressive. Classical marketing is expensive!
These days, we only sell direct and we have significant amounts of detail on the web (about 70 megabytes of docs, images and animations); the search engines, particularly Google, do a great job of finding us when people search out the types of graphics-specific things our software does. We're astonishingly successful for the type of company we are today; I have absolutely no reason to complain, nor does anyone else who stuck it out with me all these years. We've been marketing online since well before the advent of the web; we started doing it on CompuServe no less, and the more we did that, the better results we got. When the web came, our approach, which is put everything you can online and then some, began to work for us much better, the web being a richer environment than CIS ever was; and when search engines began to get reasonably smart, that pretty much taught us to kill our standard marketing. I don't regret it one bit (and by the way, I own the company outright, so that is the opinion of the company. :)
By comparison, every once in a while we get some press, though again, we don't actively seek it out. When that happens today, we see small spikes in sales. In the past, say, in the late 1980's, serious press (like a review) would make a huge difference in sales for a month or so. My impression is that the impact of self-described "news" outlets has dropped in a big way since I began writing and selling software back in the late 1970's. I'd say they're essentially in the "don't count for much" category today, though I'm reasonably sure they'd offer a different opinion. :)
We do all of our business as a consequence of word of mouth and search engines. By all, I mean to say in the high 90th percentile. Not that we discourage anyone from reviewing, far from it... it's just that when you've got something specific, something technical, it's pretty much as the report says: people can find you, and they will. Another thing that helps is having truly unique content and capabilities; for instance if someone needs nondestructive geometric image manipulation, or morphing, they're going to find us, and quickly. If a site has "me-too" content that is duplicated in large part all over the web, I don't imagine the search engines would be all that useful, because now you're into trying to trick your site uprank with semantics, and while that may work for a while, the search engines are always mutating how they rank things and if you don't spend a heck of a lot of time on it, I could see any such effort sinking beneath the waves, as it were, within a relatively short time. We don't spend any time manipulating our site's contents. We just write about what we offer, in as much detail as we can think of. If we think of something new, or need to make a correction, we certainly do that, but it isn't possible (or even reasonable) to try and fudge a site as large as ours based on today's particular version of how Google is going to rank things.
All in all, the results these researchers got fit in with my mental image of what the web "does" quite well. People who know what they are

--
I've fallen off your lawn, and I can't get up.
Re:Self-reenforcing cycle? by greenrd · 2006-08-18 06:43 · Score: 1

[1] Unless you're in the media and use "emails" as a noun.

Emails is a noun. The 1980s called, they want their grammar back.

--
Female Prison Rape in NY

What? by MyLongNickName · 2006-08-17 10:40 · Score: 0

Only 0.8? Roland will have to post an additional 25% more "stories" to get his blog rank up.

--
See my journal for slashdot ID's by year. Mine created in 2005. http://slashdot.org/journal/289875/slashdot-ids-by-year

Re:What? by Anonymous Coward · 2006-08-17 20:41 · Score: 0

^ Hihi.. funny post ^

direct by TheSHAD0W · 2006-08-17 10:40 · Score: 3, Insightful

It means people are finding what they're looking for more directly, rather than having to gad around. This is a good thing.

Factor by Anonymous Coward · 2006-08-17 10:41 · Score: 0

each inbound link only increases traffic by a factor of 0.8

So ten links would increase traffic by a factor of 0.8^10, or about 0.1. Doesn't anyone's maths education cause them to develop innumeracy filters any more? Could it possibly be that it's supposed to be "1.8"????
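
The arithmetic here is easy to check. A minimal sketch, assuming the "factor" is read literally as a multiplier applied once per inbound link (the function name `compound` is made up for illustration and does not come from TFA):

```python
def compound(factor: float, links: int, start: float = 1.0) -> float:
    """Traffic after applying the same multiplier once per inbound link."""
    return start * factor ** links


# Literal reading: each link multiplies traffic by 0.8, so traffic shrinks.
shrinking = compound(0.8, 10)

# Alternative reading: each link adds 80%, i.e. multiplies traffic by 1.8.
growing = compound(1.8, 10)

print(f"0.8^10 = {shrinking:.3f}")
print(f"1.8^10 = {growing:.1f}")
```

Under the first reading ten links leave you with roughly a tenth of your traffic; under the second they multiply it a few hundred times over. Neither looks like what the researchers could have meant, which suggests "factor" is not a per-link multiplier at all.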

Re:Factor by Shaper_pmp · 2006-08-17 22:05 · Score: 1

increases traffic by a factor of 0.8

Maybe everyone's english education causes them to develop language-parsing filters instead? ;-p

It's technically phrased incorrectly, but the meaning is still clear from what they wrote. Your interpretation would mean traffic actually decreased, which is flatly contradicted by the statement.

TBH, the real problem I have is the idea that every additional inbound link could increase traffic by a constant factor. Isn't it saying that if I've got 100 inbound links and 100 users/day, getting one more link would get me an extra 80 users/day?

I think they meant that "the increase in traffic from each link was only 80% of what they expected from a linear relationship", not that "each inbound link increases traffic by 80%".

--
Everything in moderation, including moderation itself
Re:Factor by hauntingthunder · 2006-08-18 01:00 · Score: 1

Well, as a practising SEO: not all links are equal. A link from Slashdot or the BBC is worth way more than a link from my blog.

--
You will never get to heaven with an Ak 47... But A Zu 30 is good for Low Flying Cherubim

i can see that by User+956 · 2006-08-17 10:45 · Score: 4, Interesting

In the end, it appears that each inbound link only increases traffic by a factor of 0.8. The results suggest that the reliance of web users on search engines is actually suppressing the impact of popularity.

I can agree with that. I've seen users type "yahoo.com" into the search bar in Firefox... which goes to the Google search results page, where they then click on the "Yahoo!" link. It's almost as if users are conditioned to use "search" as their first action, regardless of whether they can remember the domain or not.

--
The theory of relativity doesn't work right in Arkansas.

Re:i can see that by fossa · 2006-08-17 10:52 · Score: 2, Insightful

I do it too. What's more difficult, an extra click, or a decision on which box to type in? And there are other cases where my cursor is in the search box, so clicking to the URL box and then typing is the same as typing and then clicking the search result...
Re:i can see that by XorNand · 2006-08-17 11:27 · Score: 3, Interesting

What you describe is actually *very* common for novice 'net users. In fact, I might say that more of them do it than don't. Check out AOL's recently released search data. Just randomly check out various users' search histories. It would be interesting to see how this correlates to the frequency of Google users doing the same thing.

--
Entrepreneur : (noun), French for "unemployed"
Re:i can see that by doobystew · 2006-08-18 02:03 · Score: 1

They do the opposite and type the URL into the search box, when they could instead type the site name into the URL bar and be taken straight there.
Re:i can see that by Flying+Betty · 2006-08-18 06:00 · Score: 1

It's not just "novice" users who use Google in place of the address bar. I use the drop down list of typed addresses for websites that I visit regularly, and then Google for things that I don't want cluttering my address list.

That's the one thing that I like better in IE than Firefox: IE shows links in the order that you last used them rather than the order you last typed them in.

Trouble is... by Anonymous Coward · 2006-08-17 10:59 · Score: 0

I would actually be all for this, the trouble is just that it would force consumers to come in sealed plastic bags...

A factor of 0.8 decreases traffic by Anonymous Coward · 2006-08-17 11:05 · Score: 3, Insightful

A factor of 0.8 means that the traffic is decreased by each inbound link. Weird.

Re:A factor of 0.8 decreases traffic by Anonymous Coward · 2006-08-17 11:11 · Score: 0

Perhaps he meant 1.08?
Re:A factor of 0.8 decreases traffic by Anonymous Coward · 2006-08-17 11:18 · Score: 0

...increases traffic by a factor of 0.8.
This means that it is an increase of .8 over whatever was already there. Please don't try to be pedantic. It helps no one.
Re:A factor of 0.8 decreases traffic by sidney · 2006-08-17 11:31 · Score: 3, Informative

TFA says that it is a linear relationship with a slope of 0.8. They scaled the data so that a direct linear relationship would plot as a straight line with a slope of 1, which is a line going up at a 45 degree angle, hits increasing one unit for every one unit increase in incoming links.

Instead they saw a straight line with a slope of 0.8, meaning the hits increase 0.8 units for every 1 unit increase in incoming links. More links still correlate with more traffic, but, for example, doubling the number of incoming links increases the traffic by a factor of 1.6, not by a factor of 2.
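
To see how the readings of "slope 0.8" differ, here is a rough sketch. It assumes (the thread does not confirm this) that the "scaled" plot is log-log, so that a slope of 0.8 means traffic grows like links^0.8; the plain linear version is included for comparison. Both function names are made up for illustration.

```python
def power_law_traffic(links: float, exponent: float = 0.8, scale: float = 1.0) -> float:
    """Traffic if a log-log plot of traffic vs. links is a line with slope `exponent`."""
    return scale * links ** exponent


def linear_traffic(links: float, slope: float = 0.8) -> float:
    """Traffic if a plain (non-log) plot is a straight line through the origin."""
    return slope * links


# Compare what doubling the inbound links (100 -> 200) does under each model.
for name, model in (("power law", power_law_traffic), ("linear", linear_traffic)):
    ratio = model(200) / model(100)
    print(f"{name}: doubling links multiplies traffic by {ratio:.2f}")
```

Under the log-log assumption doubling the links multiplies traffic by 2^0.8, a bit over 1.7. A plain line through the origin would double the traffic whatever its slope, so the 1.6 figure above is best taken as a rough illustration of "less than proportional" and not as an exact number.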
Re:A factor of 0.8 decreases traffic by bluebox_rob · 2006-08-17 23:37 · Score: 1

Please don't try to be pedantic. It helps no one.

You must be new here...
Re:A factor of 0.8 decreases traffic by enharmonix · 2006-08-18 03:42 · Score: 1

More links still correlate with more traffic, but, for example, doubling the number of incoming links increases the traffic by a factor of 1.6, not by a factor of 2.

I was wondering what they meant by that. In that case, this is good news. Although I love Google's algorithm (it's much better than other search engines), I was worried that a higher page rank could result in more sites linking to the higher-ranked sites. That would make a feedback loop, and feedback loops too often produce over-amplified results (for example, slashdot has a page rank of 9/10 <rimshot />). Anyway, it is good news to see that the correlation is < 1. I would imagine a factor of 0 would yield "perfect" results, but a factor > 1 would make me question the usefulness of their page ranking system.

How are you defining popularity? by crazyjeremy · 2006-08-17 11:09 · Score: 4, Insightful

Maybe a site's popularity isn't defined by the number of inbound links because no matter how many links to your site you have, people still only want to look at things they are interested in. So if you define web popularity not by links but as "some internet item people want to find", then having more links to an individual site simply lets interested people find that site more easily. It would only change the popularity if it's forced on you (like ads) and you become interested by a curious side thought... The more links to a site you have, the more likely interested people will find it.

--
Funnypics

This doesn't mean what it sounds like... by pla · 2006-08-17 11:14 · Score: 5, Insightful

The results suggest that the reliance of web users on search engines is actually suppressing the impact of popularity.

When I first read this summary, I thought, "WTF?". So I read the article. And re-read the summary. And re-read the article. And I think I finally "get" it.

Let's say you run a "popular" site like the BBC news. You get a hell of a lot of traffic, and people tend to go directly to your site rather than via a link. Alternately, you get a lot of links that only a small percent of people seeing them follow.

Now compare that with an unknown site (most personal or academic webpages, for example). They get very few visitors, but most of them come from search engines.

So what does this tell us?

Almost nothing we didn't already know - Search engines DO indeed negate the impact of popularity, because popularity has little to do with relevance, while search engines generally try to maximize relevance.

This I consider a "good" thing. When searching for info on ripping a DVD using the latest copy protection scheme, I don't care if the latest pop idol calls ripping "totally not cool". I want methods, programs, and real life examples that might only have gotten a few dozen hits ever.

Re:This doesn't mean what it sounds like... by Asm-Coder · 2006-08-17 12:34 · Score: 1

That makes perfect sense to me. If I had mod points, I'd give you one. BTW: What would the influence of easily remembered domain names be?
Re:This doesn't mean what it sounds like... by DogbertRulez · 2006-08-17 14:00 · Score: 1

Thanks for all the fish!
Re:This doesn't mean what it sounds like... by Anonymous Coward · 2006-08-17 15:22 · Score: 0

It seems that the authors have proven that search engines exist. This is great news for anyone who hasn't heard of google, assuming such a person could find the article in the first place.

Stating the obvious as if it's not by davros-too · 2006-08-17 11:29 · Score: 1, Interesting

Sites with more links have more visitors (as defined by Alexa ranking, a rough tool at best) - big surprise, NOT. Everyone knows that sites with more inbound links tend to rank higher on the search engines and therefore get more visitors.

TFA then tries to make a big thing out of their 'discovery' that links are not the _only_ factor in the popularity (however defined) of a website. Again, completely obvious.

Then we hear that the correlation (not defined clearly) between links and 'traffic' (presumably actually some Alexa rank) is 0.8. It's not clear what this actually means, but it's hardly surprising that the relationship between links and traffic isn't 1:1. Many factors will be causing this. For example, site-wide links off large sites make a huge contribution to the number of links but will make a smaller contribution to the target site's search engine ranking than the same number of links each from an individual site.

--
In theory, there's no difference between theory and practice; in practice there is.

Slashdot link? by complete+loony · 2006-08-17 11:39 · Score: 2, Insightful

I'm guessing that link up there in the summary had WAY more effect on their servers...

--
09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.

Re:Time spent searching by Anonymous Coward · 2006-08-17 11:39 · Score: 1, Interesting

Bingo! When I worked in newspaper management, one metric in our readership studies was the amount of time each day the reader spent with the paper. Longer was considered better, as it indicated people were finding many things (articles, ads, crosswords, whatever) that warranted their time. The irony, however, was that one of the main points of the redesigns that were done every few years was to consolidate information (such as 'news at a glance' pages) and to make it easier for the reader to find what they were looking for. If done right, time spent with the paper would decrease -- which would show up in the next market study, and be considered a bad thing.

Please explain the factor 0.8 by houghi · 2006-08-17 11:45 · Score: 1

What do they mean by 'increase with a factor 0.8'?
If my starting point is 1 and I put a link on it, does it mean that I now have a 1.8 or a 0.8 or what?

Or do they mean 0.8%? So if I start with 100, I now have 100.8 per incoming link?
Are they cumulative? E.g., is the second link (if it is %) over the 100.8, or over 100?

Also it looks like captain obvious. Pages that have more links to them are more popular. Also that people who have interests in certain pages will only go to those certain pages.

Now if only a search engine company would realize that, they could use this data to get some advertisements on both their site and on the site they link to.

Oh, wait, they reverse-engineered Google's business plan.

--
Don't fight for your country, if your country does not fight for you.

A better paper for search engine bias by Anonymous Coward · 2006-08-17 12:28 · Score: 0

Just to mention that a better paper (maybe the origin of the search-bias idea, or at least the earliest one I know of) is the following: Junghoo Cho, Sourashis Roy, "Impact of Web Search Engines on Page Popularity." In Proceedings of the World-Wide Web Conference (WWW), May 2004.

I guess there should have been a discussion of that here, since it's not NEW and there are quite a few papers on that.

Etymology Nazi by a+whoabot · 2006-08-17 12:33 · Score: 1

The Greek would definitely have a contracted eta for just "Googlarchy."

People don't need to "search" for popular sites by NotQuiteReal · 2006-08-17 12:51 · Score: 2, Insightful

Sites become popular the old fashioned way - by word of mouth... ok by email, but still.

Personally, I "search" for purchasing info, business info, etc.

I am told about "popular" sites directly... they are um, popular.

--
This issue is a bit more complicated than you think.

Re:People don't need to "search" for popular sites by crazyjeremy · 2006-08-17 13:39 · Score: 1

I agree completely. Should've included it in my post.

--
Funnypics

A.k.a. levelling the playing field by alienmole · 2006-08-17 12:52 · Score: 1

So what does this tell us? Almost nothing we didn't already know - Search engines DO indeed negate the impact of popularity, because popularity has little to do with relevance, while search engines generally try to maximize relevance.

If the story had described this as "search engines level the playing field, leading searchers to relevant sites even if they're not the most popular", it's doubtful that it would have made it to Slashdot. For their next project: "Snakes on a plane: how dangerous are they, really?"

Funky math by pablodiazgutierrez · 2006-08-17 13:11 · Score: 1

In the end, it appears that each inbound link only increases traffic by a factor of 0.8.

What does this mean? Without any other reference, I would assume that each link takes 1 unit of traffic (ut) to (1 + 0.8)ut. If so, n links would take your traffic to 1.8^n ut, which is unbelievable. What's missing here?

--

To do list for Windows

I hear somebody laughing by smartdreamer · 2006-08-17 13:13 · Score: 1

I hear somebody laugh at Google: "haha those ranking noobes did not understand anything."

# of inbound links == page rank?? by Anonymous Coward · 2006-08-17 14:36 · Score: 1, Informative

Yes, the core of Google's ranking algorithms is based on incoming links, but it is far from something as simple as just counting the number of links. The _quality_ of the links is way more important. In addition, there are many signals Google takes into account beyond just pure PageRank (if this wasn't true, almost anybody could build Google). Yet, TFA uses and interchanges "# of inbound links" and "search engine score" as if they meant the same thing.

If they really are using # of links as an approximation to search engine score, then they're flawed from the beginning. If they aren't, then somebody isn't very good at conveying information.

People who like this kind of thing... by NetSettler · 2006-08-17 15:34 · Score: 3, Insightful

It reminds me of the quote (not sure the origin): People who like this kind of thing will find that this is the kind of thing that they like.

You think it's bad now, imagine when Google has an AI model of what you want to find such that it tailors the search results for you alone.

Some years back, in the early 90's, I think, when there was little or no web and when advertising was done in physmail, I started to receive lots of mail about object-oriented stuff and little about other kinds of programming. "Ah, we're winning," I concluded foolishly. Later, I realized I was just pigeon-holed in a special Hell where I would never again learn about what others were doing because someone thought they had learned what I "liked".

It amazes and saddens me that a whole industry grew up around "personalized interfaces" which does not include as part of its regular practice: "ask the user what he likes". Amazon's court of last resort is to allow me to "correct" its assumptions about me by deleting records of specific purchases that are confusing its belief that I like certain things.... all substituting for an interface that just says "do you like X?" and lets me say "yes/no". And there's even some research saying they know better than I do what I want. Bleah. Personal indeed.

I'll be interested to see if this result holds up. It seems just as grim as the "personal interfaces" result. But sad or not, it does seem believable...

--

Kent M Pitman
Philosopher, Technologist, Writer

Re:People who like this kind of thing... by dargaud · 2006-08-17 19:22 · Score: 1

You think it's bad now, imagine when Google has an AI model of what you want to find such that it tailors the search results for you alone
Yeah, I can see that. I made the mistake of ordering two baby books off Amazon as a gift for expecting friends. I was then bombarded with baby-related ads from Amazon for years afterwards although I couldn't care less about those creepy smelly things.

--
Non-Linux Penguins ?
Re:People who like this kind of thing... by martiojd · 2006-08-17 21:02 · Score: 1

It reminds me of the quote (not sure the origin): People who like this kind of thing will find that this is the kind of thing that they like.

Well out of all the places where you can find out who wrote this... Google!!
According to the quotations page it was written by Abraham Lincoln in a book review (I wonder which book that was). They give the precise quote as "People who like this sort of thing will find this the sort of thing they like".
Re:People who like this kind of thing... by fyngyrz · 2006-08-18 04:14 · Score: 1

I think you're being way too short-sighted about what AI will mean.

A reasonable use of AI at a search engine would be in creating true experts in every field, no matter how broad, narrow, popular or exotic, experts that know not only the broad and popular strokes, but the most intimate and tweaky details, and they would actually be able to understand what it is you were searching for, and go get it for you. Not what is statistically common, but actually what you're trying to find. Because they understand what you're trying to find, as a consequence of being actual experts in the area, not because they're aggregating several million people's random clicks.

While it is certainly possible that AI will be used the way you imagine, the search engine that uses it the way I am describing here will kick the ever-loving arse of anyone who does try to make AI the purveyor of an ever-narrower and more populist view of what you are looking for. Remember; What Amazon is doing is not, in any sense, "AI", it's just some crappy program that uses statistics. Quite badly.

AI will change everything, and I'm almost certainly understating the case at that.

--
I've fallen off your lawn, and I can't get up.
Re:People who like this kind of thing... by NetSettler · 2006-08-23 18:34 · Score: 1

I think you're being way too short-sighted about what AI will mean. ... What Amazon is doing is not, in any sense, "AI", it's just some crappy program that uses statistics.

I'm not in real disagreement here. When I made the ill-advised use of the term AI, I was really meaning "commercial smarts", which I meant more ironically than literally, since I expect it to fall far short of real AI, and to be just "smart" enough to be dangerous.

I agree that real AI, if it were to be had, would have very different effects. Then again, those effects might also include, in addition to your list, things like snubbing users it doesn't like and pondering whether it should get a better job. ("Don't we have people for these boring jobs that require no thought?".) True AI, after all, needs to admit the possibility of free will. And given that you had both intelligence and free will, would your first thought be "maybe I should get a job taking orders at Amazon so I'd have something to do?"

--
Kent M Pitman
Philosopher, Technologist, Writer

Search results still crucial to some businesses by QuantumFTL · 2006-08-17 16:15 · Score: 3, Interesting

Not to obnoxiously plug, but lylix.net, a Linux/Asterisk VPS host that I consult for, has gone from a single-man show with few customers to nearly overflowing with incoming business as a result of an aggressive "white hat" SEO campaign - mostly just putting up good content on the site in a format that search engines like (and probably also the thousands of links from slashdot from my sig/homepage).

These results surprised me very much - I've gotten over a thousand hits on lylix.net as a result of my postings in the last month and a half, but this is easily dwarfed by lylix's position as the 3rd hit for 'asterisk VPS', first for 'linux asterisk vps', and being 4th-5th page for just "VPS".

For those who can put up quality content and carve out a decent search rank, Google is a veritable gold mine. Yes, it's possible that looking at the internet through Google's lens gives a skewed perspective, but it's still the best way to find most things. Word-of-mouth is fine for big sites, or niche sites known by your friends, but I can honestly say I do not find most things online that way.

circular reasoning? by martin-boundary · 2006-08-17 17:18 · Score: 1

It's hard to tell how interesting these results are from second-hand information (the original paper isn't available freely, you have to pay for it), but the writeups aren't particularly surprising. So this should be taken more as a criticism of the writeup than the (unknown) paper.

1) The biasing effect is not hard to calculate _exactly_, for example it's done implicitly in this old paper, see p.6 the paragraph after eq.10. Of course, it's well known that Google hasn't used PageRank exclusively for years.

2) PageRank's formula is well known, and doesn't just count the number of inlinks, but uses a "boredom" probability of about 0.2 (as explained in Page and Brin's original papers at Stanford, I think they used 0.15). To be precise, PR is the weighted average of 0.2 times a uniformly random measure and 0.8 times a matrix based on the number of inlinks. See a pattern? It's not surprising that the inlinks should only account for about 0.8.

3) Judging from a couple of older papers available online by the researchers, they've spent some effort to work out an approximation to PageRank using inlinks. The idea being that inlinks is easier to estimate than PR or whatever modified PR Google uses these days. Now they're looking at the inlinks empirically, and they're finding a factor of 0.8 associated with Google. Well, duh! That would be circular reasoning.

4) If the data they're using is recent and sufficiently significant, it might suggest that Google's secret PR algorithm is only a second order modification of the original PR, i.e., that even though the real PR is secret, it can be well approximated by the original Stanford PR. That in turn is both exciting and troubling.
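
To make point 2 concrete, here is a minimal sketch of the textbook PageRank iteration as described above: a weighted average of a uniform "boredom" jump (0.2 here, 0.15 in the Stanford papers) and a link-following step. The function name `pagerank`, the `teleport` parameter, and the four-page `toy_web` graph are all made up for illustration; this is not whatever Google actually runs.

```python
def pagerank(links, teleport=0.2, iterations=100):
    """Power iteration on a small link graph.

    links maps each page to the list of pages it links to.
    teleport is the "boredom" probability of jumping to a random page.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives an equal share of the random-jump mass.
        new_rank = {p: teleport / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits the rest of its rank among the pages it links to.
                share = (1.0 - teleport) * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = (1.0 - teleport) * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" has three inlinks, "d" has none.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
```

In this toy graph the page with no inlinks still ends up with rank teleport / n, which is the 0.2 share that inlinks cannot account for.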

Well Duh by logicnazi · 2006-08-17 18:23 · Score: 1

This is pretty obvious.

If links were the only way to find new web content then the number (and popularity) of linking sites would totally determine a website's popularity (modulo a bit of advertising).

Now if you believe that at least occasionally people find sites through search engines that weren't linked to from any of the sites they normally visit, the search engine reduces the impact of popularity. All you need is one example of someone searching for "f22 raptor cost overruns" who doesn't browse military/political websites and the search engines have reduced the impact of popularity.

I always thought the criticism of google was that their choice in search algorithm did less to reduce the influence of popularity than it could. I don't find this a compelling criticism, wisdom of the masses and everything, but it is at least a cogent point.

--

If you liked this thought maybe you would find my blog nice too:

0.8 by FirienFirien · 2006-08-17 18:38 · Score: 1

I'm astounded that they think the correlation should be 1:1. Using some arbitrary figures:

If you have a large web page with 4 million inward links, and you put the link in a million more places, you're 25% more visible - but part of the 25% that can now see the link in the new places will have known about the site before, and those people then don't add to the figure even though they've been targeted by the new advertising.

If you have a small specialised web page with only 40 incoming links, you're only being found by people who have criteria that fit your particular company; assuming here that it's not just from being a web fledgeling, you've only got a small userbase inside the specialisation who will come to your web page, and chances are they'll probably know about it. If you add 10 more links, then sure you'll get more people - but the people who are your target audience are likely to know about the site anyway, whether via magazine/word of mouth/forum discussion.

Unless your company is special, and is in the startup phase of getting to the relevant people - where the target audience hasn't found out about the site yet, and adding 25% to the links, by being in the right place, reaches that audience. You might get a return of greater than 1 if you do it in the right way there; where you were previously known by only a fraction of the target audience and can via google adwords or whatever suddenly reach a far, far wider audience, you'll get good improvement on your visitor numbers.

A major assumption in the whole thing is that each company assessed considers the entire marketable world as a potential customer base. By targeting 25% more people you'll get 25% more interest? Even if we assume that the extra people don't know about the site already, that'll only work if your product is interesting to 100% more people, which in the world of the web seems fairly unlikely.
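
The arithmetic above can be sketched in a few lines. This assumes visibility scales directly with link count and that a fixed fraction of the newly reached people already know the site; the function name `effective_gain` and the awareness fractions (0.5 and 0.9) are arbitrary illustration values in the same spirit as the figures above, not measurements.

```python
def effective_gain(old_links, new_links, already_aware):
    """Fractional gain in new visitors from adding links.

    already_aware is the assumed fraction of newly reached people
    who knew about the site beforehand and so add nothing.
    """
    raw_gain = new_links / old_links  # e.g. 1M on top of 4M is 25%
    return raw_gain * (1.0 - already_aware)


# Large site: 25% more visibility, half of the new eyeballs already knew it.
big = effective_gain(4_000_000, 1_000_000, already_aware=0.5)

# Small specialised site: 25% more links, but the niche mostly knows it.
small = effective_gain(40, 10, already_aware=0.9)
```

Under these assumptions both sites get well under the naive 25%, and the specialised one gets the least.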

--
Browsing with +2 to insightful posts and a higher threshold makes the average post seen seem a lot more ingenious

Let me summarize ... by utnapistim · 2006-08-17 18:43 · Score: 1

... the article for you:

The desirability of a website is not given by how search engines rank it but by its actual content.

Well ... yeah!

--
Tie two birds together: although they have four wings, they cannot fly. (The blind man)

This morning by Jarth · 2006-08-17 19:18 · Score: 1

I had a 'vision' of an article discussing google on slashdot and hah! behooold! Maybe all these digits are getting to me after all. Eeery ... Anyway, the equation that came to mind was a bit as follows.

A scientist who makes an unusual discovery is almost certain to get critics all over him. Yet, in time his discovery will be recognised as the result of an intellectual effort, an achievement. This scientist will become known as 'a smart person'.

Discarding the percentile of scientists who succeed at setting such a milestone and looking at people with scientific capacities (for the sake of argument, 5% of the googlers) one can only argue the search results in google will only become more irrelevant to the intellectual part of our society. So the results of google will become increasingly insignificant to the more educated part of the population, maybe even plain scholars.

This is of course not true for most specialisms and so on but even now results are sometimes quite insignificant.

The signs are already here.

--
free dom(inion) - free energy - free your mind - whee!

other possible interpretations by Anonymous Coward · 2006-08-17 23:07 · Score: 0

There are other possible interpretations for the sublinear scaling observed in the data. For instance, the quality of search engines might decrease the motivation for linking to already popular sites, whereas people may feel more motivated to link pages that do not appear among the top hits returned by search engines. Our search model, however, presents a very compelling explanation of the data because it predicts the traffic trend so accurately using a minimal account of query content and making strong simplifying assumptions, such as the use of PageRank as the sole ranking factor.

Inbound links is already discounted by 140Mandak262Jamuna · 2006-08-18 00:47 · Score: 1

Expecting the traffic to a site to increase in direct proportion to the number of inbound links is completely stupid. Let us say, for example, my site gets one inbound link from google.com main page. I will get some traffic. Then I get another inbound link from the home page of IIT-M Alumni Association of Allegheny Valley. Now with two links to my page, you think I should get twice the traffic? How stupid is that? All sites don't have equal traffic. Unless you weight the inbound links with popularity of originating sites, it is a meaningless exercise.

Further, the search engines themselves allot page rank by the number of inbound links and the keywords found in the "a" tag of originating pages. So more inbound links will raise your page rank, get you ahead in the search listings and get you more traffic. But the traffic will be counted as "search engine" generated traffic, not as traffic originating from a referring site. With this much interdependence between page rank and the number of inbound links, how did the study control for it?

The number of inbound links is already reflected in the search engine generated traffic, or to use Wall Street parlance, it is fully discounted. There is nothing to see here. Move On.
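
A rough sketch of what weighting inbound links by the popularity of the originating site looks like. Everything here is assumed for illustration: the function name `expected_referrals`, the flat 1% click rate, and the visitor counts are invented, not data from the study.

```python
def expected_referrals(inbound_links, click_rate=0.01):
    """Estimate referral visits from a set of inbound links.

    inbound_links maps each linking site to its (assumed) daily visitors.
    click_rate is an assumed fraction of those visitors who follow the link.
    """
    return sum(visitors * click_rate for visitors in inbound_links.values())


# Made-up numbers: one link from a huge portal, then a second from a tiny page.
one_link = {"big_portal": 1_000_000}
two_links = {"big_portal": 1_000_000, "small_alumni_page": 200}

ratio = expected_referrals(two_links) / expected_referrals(one_link)
```

With these numbers, doubling the link count moves the estimated traffic by a small fraction of a percent, nowhere near a factor of two.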

--
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact

Not all of us look up baseball... by everphilski · 2006-08-18 01:31 · Score: 1

When I'm doing engineering research I often have to repostulate what I am looking for to Google several times before getting the results I need.

For example, last night I was working on a calculation involving "k", the heat capacity of air, in English units.
"heat capacity of air" would give me the answer in metric ... that's great and all but I didn't have my handy conversion table in front of me (and didn't want to spawn a second search... which is part of the point, here)
"heat capacity of air english units" mostly returned specific heat results, not heat capacity results.
Finally after 5 iterations I resorted to finding the conversion factor...
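
For what it's worth, the conversion itself is a one-liner. This sketch assumes the quantity wanted is the mass-specific heat at constant pressure and uses the International Table definition, under which 1 Btu/(lb·°F) equals 4186.8 J/(kg·K); the 1005 J/(kg·K) figure is the usual room-temperature value for dry air. The names are made up for the example.

```python
# 1 Btu/(lb*degF) expressed in J/(kg*K), International Table Btu.
J_PER_KG_K_PER_BTU_PER_LB_F = 4186.8


def si_to_english(cp_si):
    """Convert a specific heat from J/(kg*K) to Btu/(lb*degF)."""
    return cp_si / J_PER_KG_K_PER_BTU_PER_LB_F


# Dry air near room temperature, roughly 1005 J/(kg*K).
cp_air_english = si_to_english(1005.0)
```

That comes out at roughly 0.24 Btu/(lb·°F).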

Re:Not all of us look up baseball... by Bob+Uhl · 2006-08-18 02:37 · Score: 1

I didn't realise that anyone still used real units to do engineering work--it's nice to know that there's still some sanity left in the world. Any suggestions for engineering references using English units?
Re:Not all of us look up baseball... by rtb61 · 2006-08-18 20:46 · Score: 1

I found that using firefox my searching has vastly improved. By right click and open in new tab, I retain the original search and can add additional terms to it whilst reviewing a selection of the results originally produced.
I generally open up a half dozen or so of the most likely links on the first page and based upon a quick review of those, revise the original search terms as necessary and refresh the search page and whilst the search page is refreshing, I take a closer look at the links I had opened in other tabs.
On an interesting side note, when you right click open in new tab, the result doesn't make it into the click through search metric; for those that log in, you can check this in your search history. So for those people with privacy issues, until google alters the links on search results, right click open in new tab under firefox to preserve a bit more of your privacy.

--
Chaos - everything, everywhere, everywhen

What's easier? by aclarke · 2006-08-18 05:23 · Score: 1

cmd-k for searching, cmd-l for typing an address. It's not that hard, really.

Substitute ctrl for cmd if you're a Linux or Windows user. All this assuming you're using Firefox.

--

www.clarke.ca

Re:What's easier? by KlomDark · 2006-08-18 07:16 · Score: 1

Where did they come up with those two? K for Search, L for an address? Doesn't make sense, hard to remember.

Kearch? Laddress?

Long tail? Short tail? No tail? by aclarke · 2006-08-18 05:41 · Score: 1

First I'll admit I'm a little confused by the article. Are they measuring a page's popularity in a search engine by its number of inbound links? So they're saying that as the number of inbound links increases (i.e., in their opinion, the site's ranking in the search engine), the number of page visits increases? Maybe I'm missing something, but if that's the case this research raises an eyebrow here at least. If they have page ranking data from Yahoo, why not use that instead of inbound links? Or maybe by "page ranking" they mean "number of inbound links".

I guess people need to study something, and sometimes one will come up with surprising results. But this study reminds me of the "Long Tail" discussion that was all the rage for a couple weeks. "Wow, with the internet we can find niche information!" Who didn't know that? So now some information pundit (don't remember his name) gets to make a bunch of money for putting a name on a self-evident truth so that business types can sit around and discuss it like it's something revolutionary.

If people didn't use the internet to find things, then why would Google be worth billions of dollars? If people didn't use the internet to find things, then why would companies be paying Google huge sums of money for page rankings? Those who track ROI will usually tell you that it makes them money (and if not they'd stop doing it, if they were tracking ROI).

I don't follow professional sports, but a lot of people do. So a lot of people are going to search for "Toronto Blue Jays". Good for them. There are ~6 billion people in the world, each with some common interests, and each with some less common interests. If you're making a web site to sell iPods then likely you'll be lost in the crowd and have a hard time gaining traction. If you're selling refurbished vintage Massey Ferguson tractors with patent leather seats and Corvette LS1 engines, then likely you'll end up #1 on Google pretty quickly. Do you want 0.0000000001% of a billion dollar market or 100% of a $0 market? Your choice.

--

www.clarke.ca

Fortunato et.al.'s work in context by Anonymous Coward · 2006-08-18 07:56 · Score: 0

This seems to be another one in a string of papers by Fortunato et al.; the previous ones were The egalitarian effect of search engines (from arXiv, which never seems to have been published properly), and Googlarchy or Googlocracy (from IEEE Spectrum.)

It was even featured on slashdot before: Search Engine Results Relatively Fair, posted by Zonk on Sat Nov 19, '05 04:29 AM.

But they seem to have improved their reasoning this time: They finally cite Donato et al.'s work (Large scale properties of the Webgraph), which explicitly contradicts their claim that there is a correlation between in-degree and pagerank.

Regards, Sebastian.

67 comments