Modeling Linking on the Web
An Anonymous Coward writes "Amazon has a much greater market share among online bookstores compared to the greatest market share for offline stores. How is this possible? Because the web changes how people find information. There are millions of links to Amazon on the web, which makes it more likely for people to find Amazon when surfing the web, or when using search engines which typically use link popularity in ranking. This makes it harder for new businesses to compete. Researchers have discovered that across the entire web, links are distributed according to a "power law" which leads to "rich get richer" or "winner take all" behaviour where a small number of sites get the vast majority of links and traffic. A new study just released by NEC shows that this behaviour varies in different communities, and shows how to predict competition in different areas. For example, you can see how much tougher competition is among booksellers compared to photographers."
Wouldn't this prove that the quest for eyeballs was no more crazy than the quest for a starlet to become a Hollywood star, the quest for a high school quarterback to make it to the NFL, or the quest to win the lottery?
leads to "rich get richer" or "winner take all" behaviour where a small number of sites get the vast majority of links and traffic
No kidding. Look at the hit counter on my homepage, and compare it to Slashdot's. I've probably gotten as many hits in the last year as Slashdot has gotten since I started typing this reply.
I'm off to work now, so I don't have time to check, but does the article address the massive amount of advertising that Amazon did to get where it is today? Inertia has to come from somewhere.
--saint
Amazon does a lot to get their name out. So it's very reasonable that most people would tend to look towards them for books.
Ask people off the street where they would buy books on the internet... and you're bound to get many replies of Amazon or "I don't know".
I suppose by their logic that the only place to be on the web is AOL... but on the street you could get that response as well.
Advertising does pay. Links on the net may lean towards one provider over another, but most of them were bought by one method or another.
One reason the rich get richer is because they are optimists; they are willing to do it now. The best time to start a business is always TODAY... the best time to get your name out there is ALWAYS today.
* Winners compare their achievements to their goals, losers compare theirs to that of others.
The internet is a very young medium and most online buyers just don't trust much... yet. I'm a regular buyer, but wouldn't really consider buying a CD from a company that just showed up and is hosted on some obscure domain.
Wait until online buying settles down a bit, and everyone gets used to buying stuff everywhere. Then wait for VISA to become more secure (read: hardware card reader devices at home, which is inevitable) or for banks to do direct transactions between you and the book reseller. THEN we can start to evaluate models.
When will I end this grieving ? When will my future begin ?
This is also called a scale-free network, and the research on it, by Albert-Laszlo Barabasi (currently at Notre Dame U), is in this week's New Scientist. (Apologies, it's not on their site yet - www.newscientist.com) He's applied it to many systems other than the web as well, from viral transmission on the net and human populations to the vulnerability of "hubs" in genetics (a few, like p53, would take out damn near everything due to their pervasiveness) and even quantum mechanics.
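To make the "rich get richer" mechanism concrete, here is a minimal sketch of growth by preferential attachment, the ingredient usually credited for scale-free networks. It is a simplified toy, not the researchers' actual model: the function name, the parameters, and the choice to weight each page by its in-link count plus one are all assumptions made for illustration.

```python
import random
from collections import Counter

def grow_network(n_nodes, links_per_node=2, seed=0):
    """Grow a toy web by preferential attachment; return in-link counts per node.

    Hypothetical illustration: each new page links to `links_per_node`
    existing pages, picked with probability proportional to (in-links + 1).
    """
    rng = random.Random(seed)
    # `targets` holds one entry per existing in-link plus one baseline entry
    # per page added after the core, so already-popular pages are drawn more often.
    targets = []
    core = links_per_node + 1
    # Start from a small core in which every page links to every other page.
    for a in range(core):
        for b in range(core):
            if a != b:
                targets.append(b)
    in_links = Counter(targets)
    for new in range(core, n_nodes):
        chosen = set()
        while len(chosen) < links_per_node:
            chosen.add(rng.choice(targets))
        for t in chosen:
            in_links[t] += 1
            targets.append(t)
        in_links.setdefault(new, 0)
        targets.append(new)  # baseline entry so the newcomer can be picked later
    return in_links

if __name__ == "__main__":
    counts = grow_network(2000)
    ranked = sorted(counts.values(), reverse=True)
    print("most-linked pages:", ranked[:5])
    print("median in-links:", ranked[len(ranked) // 2])
```

Running it should show a handful of early pages with far more in-links than the typical page, which is the skew the story is talking about.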
toeslikefingers.com - because
Where does the pornography industry fit into the chart?
And can you provide a few links there as well? Just to "sample"
I don't know if this applies or not, but what about the rise in popularity of Google?
Clearly they came in late, after all the other search engines were established, but they ate up market share and eyeballs because they were (and are) better.
So I could imagine that there are some Amazon-killers out there or that at least there could be...
In Soviet Russia...michael would be rotting in Siberia!
While this is probably true for the most part, things can change it.
I'm in Ireland and since the euro came into being I've stopped ordering from the UK or US versions of Amazon, and other retailers for that matter.
With currency and shipping costs they were killing me. I now look for eurozone suppliers.
I'd still love a good Irish online store though, just to save shipping, but they are all crap, half-assed efforts.
At least my French and German will improve while browsing.
I don't think that competition for photographers is a good example because location plays a big part in choosing a photographer. I know that some photographers travel a lot, but they are mostly chosen because of their reputation (often in spite of their webpages ;) ). Also, the market share of offline stores is lower due to logistics. For example, all offline booksellers must have a location to display the books and a way to get them there. This costs money.
I also think that the "rich-gets-richer" thing is more about economics than about the number of links a site gets. Amazon buys bigger quantities and ships larger quantities, which means that the economies of scale are a HUGE factor. I would venture that the number of links will be a trailing indicator of the relative size of a retailer much more than a contributor to this relative size.
To sum up: logistics, economies of scale...intro economics anyone?
The power law may be the same as the Pareto distribution, which models the distribution of income in an economy. Economists have observed it experimentally for decades though the theoretical reasons for it have only recently become understood. The underlying mechanism could well amount to the same thing. Some economists might like to take a look at the web link data. There are probably interesting comparisons to be made between link distribution and income distribution.
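For anyone who wants the comparison spelled out, this is the standard textbook form of the Pareto distribution, written so the power-law tail is visible. The symbols $x_m$ (minimum value) and $\alpha$ (tail exponent) are the usual textbook parameters, not values taken from the NEC study.

```latex
% Pareto distribution: probability that a quantity X (income, in-links) exceeds x
P(X > x) = \left(\frac{x_m}{x}\right)^{\alpha}, \qquad x \ge x_m
% Differentiating gives the density, a power law with exponent \alpha + 1
p(x) = \frac{\alpha\, x_m^{\alpha}}{x^{\alpha + 1}}
```

So a density that falls off as a power of $x$ and a Pareto tail are two views of the same thing, which is why the link data and the income data invite the same analysis.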
I would speculate that Slashdot is one of those rich getting richer, like Amazon. It has reached a critical mass of readers. All the other message boards, and I have been to many, are relatively sublime in comparison.
;)
Slashdot type boards might be an exception since their interest - their value (or perceived value) to readers, their content, is defined by the probability of a reader writing something, which is the most interesting factor to take into account. Of course, once enough readers are there, statistical odds of a story getting a full complement are very high, and thus good content. Slashdot seems to be past that critical mass.
Then come the trolls.
I have a horrible feeling that the words "self-organized criticality" are heading inexorably towards this discussion. Please, just say no.
No, seriously: it doesn't look like the article is making that particular connection, thank goodness. Per Bak's theory of self-organized criticality predicts power law distributions of many things under many conditions, so people (or, specifically, Bak fans) often get all excited whenever they see a power law and start saying, "Hey, it's SOC!" But there are many ways to get power laws, and of course the logic doesn't work: "SOC gives power laws; thus if it has a power law, it was caused by SOC."
If you want to know more about SOC, you can check out Bak's modestly-titled book, How Nature Works.
For example, you can see how much tougher competition is among booksellers compared to photographers.
No, no, no. This is an abstract discussion of new avenues in Communication Theory as exposed by the Internet. Therefore we must apply Internet-specific terminology to our discussion. "Photographers" should read "Pornographers." Thank you.
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
It's happened over and over again in the economy in the U.S. and elsewhere. We're already seeing it happen again fairly rapidly in the U.S. (and probably other places) in a lot of service and utility businesses-- although some of that is a result of deregulation. Online business just speeds the process up.
The big benefit of the public Internet isn't the speed at which the rich get richer-- it's the speed at which something else can come along. The (relatively) low cost of having a presence online, and the speed with which one can establish a satisfying niche, is what made the Internet the "equalizer" in the late '90s. And everyone will still have a good shot at creating a niche and possibly overtaking this year's "big guns" on the Internet, unless a few organizations get enough power to control whether one "free" person can communicate with other "free" people arbitrarily.
One of the things we hear a lot on Slashdot is how evil RIAA/MPAA and the media companies behind them are. And what's often brought up is the fact that these companies are screwing the artists, that these artists would do better to establish a direct relationship with their audience, and that the internet is perfect for that.
I've always been sceptical of this argument because I think the effect of promotion is a lot greater than people generally believe. A musician playing live can generate a little local word of mouth - but how do you translate that into a wider appeal?
This adds another reason to be sceptical. So maybe getting onto Amazon's stocklist/recommendations will become more important than getting on a radio playlist?
This article is basically a fancy way of confirming the tyranny of the majority. Google's PageRank, as good as it is, both a) suffers from and b) perpetuates the tyranny of the majority (aka "the rich get richer", the "power law"). I.e., the more links, the higher the PageRank, the more relevance, the more hits, the more links...
Teoma seems to be aiming at this chink in Google's armor.
From Teoma's page,...
Using vectoring algorithms to find themed hives of related content, Teoma partitions the power law into manageable chunks. I.e., the rich get richer, but at least a dominant site in one field doesn't get artificially inflated relevance when querying an unrelated field. At least in theory. (Kinda like laws are supposed to keep a monopoly from illegally entering other markets, but I digress.)
This is working for Teoma: others and I are finding useful stuff on Teoma that Google didn't.
Google is already aware of this particular limitation of PageRank, as can be seen from what they suggest programmers submit to their programming contest...
Even with all that, I still think that humans are the best filters (and isn't a search engine just a programmable filter?). I suspect the rise of weblogs might have something to do with the usefulness found in tapping into some weblogger's idea of what's useful/cool/interesting.
So perhaps the best way to find good info is a cross between a human and a content-vectoring search algorithm. Maybe that's why Ask Jeeves bought Teoma.
Well, in "Cybernetics and Society: The Human Use of Humans" by Norbert Wiener, the author talks about messages--communication. Links to a web site are messages given to people using the web that a given site/page/whatever exists. They are easy to make use of, since all you have to do is click it and whatever is on the other end is given.
This is all fairly obvious. The neat thing is how these messages and the messages they point to interact. Dr. Wiener says that the more unique a message, the more "important" it is. This is simply because an overused message (a cliche) quickly becomes filtered by the human mind, and loses its meaning.
Take the Amazon example, for instance: How often do you click a LINK to Amazon? Yes, there are hundreds of thousands of links to Amazon, but I would guess most of them NEVER get clicked. Why? Because there are too many of them. The first time I saw one, I followed it, but now I just ignore them. I almost never click on Amazon links because I know it goes to some bookstore.
When I do go to Amazon, I just type the DOMAIN name into my browser, and go directly there, and do my own searching, follow my own links.
So, basically, there is an upper limit to the number of links before they essentially become useless. Of course, this upper limit is dependent on the total number of users who haven't seen the links, which is increasing every day as new people come on to the web. As the number of links reaches this critical mass, more and more people are just typing in the domain name rather than following a link.
This is Google's essential flaw. It does not recognize that a site like Amazon does not need an entry in a search engine. There are enough links out there already for just about anyone to find it. Google should instead group searches around a bell curve distribution, where the sites with a medium number of links have the highest relevance, with underlinked and overlinked sites falling off the ends.
How are new sites found out about and linked enough to show up in an engine like Google? Advertising. Mostly word of mouth and link ads, and in certain cases print and television advertising (although this is less effective, because it requires additional steps from the user to make use of the message (i.e., remembering the domain name at a much later time and then typing it in), which is why the .com ad explosion on TV failed to do anything).
Really, to be effective, you need to have 10-20 contacts online, have each link to your site. Spread the word as much as you can. And save your ad budget until your word of mouth traffic reaches critical mass. Then spend it on bandwidth.
Really, time is the only key. Oh, and having something useful or funny.
Anyway, this quickly turned to the theories of getting a lot of hits, and I apologize, but you can see that the middle is the best place to be, and maybe Google will recognize this. This would do a LOT for online commerce, and the economy in general. Support bell curve relevance.
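For the curious, the "bell curve relevance" idea could be sketched like this. It is purely a toy version of the proposal in this comment, not anything Google does: the function name, the `peak_links` and `spread` parameters, and the choice to measure distance on a log scale are all assumptions.

```python
import math

def bell_curve_relevance(link_count, peak_links=1000, spread=1.0):
    """Toy score in [0, 1] that peaks when link_count equals peak_links.

    Hypothetical: a Gaussian over log10(link_count), so pages with far
    fewer or far more links than `peak_links` both score lower.
    """
    if link_count < 1:
        return 0.0  # a page nobody links to gets no link-based relevance
    distance = math.log10(link_count) - math.log10(peak_links)
    return math.exp(-(distance ** 2) / (2 * spread ** 2))

if __name__ == "__main__":
    for links in (10, 1000, 1_000_000):
        print(links, round(bell_curve_relevance(links), 3))
```

The middle-of-the-pack site scores highest, and both the unknown site and the Amazon-sized site fall off the ends, which is the behaviour being argued for.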
Cheers.
Cool! Amazing Toys.
This is true of Webcomics as well. Ask someone what their favorite Webcomic is, and they will almost invariably respond with one of the following: User Friendly, Penny Arcade, PvP, Sluggy, Sinfest, Megatokyo or Exploitation Now. With the exception of Penny Arcade, I have found the total combined quality (art + writing + humor) to be fair at best, and atrocious at worst (guess what the worst is; hint: think of a little dustball with feet). But these sites are linked to from all over, and they often link to each other, creating "flash crowds" from Slashdot, other comic sites, personal home pages, etc.
There is a class of "second tier" comics which have nice little followings: Little Gamers, Sexy Losers, Polymer City and Cool Cat Studio (really, any Keenspot comic that isn't Sinfest or EN) are among these. Everyone else, myself and my comic included, is "third tier", i.e., tumbleweeds rolling across their allotted server space.
Then there is Pokey, which stands conspicuously on its own. HOORAY.
N4st0r, trixx0r h0bb1tz0rz! Th3y st0l3 0ur pr3c10uzz!
Researchers have discovered that across the entire web, links are distributed according to a "power law" which leads to "rich get richer" or "winner take all" behaviour
Ah, no that's not what it shows:
Quote from the research: "In fact, pure power law scaling appears to be the exception, rather than the rule." What the research shows is that "winner takes all" varies across the web between categories.
Also, the researchers have (I suspect) a mistaken view of "competition" and competitiveness. They rate the Amazon category of book-sellers as "more competitive" - when in fact it may be less competitive in economic terms, being dominated by one or two sites/sellers. Whereas the photography category may be "more competitive" because of a larger number of rivals of about the same size.
What they mean, I suspect, is that the publications/Amazon category has higher barriers to entry - the amount of advertising etc. being a greater sunk cost, and likely to deter any aspiring internet book retailers. In purely economic terms, that makes the category less competitive. As an illustration, ask yourself if the operating system software market is more or less competitive because it is dominated by one large brand in Microsoft?
They kick back 5 to 15% to whoever provides a link that leads to a sale. That's not small beer. They make it easy for anyone to provide these links. So of course they're all over the place.
Is called VisIT. It produces a graphical representation of how sites link together, based around any given query. It was used quite successfully to demonstrate how Scientology had spammed Google, by creating multiple domains all linking back to their main web page.
It's a freebie download and you can get it here.
Alas gallinaceas de urbe bovis volo
The Atlantic has a GREAT article about this effect:
http://www.theatlantic.com/issues/2002/04/rauch.htm
An excerpt:
"Every so often scientists notice a rule or a regularity that makes no particular sense on its face but seems to hold true nonetheless. One such is a curiosity called Zipf's Law. George Kingsley Zipf was a Harvard linguist who in the 1930s noticed that the distribution of words adhered to a regular statistical pattern. The most common word in English--"the"--appears roughly twice as often in ordinary usage as the second most common word, three times as often as the third most common, ten times as often as the tenth most common, and so on. As an afterthought, Zipf also observed that cities' sizes followed the same sort of pattern, which became known as a Zipf distribution. Oversimplifying a bit, if you rank cities by population, you find that City No. 10 will have roughly a tenth as many residents as City No. 1, City No. 100 a hundredth as many, and so forth. (Actually the relationship isn't quite that clean, but mathematically it is strong nonetheless.) Subsequent observers later noticed that this same Zipfian relationship between size and rank applies to many things: for instance, corporations and firms in a modern economy are Zipf-distributed."
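The rank-versus-size rule in the excerpt is easy to play with. Below is a minimal sketch that assigns each rank a share proportional to 1/rank; the function name and the `exponent` parameter are assumptions for illustration, and real data only follows this pattern approximately, as the excerpt itself notes.

```python
def zipf_shares(n_items, exponent=1.0):
    """Return the share of the total held by ranks 1..n_items under a Zipf rule.

    Hypothetical helper: the weight of rank r is 1 / r**exponent, normalised
    so that all shares sum to 1.
    """
    weights = [1.0 / rank ** exponent for rank in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]

if __name__ == "__main__":
    shares = zipf_shares(100)
    # Ratio of rank 1 to rank 2, and of rank 1 to rank 10
    print(shares[0] / shares[1], shares[0] / shares[9])
```

With the default exponent, rank 1 comes out twice the size of rank 2 and ten times the size of rank 10, matching the word-frequency and city-size examples in the quote.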
It's one of the best articles I've read in a long time, demonstrating how they've managed to model not only extinct populations accurately (who knows how much after-the-fact tweaking went on, but...) but race riots and honesty in social groups.
Add to that, I spent a good fifteen minutes trying to find it again, so someone had better read it. It's just under 10,000 words.
PS - I strongly doubt it'll get slashed, but if it does, here is the Google cached copy.
My
Limekiller
We need websites, like Google's Directory, that show all the webpages out there that are in a certain category. For example, bookstores. Buyers not only see amazon.com but they can see all the other bookstores out there too.
I think when people do a search for something in Google, Google should not only return links to webpages, but also links to directories on their site that contain other sites that are not very popular yet.
Outdoor digital photography, mostly in New Engl
The quest to win the lottery is a poor business model, and not the way to support multiple people. Most actresses start out (and end up!) as waitresses, most football players get a real job after they fail to make it to the NFL. And most dot-commers got a wake-up call when the money ran out.
Remember that what's inside of you doesn't matter because nobody can see it.
In general search results, Google ranks a page based not just on its PR (ie on how many pages link to it and their overall quality) but on the TITLEs of those pages and the anchor text of the links. So Microsoft could put the string "book reviews" on their home page but they'd still rank behind me on a search for "book reviews", because none of their incoming links are from pages about book reviews.
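For readers who haven't seen how the link-based part of that works, here is a minimal sketch of the published PageRank idea using power iteration. It covers only the "how many pages link to it and their overall quality" part; the title and anchor-text matching described above is not modelled. The function name, the 0.85 damping default, and the toy link graph are assumptions for illustration, and this is certainly not Google's production ranking.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank. `links` maps each page to the list of pages it links to."""
    pages = set(links)
    for outs in links.values():
        pages.update(outs)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                # A page splits its current rank evenly among its out-links.
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:
                # A page with no out-links spreads its rank over all pages.
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

if __name__ == "__main__":
    toy_web = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}
    for page, score in sorted(pagerank(toy_web).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))
```

In the toy graph the page everyone links to ends up on top, and the one page it links back to inherits much of that rank, which is the "quality of the linking pages" effect.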
Danny.
I have written over 900 book reviews
They may specialize and carry more of certain kinds of books, but you won't see much difference between The Hunchback of Notre Dame as carried by Amazon, Barnes and Noble, or Borders.
Meanwhile, each and every photographer has his or her own unique vision which they commit to film. Two photographers can snap pictures of the same subject, but composition of each shot would be different.
Booksellers usually sell the same goods. Photographers usually sell their own product, which can't be found anywhere else.
You cannot truly appreciate Dilbert until you read it in the original Klingon.
It seems to me that the article is nonsense. There are a lot of links to Amazon because Amazon is running a very active business in selling books.
CmdrTaco said recently that efficient search engines like Google make the web flatter and more democratic. It has been my experience that this is true.
In search results, my book, What should be the Response to Violence? is often ranked just below stories from large news companies.
If you search for "books", you will find Amazon. If you search for something more specific, you may find a small, specialized bookstore.
The rich still get richer, but those who are not rich can now be heard.
If I were starting an online bookstore, I'd pick one subject I knew well, and try to build a reputation as the center of that online community.
I think most users are not yet sophisticated enough to easily find alternatives. Also, they *expect* large mass-market retailers instead of high-quality specialists -- that's how they've gone shopping most of their lives.
Once they gain the skill, and they learn to expect better than mass-market, they may turn to specialized online bookstores (from politicaleconomybooks.com to trashromancenovels.com to thinkgeekbooks.com).
For example, many users I support don't type addresses in the URL field. Most can't search efficiently ('how do you spell "google"?'), much less find sites that don't appear at the top of the search results.
They'll learn of course, and their kids are bloggers and already use the specialized sites. Then Amazon may find that they do everything, but nothing well.
Most people associate even the concept of purchasing something online with Amazon... they were pioneers, and managed (using linking, etc.) to embed themselves in the mind of the consumer. I must admit: even though I may like to think of myself as one who resists marketing, when I need to buy a book online, where's the first place I think of going? Amazon.
Insidious, ain't it?
I am the village idiot i don't have anything to do with this pathetic little opera i just felt like passing through... - PDQ Bach
I can't believe it's not lard!
And allow me to add, as a publisher of multilingual books, that B&N will only stock English titles where Amazon takes any language. I think that's a real failing on B&N's part. They're not even trying to accommodate the global readership.
You have to understand that the number of autistic geeks who have a problem with Amazon's "fsking commercials and screaming colors" is just too small to be of any consequence. Most people, including myself, just don't have a problem with it for what we get in return.
Also, what's with the Anti-Amazon sentiment? What exactly is wrong with a company surviving in part due to ad revenue? Does the immature desire for online companies to try to function in this world without advertising revenue still exist? Do you not know that Google has paid ads, in text form, on their site as well? And that they derive a lot of revenue by being the engines under AOL's and Yahoo's search engines?
Mac OS X and Windows XP working side by side to fight back the night.
Credit cards are not the major problem. If you get scammed online, you can get a refund from Visa. A bit of work, but not a huge deal, and it is not very common. (Visa kicks out merchants who scam.)
The real problem is not going to go away: will the merchant ship my order in a correct and timely fashion? This is the *exact* same problem faced by the old "mail-order catalog" business. That business is over a hundred years old and the problem still persists; reputation matters a lot.
How to fulfill orders promptly and correctly is a *huge*, difficult problem, involving massive automation, computerization, use of bar codes, etc.
Since Google rewalks the internet roughly once every month, all it takes for a competitor to have equal link-share is to completely spam the internet with their links (maybe going down the list of sites that post Amazon links and offering to pay each and every one of them to change it to a link to your site) and proceed to wait one month. One month for a total reversal of mind share is absurd in the "real world", so it looks like the internet isn't doing quite so bad after all.
"Your superior intellect is no match for our puny weapons!"