WWW Surpasses One Billion Documents
Gary William Flake writes "A new study by Inktomi and NEC Research Institute shows that there are at least one billion unique indexable Web pages on the Internet. The details are pretty interesting; for example, Apache dominates the server market."
Longest domain name:
http://www.tax.taxadvice.taxation.irs.taxservices.taxrepresentation.taxpayerhelp.internalrevenueservice.audit.taxes.com
Gee. A tax site with a long, unintelligible, confusing domain name. Go figure.
"You want to kiss the sky? Better learn how to kneel." - U2
Sig:
Barbeque is a noun. Not a verb.
The net will not be what we demand, but what we make it. Build it well.
For all you know - the web has surpassed at least 1 webpage count. Big Fscking Deal!!!
Why is one of them Hamster Dance? Don't go there with an 18-month-old child on your lap. For an adult, this is funny once. For a toddler, it is funny every time the computer is on.
Dynamic content makes the number of technically distinct "pages" far greater than a billion.
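To put a rough number on that, here is a minimal sketch of how one dynamic script turns into thousands of distinct URLs. The script name `catalog.cgi`, the host `example.com`, and the parameter names are all made up for illustration; nothing here comes from the study.

```python
from itertools import product
from urllib.parse import urlencode


def dynamic_urls(base, params):
    """Yield one URL per combination of query-parameter values.

    `params` maps a parameter name to the values it may take.
    """
    names = sorted(params)  # fixed order so the output is reproducible
    for values in product(*(params[name] for name in names)):
        yield base + "?" + urlencode(list(zip(names, values)))


# Hypothetical catalog script: 1000 items x 3 languages x 2 sort orders.
urls = list(dynamic_urls(
    "http://example.com/catalog.cgi",
    {"item": range(1000), "lang": ["en", "de", "fr"], "sort": ["price", "name"]},
))
print(len(urls))  # one script, 6000 distinct "pages"
```

Whether a crawler should count those as 6000 pages or as one is exactly the kind of definition the study would need to spell out.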
Well, as any of us geeks know, this isn't really news. I'm sure we passed the billion mark a long, long time ago. Inktomi just wants the publicity, and some news service will probably pick this up, most likely CNN.
One thing of interest, though. If you look under the "Web server market share", Red Hat and mod_perl are apparently web servers now.
Finding information on the web is increasingly going to be like trying to find hay in a needle stack. Already the current indexing engines can't keep up, and you have unscrupulous web authors putting bunches of keywords unrelated to their site in their meta tags to ensure that they get mentioned in every single search. Some indexing engines already ignore meta tags for that reason. And how many times have you tried Altavista, Excite or Google only to find that the page you're trying to get to has expired or is 8 years old and hasn't been changed in 7?
This issue is going to have to be addressed, because the web is going to continue growing.
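For what it's worth, the keyword-stuffing trick described above is not hard to spot mechanically. The following is only a rough sketch using Python's standard `html.parser`, not how any real search engine does it: it flags meta keywords whose words never appear in the page's visible text. The function name `unsupported_keywords` and the matching rule are my own assumptions.

```python
import re
from html.parser import HTMLParser


class _PageText(HTMLParser):
    """Collect meta keywords and the words in the visible text."""

    def __init__(self):
        super().__init__()
        self.keywords = []
        self.words = set()
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            self.keywords += [k.strip().lower()
                              for k in content.split(",") if k.strip()]
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.words.update(re.findall(r"[a-z0-9]+", data.lower()))


def unsupported_keywords(html):
    """Return meta keywords with at least one word missing from the text."""
    parser = _PageText()
    parser.feed(html)
    parser.close()
    return [k for k in parser.keywords
            if not all(w in parser.words
                       for w in re.findall(r"[a-z0-9]+", k))]


page = ('<html><head><meta name="keywords" content="tax advice, mp3, free stuff">'
        '<title>Tax help</title></head>'
        '<body><p>We offer tax advice for small firms.</p></body></html>')
print(unsupported_keywords(page))
```

An honest page can still list a synonym that never shows up in its text, so this is at best one weak signal among several.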
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
Really, this article says nothing. Unless it states (and it does not) *exactly* what they mean by "unique", I'm not going to take this seriously. A more interesting statistic (and one I haven't seen updated in a while) would be the information conversion ratio between the "RealWorld" and the web, i.e. how much of the information that you can find in a library can you also find online in its entirety. That is a more accurate measure of growth than raw page numbers.
49.5% Broken links to mp3s
49.5% pr0n pages with javascript popups
1% other
We humans should be so proud of ourselves.
:)
This sig is false.
Google is one of the best search engines available for most purposes, because it ignores meta tags, and scores pages higher based on links to the site from other high-scoring pages (this is a recursive definition but the recursion bottoms out).
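The "recursive definition that bottoms out" can be sketched as a simple iteration in the spirit of the published PageRank idea. This is a toy version under my own assumptions (a damping factor of 0.85, a fixed number of rounds, pages with no outgoing links spreading their score evenly), and certainly not Google's production code.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Score pages by the scores of the pages linking to them.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    scores = {p: 1.0 / n for p in pages}  # start with equal scores
    for _ in range(iterations):
        # every page gets a small base score regardless of links
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # a page passes its score on, split among its links
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # no outgoing links: spread the score over all pages
                share = damping * scores[page] / n
                for other in pages:
                    new[other] += share
        scores = new
    return scores


# Tiny made-up web: three pages link to "hub", and "hub" links back to "a".
scores = rank_pages({"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]})
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

The recursion bottoms out because each round just reuses the previous round's scores, and the numbers settle down after a few dozen rounds. Note that "a" ends up ranked well above "b" and "c" only because the high-scoring "hub" links to it.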
The result of this is that it gives useful results even when very common words are used. Try searching for Linux on Google. The first ten results are
While a human being might be able to come up with a better list, a machine came up with that list, based solely on the structure of the web. (I wonder why linux.davecentral.com rates so high -- possibly because it's attached to a high-ranking site, davecentral.com).
ObAdvocacy: and Google runs on Linux.
Well, my take from the site is that what they're actually saying is "Look at our lovely indexing cluster. It can index 1 billion web thingies! Shouldn't you be buying a search engine product that powerful?"
Or, in other words, it's another example of meaningless statistics spewed in the name of marketing, vaguely disguised as serious research.
References: Car MPG & top speed figures vs actual usage, Processor MHz as function of system throughput, quoted battery life as function of laptop utilisation, quaketest FPS compared to average internet multiplayer experience etc etc etc...
--
I'd rather have a bottle in front of me than a frontal lobotomy