Slashdot Mirror


WWW Surpasses One Billion Documents

Gary William Flake writes "A new study by Inktomi and NEC Research Institute shows that there are at least one billion unique indexable Web pages on the Internet. The details are pretty interesting; for example, Apache dominates the server market."

157 comments

  1. Re:First Post by Anonymous Coward · · Score: 0

    Porn and quotes

  2. mcdonald's effect by dayeight · · Score: 1

    I can just see giant billboards now:


    Apache: Millions and Millions Served
    (For Free!)

  3. Nice :) by moeffju · · Score: 1

    Not first post, but first with my threshold ;) Seriously, it's nice to see Apache sticking out again. Should do fairly well for marketing Linux.

    --
    follow me on Twitter: http://twitter.com/moeffju
    1. Re:Nice :) by 0xdeaddeaf · · Score: 1

      Maybe it will help as a Linux marketing tool, but Apache runs on so many different flavors of Unix that I doubt it can be counted strictly as a win for Linux. The big sites are using Solaris and a lot use *BSD. There are some big sites using Linux (according to http://www.netcraft.com), such as eToys, DejaNews, and some little-known site called SlashDot. It is definitely a win for Open Source, and has been so for a long time.

    2. Re:Nice :) by Anonymous Coward · · Score: 0

      Apache is NOT Linux software.
      So there is no way Apache will market Linux, unless fucked up people make assotiations that does not exist.(i.e. stupid faggot journalists and Linux zealots.)

    3. Re:Nice :) by jams757 · · Score: 1

      Who pissed in your cheerios?

    4. Re:Nice :) by Anonymous Coward · · Score: 0

      Hey, asshole, you do not have to be gay to like linux or apache. you do not even have to be intelligent to post at slashdot, apparently. leave the bs 4th grade insults where they belong. /rant

  4. WOW by Vis · · Score: 0

    I can't believe it took this long.

    --
    -- Hi! I'm a .signature virus! Copy me into your ~/.signature to help me spread!
  5. In related news... by Sick+Boy · · Score: 2

    approximately 7 of them are useful.

    --
    Does narcissism count as a hobby? --Shawn Latimer
  6. the best part by Capt+Dan · · Score: 4

    Longest domain name:
    http://www.tax.taxadvice.taxation.irs.taxservices.taxrepresentation.taxpayerhelp.internalrevenueservice.audit.taxes.com


    gee. A tax site with a long, unintelligible, confusing domain name. Go figure.

    "You want to kiss the sky? Better learn how to kneel." - U2

    --
    Sig:
    Barbeque is a noun. Not a verb.
    1. Re:the best part by TheOpus · · Score: 1

      Capt Dan...

      I very much doubt this is, or ever was, a real site. It was definitely just used for search engine spamming, nothing else.

      - TheOpus

    2. Re:the best part by Anonymous Coward · · Score: 0

      that is not the world's largest domain name, it is just a domain with a bunch of subdomains tacked onto it. THAT IS NOT THE LARGEST DOMAIN NAME. DUH!

    3. Re:the best part by QuMa · · Score: 1

      First, let's pick that nit: It was probably a hostname.

      Secondly: It isn't anything at the moment, it won't resolve. I can't even resolve audit.taxes.com.

    4. Re:the best part by Nodatadj · · Score: 2

      I had in.2032.the.world.as.we.know.it.will.self-destruct.com whenever I was running "illegal"* servers off my university network.

      *There was nothing illegal about them, except that the university banned servers.

    5. Re:the best part by Otto · · Score: 1

      No, but if you check out http://taxes.com you see it is a domain name for sale by greatdomains.com or something like that...

      it's phony.
      ---

      --
      - Give a man a fire and he's warm for a day, but set him on fire and he's warm for the rest of his life.
    6. Re:the best part by bradipo · · Score: 1

      Unfortunately that domain never resolves... I don't even think that it is valid. The search engine probably just pulled it out of a web page document somewhere. Not a surprise given that it seems to think that RedHat is a http web server (below Roxen). :)

    7. Re:the best part by Anonymous Coward · · Score: 0

      Wow, so you had a vhost... go on IRC, it's not uncommon. Personally I used to have access to me.and.britney.spears.are.mad.sex.addicts.org along with a whole whack of other subdomains like that.

    8. Re:the best part by Jethro · · Score: 1

      I had spirit.of.the.renaissance.co.il, the.lost.continent.of.atlantis.com and is.this.a.gun.in.my.pocket.or.am.i.just.feeling.friendly.co.il

      Sysadminning for an ISP does have its perks...

      --


      In the land of the blind, the one-eyed man is kinky.
    9. Re:the best part by Nodatadj · · Score: 2

      The one I really want
      is
      i.should.co.co

      but I dunno how to register a hostname in Colombia (or wherever CO is)

    10. Re:the best part by RAruler · · Score: 1

      Bah.. some guy I know had Microsoft subdomains
      like..
      evil.microsoft.com
      the.fall.of.microsoft.com
      stuff like that.. now those are Vhosts

      --

      --
      Insert Witty Sig Here
  7. Billions Served by Imortus · · Score: 1

    Glorious gravy, the web has breached 1,000,000,000 indexable pages. And just like radio, network television, cable and satellite before it, the new gag is:

    A billion pages of information, and nothing's on.

    Now, if real life exemplified the web, we'd know that 85% of the earth's population speaks English and, as can be expected, the IRS's domain name proves to be a lesson in redundancy and triplicate.

    1. Re:Billions Served by Moofie · · Score: 1

      Wah wah, there's nothing on the internet, wah wah. Whatever. If you don't think it's useful, stop using it. I don't think TV is useful, so I use mine to a) hold down my dresser and b) play Gran Turismo and c) watch movies. I don't carp about how bad TV is, I just don't use it. It's called voting with your feet. The other alternative is to MAKE something useful to go on the Internet. Pundits who just sit around and carp about how nothing interesting is happening around them simply aren't paying attention.

      --
      Why yes, I AM a rocket scientist!
  8. And at least one of them already comments on that by dsplat · · Score: 4
    Yes, and the Jargon File already has a comment on that, originally from Theodore Sturgeon:

    Sturgeon's Law prov.

    "Ninety percent of everything is crap". Derived from a quote by science fiction author Theodore Sturgeon, who once said, "Sure, 90% of science fiction is crud. That's because 90% of everything is crud." Oddly, when Sturgeon's Law is cited, the final word is almost invariably changed to `crap'. Compare Hanlon's Razor, Ninety-Ninety Rule. Though this maxim originated in SF fandom, most hackers recognize it and are all too aware of its truth.


    --
    The net will not be what we demand, but what we make it. Build it well.
  9. Meaningless Statistics by (void*) · · Score: 3
    That's what I hate about such "statistics". No information or context is given. One is not told how this estimate of "one billion" is gotten. No details about the research methodology were forthcoming. Instead one is only supposed to stare slack-jawed in amazement at the touted figure of one billion and be impressed. That anyone can be impressed that THIS is an achievement is amazing.

    For all you know - the web has surpassed at least 1 webpage count. Big Fscking Deal!!!

    1. Re:Meaningless Statistics by luckykaa · · Score: 1

      I think what it actually was was that a search engine had one billion web pages indexed. Not that I disagree with you. This is totally meaningless. I have a couple of web pages that no one has linked to. I don't think they checked the database for broken links.

      How long does it take to count 1 000 000 000 links anyway?

    2. Re:Meaningless Statistics by re-geeked · · Score: 1

      From their "more info" link, they counted more than 700,000 "unreachable" sites, vs. more than 4 million "reachable" sites.

      --
      "You can't get something for nothing." - my grandfather, on the stock market and Reaganomics.
    3. Re:Meaningless Statistics by Postmaster+General · · Score: 1

      They're a tool to be used for many purposes. Luckily, it appears that in this case they're being used to represent facts (nothing is always as it appears to be though.) However, one has to wonder just how accurate these numbers are. You'd need an independent entity to do some type of verification, but then who'd verify those results? The verifier of the verifier, most likely.

      I guess if you keep verifying each set of results, we will eventually reach what could be collectively known as an "accurate" number. But who wants to spend all that time, when we can just take these numbers and assume that they're good? I admit, I certainly don't, and I am happily willing to say, "Hey Inktomi, and NEC Research Institute, thanks for the thorough study and its subsequent report! I can now sleep better at night knowing that my one web page on the internet is confirmed to be not alone! Way to go!"

      But wait! How am I supposed to know that my one teeny website was included in their numbers?!? Hmmm, guess I'll have to run my own study just to verify, but then someone else will have to verify my report ... bah, screw it. I give up.

    4. Re:Meaningless Statistics by fence · · Score: 1

      Let's ask Mr. Owl-

      Mr Owl, How long does it take to count 1 000 000 000 links anyway?

      Mr Owl: "ah one, ah two, ah three *CRUNCH*-- ah three"

      There you go folks, it takes three to count 1 000 000 000 links. Thank you, Mr. Owl!

      --
      Interested in the Colorado Lottery or Powerball games?
      check out http://colotto.com
    5. Re:Meaningless Statistics by dsplat · · Score: 2

      That's what I hate about such "statistics". No information or context is given. One is not told how this estimate of "one billion" is gotten.

      Remember, 53.4% of all statistics are invented on the spot. Of those, 63.1% are never checked against any reliable source. The rest are attributed to a survey done by Expensive Management Consultants. You can buy a copy of the report from them for only $2499, which includes the introductory price of a year's subscription to their weekly newsletter containing the abstracts of other reports you can purchase, at a substantial 10% discount off the regular price that no one ever pays them anyway.

      --
      The net will not be what we demand, but what we make it. Build it well.
    6. Re:Meaningless Statistics by Anonymous Coward · · Score: 0

      your link is broken. should be file:///dev/null

    7. Re:Meaningless Statistics by dogbowl · · Score: 1

      "People can come up with statistics to prove anything, Kent.
      Forty percent of all people know that."


      --

      These pretzels are making me thirsty.
  10. Open Source by Anonymous Coward · · Score: 0

    Apache is still holding around 60%. It may take a while for the industry to change, but eventually the vast majority of software will become open source, at all levels. The market will switch to a service model. Software will no longer be 'sold' per se. It will be provided. What will be sold will be ports, extensions, customizations, translations, support, codebase maintenance commitments, and commissions for new code. A lot of business and PR types are against the Open Source movement because they only understand the old model of software business. In the new order there will be lots of money to be made; however, it will be harder to concentrate it like Microsoft did. And any company's main and in fact only asset will continue to be their coders.

  11. Heh... by Anonymous+Commando · · Score: 2

    <DrEvil>One... billion pages</DrEvil>

    Sorry - couldn't resist. :=]
    ________________________

    --
    Corporate Jenga: You take a blockhead from the bottom and you put him on top...
  12. Longest Domain Name by pvthudson · · Score: 1
    Check out the longest domain name: http://www.tax.taxadvice.taxation.irs.taxservices.taxrepresentation.taxpayerhelp.internalrevenueservice.audit.taxes.com
    Of course it's about taxes. You've got to hand it to the IRS, even their URLs are hard to read and understand. I wasn't able to open this link; can anybody else?

    --


    Its karma, Kramer.

    1. Re:Longest Domain Name by networkz · · Score: 1

      The URL doesn't exist.

      Taxes.com isn't in use (except by a domain hoarding company).

      The IRS has nothing to do with it either.

      Hmm...

  13. Why? by dsplat · · Score: 3

    Why is one of them Hamster Dance? Don't go there with an 18 month old child on your lap. For an adult, this is funny once. For a toddler, it is funny every time the computer is on.

    --
    The net will not be what we demand, but what we make it. Build it well.
    1. Re:Why? by Bad_CRC · · Score: 1

      http://www.newgrounds.com/assassin/index.html

      choose hamsterdance from the list.

      unless you have a small child with you. (it's the only cure)

    2. Re:Why? by wuukiee · · Score: 1

      Yes but one of the pages is also Hampsterdeath :)

      Tired of watching them dance, blow them up instead!

    3. Re:Why? by dsplat · · Score: 2

      The link you provided doesn't respond well. I think they've been slashdotted. So I did a search at Google for Hamsterdeath and found this. Enjoy!

      --
      The net will not be what we demand, but what we make it. Build it well.
    4. Re:Why? by wuukiee · · Score: 1
      sorry! i lied... the correct URL is Hamsterdeath
      i misspelled it because hampsterdance.com has a 'p' in it but hamsterdeath *doesn't*. oops... well have fun with it anyways :)

  14. technically inaccurate statistics by TheCodeMaster · · Score: 3

    dynamic content makes the technical quantity of distinct "pages" far greater than a billion.

    1. Re:technically inaccurate statistics by inkognito · · Score: 1

      Yeah, except that Inktomi doesn't index purely dynamic pages... nor give them much relevancy.

  15. All I want... by thefatz · · Score: 1

    is that big cluster of Sun boxen. Now that is some serious adminning and "play?". Just think of clustering them all together and playing a mean game of Quake. Enough processing power to run the super bot that is smart enough not to get detected. I'm dreaming.

    --
    http://www.freebsd.org
  16. Seems almost insignificant by BlightX · · Score: 1

    Alright, there are 6 billion people on the planet. Many of these people work for companies or governments that have webpages that are probably hundreds of indexable pages. Some places auto-update things, producing hundreds of indexable pages. There are also millions of (pointless) personal sites, and some people manage more than one site. I'm shocked that we're only at one billion now.

    -BlightX

    1. Re:Seems almost insignificant by The+Salamander · · Score: 1

      > There are also millions of (pointless) personal sites

      Personal sites are what the web is for! All
      this commercial and ecommerce stuff is just silly.

    2. Re:Seems almost insignificant by Jon_H · · Score: 1

      There are also millions of (pointless) personal sites

      Why are personal sites pointless? Just because most of them aren't things you want to read doesn't make them, IMHO, useless.

      In fact it's the empowerment that enables Ordinary Joe to publish his personal page that makes the web what it is and not just a virtual shopping center.

      I'd just like all six billion people to be able to participate.


      --
      I used to have a sig but I left it on a bus ...
  17. Apache is the largest by Menthos · · Score: 1
    The details are pretty interesting; for example, Apache dominates the server market.

    Anybody following Netcraft's Web Server Survey already knew this. But it's still nice to get it confirmed from additional sources.

    --

    GNU/Linux. The Freshmaker.

    1. Re:Apache is the largest by gorilla · · Score: 2
      I think this is a different measure.

      Netcraft's measure is by number of servers, while this measure is by number of pages.

      It's not surprising that they both agree, but it's certainly possible that larger sites might have a different server to the average site, causing a difference.

    2. Re:Apache is the largest by god · · Score: 1

      Note that Inktomi say the "number of reachable Web sites" is 4,217,324, while Netcraft found 9,560,866 last month. Isn't this a bit poor for a company that's trying to index the web? Even Netcraft reckons it isn't finding the whole web...

    3. Re:Apache is the largest by gorilla · · Score: 2

      Grabbing just one page from each server is going to be faster than spidering the entire site. Therefore I'd expect netcraft to be ahead of all the search engines.

  18. Indexable webpages by Anonymous Coward · · Score: 1

    The web has infinite amounts of indexable webpages, just look at dynamic webpages and CGI driven webpages. If you want proof go search for [A-Z][A-Z][A-Z][A-Z][A-Z][A-Z][A-Z][A-Z] at www.altavista.com. That's an example of 208827064576 indexable webpages (Each one different). Wee.
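The figure quoted above can be checked with a line of arithmetic. This is only a sketch of the count of distinct strings that eight consecutive [A-Z] classes can match; the constant names are made up for illustration, and it says nothing about how AltaVista actually handled such a query.

```python
# Count the distinct strings matched by eight consecutive [A-Z] character classes.
ALPHABET_SIZE = 26    # letters A through Z
PATTERN_LENGTH = 8    # eight [A-Z] classes in the query

# Each position is chosen independently, so the counts multiply.
distinct_strings = ALPHABET_SIZE ** PATTERN_LENGTH
print(distinct_strings)  # 208827064576
```
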

    1. Re:Indexable webpages by Coriolis · · Score: 1

      Shades of David Langford's "Net of Babel" (after Borges). Or see here for a real demonstration that the 'net contains an infinite amount of data (although it'd be stretching to call it "information").

      --
      Rgasuya aata! : I have been coding Perl and cannot tell where my fingers are now!
    2. Re:Indexable webpages by Anonymous Coward · · Score: 0

      Click a tab for more on "[A-Z][A-Z][A-Z][A-Z][A-Z][A-..."

      WEB PAGES 7 pages found.

  19. Inktomi, publicity, and mod_perl by billh · · Score: 3

    Well, as any of us geeks know, this isn't really news. I'm sure we passed the billion mark a long, long time ago. Inktomi just wants the publicity, and some news service will probably pick this up, most likely CNN.

    One thing of interest, though. If you look under the "Web server market share", Red Hat and mod_perl are apparently web servers now.

    1. Re:Inktomi, publicity, and mod_perl by annarchy · · Score: 2

      some news service did pick this up...slashdot.

  20. I believe it... by Rantage · · Score: 1
    ...and at least 500 million of those pages are at microsoft.com. I know, from personal experience (just the other day it took me 20 minutes to find PWS for Win95). Ever tried to find something buried in there?

    Online gaming for motivated, sportsmanlike players: www.steelmaelstrom.org.

  21. Apache Dominates... by Midnight+Ryder · · Score: 1

    Just looking at the top three:

    Apache 60.33%

    Microsoft-IIS 25.26%

    Netscape-Enterprise 3.79%

    Wow - Apache still kicks everyone else's butts, and not by a small margin! I think Apache is about the perfect case for OSS development - not just being a blip on the radar getting larger, but, covering almost the entire radar screen!

    I'd love to see more stats out of Inktomi on this, but it's still cool to see what little they did provide (261,472 links to MP3.com should say something about the digital music scene).

    --

    Davis Ray Sickmon, Jr - looking for something to read? Check out my three free novels at MidnightRyder.org

    1. Re:Apache Dominates... by Anonymous Coward · · Score: 0

      umm, yeah right - simple market share stats validate OSS development methods. i guess i'll go back to windows 95 & ms office for all my needs (other than web servers, of course)

  22. MS uses Inktomi uses Sun by djKing · · Score: 1
    see: MS Press Release and Inktomi Server Cluster

    Stuff like that make me smile ;)

    -Peace
    Dave

    --
    Free as in "the Truth shall set you..."
  23. Did they bump the count to extraghost.com? by foolishj · · Score: 2

    So were there three links to www.extraghost.com before they wrote the page, or after? And which one of the band members works at Inktomi? And will it be four after I post this comment?

  24. 1,000,000,000+ and what do we have? by Maul · · Score: 1
    I'm sure we've all noticed that there are many great pages out there, pages like this and others that provide good hubs of communication, and also several good personally run sites that have good information, or are at least entertaining.

    Also note that while these pages exist, there is also a lot of random crap out there that really just wastes space and time. As the number of pages increases, I'm sure that it will be harder and harder to find quality documents among the wasteland of stuff we don't need.

    "You ever have that feeling where you're not sure if you're dreaming or awake?"

    --

    "You spoony bard!" -Tellah

    1. Re:1,000,000,000+ and what do we have? by British · · Score: 1

      A lot of those web pages unfortunately are those cheap-o templates made by Angelfire users (the shopping list, etc.) and have absolutely no original content whatsoever, and are never updated. I say we get rid of 'em.

    2. Re:1,000,000,000+ and what do we have? by Moofie · · Score: 1

      Good thing you don't run the Internet. Why do you care? Are they hurting you somehow? Leave the poor schlocks alone. Everybody's got to start somewhere.

      --
      Why yes, I AM a rocket scientist!
    3. Re:1,000,000,000+ and what do we have? by Anonymous Coward · · Score: 0

      Some people have jobs. COUGHCOUGH

  25. suspicious by annarchy · · Score: 1

    These results seem a little strange to me; there is no explanation or context for the results.

    why did they list the number of links to rickymartin.com or cooking.com

    why did they list the longest url as a nonworking URL that was probably used to spam the search engines?

    oh great, uh hey guys, today I have determined there are 1 billion webpages!

    1. Re:suspicious by chrischow · · Score: 1
      > why did they list the number of links to rickymartin.com or cooking.com

      it's called an example

  26. Thaaat's great... by Greyfox · · Score: 4
    Now INDEX it.

    Finding information on the web is going to increasingly be like trying to find hay in a needle stack. Already the current indexing engines can't keep up, and you have unscrupulous web authors putting bunches of keywords unrelated to their site in their meta tags to ensure that they get mentioned in every single search. Some indexing engines already ignore meta tags for that reason. And how many times have you tried Altavista, Excite or Google only to find that the page you're trying to get to has expired or is 8 years old and hasn't been changed in 7?

    This issue is going to have to be addressed, because the web is going to continue growing.

    --

    I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

  27. So... has anyone figured out how many monkeys? by szyzyg · · Score: 1

    An almost infinite number of monkeys banging away on a similar number of typewriters will eventually reproduce the works of Shakespeare.

    The internet disproves this hypothesis.

    But seriously - has anyone figured out how long it would take to reproduce certain random documents, such as the works of Shakespeare?

    1. Re:So... has anyone figured out how many monkeys? by Anonymous Coward · · Score: 0
      It isn't possible to figure it out, since, given truly random input, it can't be proven that it will eventually produce any given output, just what the chance of it occurring in a certain number of attempts is.

      As for reproducing Shakespeare.. First you have to decide on how exact you want it. Are spelling mistakes accepted? Should it be on a specific format? Is correction accepted?

      Then you have to calculate the total number of possible legal versions.

      Then you have to calculate the total number of possible combinations.

      Then you have to calculate how long it takes the monkeys to actually hit a typewriter key on average, and decide on what you'd do with regards to breaks, feeding, replacing broken typewriters, and calculate how that would affect the time.

      At that point, it's an easy matter to calculate how long you must give them for the likelihood to rise above a certain threshold.

      But that would be the likelihood... It could happen almost right away, or not at all.

      But possible? Sure it is. But I doubt you'd live long enough to be able to verify whether they succeeded or not ;)
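The likelihood calculation described in this comment can be sketched as follows. It is a simplified model, not anything from the study: it assumes every key is hit with equal probability, every attempt is independent, and exactly one version of the text counts as correct. The function name and parameters are hypothetical.

```python
import math

def chance_of_success(alphabet_size: int, text_length: int, attempts: int) -> float:
    """Probability that at least one of `attempts` independent, uniformly
    random strings of `text_length` characters equals one target text."""
    # Chance that a single attempt matches: (1 / alphabet_size) ** text_length,
    # computed through logarithms so very long texts underflow quietly to 0.0.
    p = math.exp(-text_length * math.log(alphabet_size))
    if p >= 1.0:
        # Only one possible string exists, so the first attempt always matches.
        return 1.0
    # 1 - (1 - p) ** attempts, written with log1p/expm1 to keep precision
    # when p is tiny.
    return -math.expm1(attempts * math.log1p(-p))

# Example: an 18-character phrase on a 27-key typewriter, a billion attempts.
print(chance_of_success(27, 18, 10**9))
```

Accepting spelling mistakes or several "legal versions", as the comment suggests, would raise the per-attempt chance; in this sketch that would mean multiplying p by the number of acceptable versions.
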

    2. Re:So... has anyone figured out how many monkeys? by cobyrne · · Score: 1

      An almost infinite number of monkeys banging away on a similar number of typewriters will eventually reproduce the works of Shakespeare.

      An almost infinite number of monkeys banging away on a similar number of typewriters will create...

      ... one hell of a mess!

    3. Re:So... has anyone figured out how many monkeys? by Anonymous Coward · · Score: 0

      http://www.google.com/search?q=shakespeare indicates that us monkeys have managed to pound out some 114,000 pages of Shakespeare.

    4. Re:So... has anyone figured out how many monkeys? by cpt+kangarooski · · Score: 1

      "It was the best of times, it was the blurst of times?!? Stupid monkeys!"

      --
      -- This and all my posts are in the public domain. I am a lawyer. I am not your lawyer, and this is not legal advice.
  28. unique? by Signal+11 · · Score: 4
    Well, yahoo has hundreds, nay thousands, nay hundreds of thousands of "uniquely indexable pages" in their database. It's a web of links. How does one define unique?

    Really, this article says nothing. Unless it states (and it does not) *exactly* how they mean "unique" I'm not going to take this seriously. A more interesting statistic (and one I haven't seen updated in a while) would be what the information conversion ratio is between the "RealWorld" and the web - i.e. how much information that you can find in a library can you also find online in its entirety. That is a more accurate measure of growth than raw page numbers.

    1. Re:unique? by TerraAlien · · Score: 1

      A more interesting statistic (and one I haven't seen updated in a while) would be what the information conversion ratio is between the "RealWorld" and the web - i.e. how much information that you can find in a library can you also find online in its entirety.

      This is an interesting measure, but I'd say if you used it as a primary measure, you would miss a lot for two reasons. First, there is a plethora of extremely useful information on the web that wouldn't exist in written form except for the web. Take the guy who likes to make bows and arrows the 'old-fashioned way,' and wants to share his collection and how others can do it too. If the web didn't exist, he probably wouldn't ever do that, because of the prohibitive costs of publishing, and the fact that while he has a few pages' worth of information, he doesn't have a book's worth.

      OTOH, while it would be really cool to have The Hobbit in electronic format, so I could download it onto my Palm, I personally don't want to curl up with my laptop to read a good book. The same goes for a lot of reference works, too. While it would be great to read the shop manual for my car online, if I'm going to work on it, I want to be able to turn pages without worrying about getting the keys greasy.

      In the end, I think it comes down to this: the web isn't the real world, and it never will be (even if you believe Neuromancer is a prophetic book ;-). The uses for a library and the web will overlap, and consolidation will happen, but one will never replace the other.

      Course, now that I've said that, I'll temper it by saying that personal book-binding machines will be cool. The ability to take a traditional work in electronic form off the web, and have it in a form I can stretch out with on the couch in an hour is enticing. Seems like a real waste of paper, though.

      "Real life" makes a good starting line, but let's not measure the race by it.


      -----
  29. 1 Billion useless pages. by Dast · · Score: 4

    49.5% Broken links to mp3s
    49.5% pr0n pages with javascript popups
    1% other

    We humans should be so proud of ourselves.

    :)

    --

    This sig is false.

    1. Re:1 Billion useless pages. by pnevares · · Score: 1
      49.5% pr0n pages with javascript popups


      And to expand on that statement....each of those popups is another page adding to the "1 billion".

      So the ratio is like 1 pr0n page to 15 popups! =)



      Pablo Nevares, "the freshmaker".
    2. Re:1 Billion useless pages. by MS · · Score: 1
      That's still 10.000.000 "other" pages left.

      At a speed of one page per minute, it will take the rest of my life to read them all (about 57 years, considering that I can't read more than 8 hours a day: I'll also have to eat and sleep, ...).

      :-)
      ms
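The 57-year estimate above holds up under the stated assumptions. A quick sketch of the arithmetic, taking 1% of one billion pages, one page per minute, and 8 reading hours a day (the variable names and the 365.25-day year are choices made here for illustration):

```python
PAGES = 10_000_000           # the 1% "other" share of one billion pages
PAGES_PER_MINUTE = 1         # reading speed assumed in the comment
READING_HOURS_PER_DAY = 8    # the rest of the day goes to eating and sleeping

pages_per_day = PAGES_PER_MINUTE * 60 * READING_HOURS_PER_DAY
days_needed = PAGES / pages_per_day
years_needed = days_needed / 365.25
print(round(years_needed))   # 57
```
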

  30. Re:And at least one of them already comments on th by DjMau · · Score: 1

    How can 90% of Internet content be crud if over 50% of it is p0rn ;)

  31. Re:And at least one of them already comments on th by DjMau · · Score: 0

    How can 90% of Internet content be crud if over 50% of it is p0rn ; )

  32. Always remember... by Mezz · · Score: 1

    ...there are three types of lies: lies, damn lies, and statistics. Take from that what you will. BTW, 90% of everything is CRAP (or crud...or even s#|%)

  33. Always remember... by Mezz · · Score: 1

    ...there are three types of lies: lies, damn lies, and statistics. Take from that what you will. BTW, 90% of everything is CRAP (or crud...or even s#|% depending on your frame of mind at any given time;^D)

  34. A public or private search engine? by dattaway · · Score: 2

    They say they are the world's largest search engine and I get many hits spanning my pages from *.inktomisearch.com, but how do you search their site?

    Is inktomi publicly searchable? If it is not, then my pages wouldn't be publicly searchable. So, what's the point of them making connections to my sites?

    Is the following how you ban a site from your server?

    /etc/httpd/conf/access.conf
    #deny from domain

    1. Re:A public or private search engine? by davew · · Score: 1

      Inktomi, last I looked, don't run a search engine site; they develop the tech and license it to others who get involved in the messy business of making a popular search engine site.

      IIRC, their highest profile customers are Hotbot, who used Inktomi from the start, and Yahoo!, who switched from Alta Vista to Inktomi. Inktomi is a more neutral backend for Yahoo!, who are competing in the same market as Alta Vista.

      Dave

      --

    2. Re:A public or private search engine? by Gord · · Score: 1

      You probably are searching it already; check out this page to see what sites are powered by Inktomi.

    3. Re:A public or private search engine? by Steven+Pulito · · Score: 1

      I believe the Inktomi software and database of links is the guts behind a couple of the search engines out there. If my memory serves me correctly their technology powers www.hotbot.com

    4. Re:A public or private search engine? by Hall · · Score: 1
      How to keep Inktomi from indexing your site

      First, why do you not want them to index your site ?

      Second, if you've read the other replies to your question, you might want to re-consider...

      Finally, I believe all the search engines will ignore you if you follow the steps they give. That is, if they follow the rules.

    5. Re:A public or private search engine? by JoeBuck · · Score: 2

      Inktomi sells their technology to other companies; they don't operate a search engine under their own name. HotBot is Inktomi-based; there are others as well but I don't know who.

    6. Re:A public or private search engine? by dattaway · · Score: 2

      Thanks for the good page with all the answers. It wasn't immediately obvious how to search their web site. You see, I get hundreds of entries for inktomi.com and others over my 56K dialup. Naturally, I'm curious to see what they are and do a www.them and didn't see anything useful until now.

    7. Re:A public or private search engine? by jfunk · · Score: 2

      I just checked out hotbot to see if any of my sites (which are constantly hammered by the Inktomi crawler, as I'm sure is the case with most sites) would come up.

      No hits.

      Google finds them, though.

      Something's definitely amiss regarding Inktomi.

    8. Re:A public or private search engine? by JoeBuck · · Score: 2

      Hotbot uses Inktomi technology. They don't use Inktomi's database (I don't know who does).

    9. Re:A public or private search engine? by jfunk · · Score: 2

      Hotbot uses Inktomi technology. They don't use Inktomi's database

      Ahh, I see now. They are crawling my sites but not letting anybody search the results unless they pay big bucks.

      Hmmm, looks like I'll be making a modification to my robots.txt files and possibly adding some new rules to my firewall.

      I should be allowed to find out what info about my sites they are trying to sell. If I can't, they won't be getting access.
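
      A minimal sketch of that robots.txt change, checked with Python's standard urllib.robotparser before deploying it. It assumes the Inktomi crawler identifies itself with the user-agent token Slurp; that token and the example.com URLs are assumptions, so check your own access logs for the real name. This only helps against crawlers that follow the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one crawler, leave everyone else alone.
# "Slurp" is an assumed user-agent token, not confirmed by this thread.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler asks this question before fetching each URL.
slurp_allowed = parser.can_fetch("Slurp", "http://example.com/index.html")
other_allowed = parser.can_fetch("SomeOtherBot", "http://example.com/index.html")

print("Slurp allowed:", slurp_allowed)
print("Other crawlers allowed:", other_allowed)
```

      A firewall rule is the blunter fallback for a crawler that ignores the file.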

  35. Or from the late Carl Sagan by Dast · · Score: 1

    Billions and billions of pages lost in the cosmic consciousness...

    --

    This sig is false.

  36. *Sarcasm* Gee Wiz by Slak · · Score: 1

    According to Netcraft (http://www.netcraft.com/survey/), "In the December 1999 survey we received responses from 9,560,866 sites". If each site has 1000 pages (not terribly unreasonable) we're at 9.5 billion, nearly 10 times more than this PR-plug. And this is only counting static pages; my guess is that auctions on eBay do not count. I wonder if they count Deja - how many pages do you think they have in all those news groups?
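
    The back-of-the-envelope estimate above, spelled out. The 1,000 pages-per-site average is the guess made in this comment, not a measured value.

```python
sites = 9_560_866               # Netcraft, December 1999 survey responses
pages_per_site = 1_000          # assumed average; a guess, not a measurement
study_estimate = 1_000_000_000  # the "one billion" headline figure

estimated_pages = sites * pages_per_site
ratio = estimated_pages / study_estimate

print(f"{estimated_pages:,} pages, about {ratio:.1f} times the study's figure")
```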

    The Internet is large. Leave it at that.

    Cheers,
    Slak

  37. Creepy - by NoizAngel · · Score: 1

    Almost 4000 links point to rickymartin.com -
    I'm just curious if that was supposed to be impressive or disturbing. Of course, a good lot of those one billion pages are made by teenagers so-

    -Noiz,
    Who thinks Ricky Martin looks too much like a clone to be a "hottie".


    ---------

    --

    ---------
    I'd kill for a Nobel Peace Prize.
  38. Offtopic: Dilbert/Ratbert by Anonymous Coward · · Score: 0

    Haha! I wonder how many have noticed?!?!?

  39. Inktomi false advertising? by re-geeked · · Score: 1

    Did I fall asleep for 20 years, or are Inktomi's claims about its search software a little inflated? They stop just short of claiming to read my mind and provide the doc I want as soon as I open the browser.

    Someone please tell me if I'm missing some great coolness here. After all, I haven't used anything other than Google for months.

    --
    "You can't get something for nothing." - my grandfather, on the stock market and Reaganomics.
  40. only a billion? by Bad_CRC · · Score: 1

    seems that I've been on websites that appeared to have more pages than that.

  41. 1 Billion no phone by djKing · · Score: 1

    There are still a billion folks in the world who haven't even made a phone call.

    -Peace
    Dave

    --
    Free as in "the Truth shall set you..."
    1. Re:1 Billion no phone by Anonymous Coward · · Score: 0

      I believe the correct figure is that only one billion people have made a phone call. Most of the world is quite poor.

    2. Re:1 Billion no phone by Anonymous Coward · · Score: 0

      Make that: about 85% of the people on this planet have never made a phone call.

  42. Re:And at least one of them already comments on th by DaveHowe · · Score: 1
    How can 90% of Internet content be crud if over 50% of it is p0rn ;)
    because 90% of online p0rn is crap too :+)

    erm

    or so I am told :+)
    --

    --
    -=DaveHowe=-
  43. But what percentage of web pages are pr0n? by Anonymous Coward · · Score: 0
    Geek: "I invented a way to download porn from the internet one million times faster!"

    Marge: "Does anybody really need that much porn?"

    Homer: "Mmmmm.... one million times.... aaaagggggghhhhh!!!!!!!!!!"

  44. Did anyone else notice the language bias? by gordie · · Score: 1

    Are they trying to claim that all pages are in English, French or Dutch? What does this indicate as to the rest of their research? I would have thought that the number of pages in Russian (Cyrillic) or one of the eastern languages such as Korean or Japanese, would have been statistically significant enough for inclusion. Makes me wonder about the validity of any of their numbers.

  45. RedHat? by RenQuanta · · Score: 1
    From their details page:


    Apache                  60.33%
    Microsoft-IIS           25.26%
    Netscape-Enterprise      3.79%
    Rapidsite                2.07%
    Lotus-Domino/Release     1.47%
    thttpd                   1.37%
    WebSitePro               1.21%
    WebSTAR                  0.93%
    Zeus                     0.76%
    Stronghold               0.71%

    NCSA                     0.47%
    CnG                      0.34%
    BESTWWWD                 0.34%
    Concentric               0.29%
    Roxen Challenger         0.20%
    Red Hat                  0.17%
    mod_perl                 0.16%
    tigershark/0.9.8-IC      0.13%


    Since when is RedHat a webserver and not a distribution? I'd like to know the method these guys used to get these stats, and why they listed Redhat as a server.
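
    One guess at the method, since the details page is not quoted here: surveys like this are commonly built by tallying the Server header that each site returns in its HTTP response. The sketch below assumes that approach. The header strings are made up for illustration, and server_product is a hypothetical helper, not anything from the study.

```python
from collections import Counter

def server_product(header_value):
    """Return the first product name in a Server header, without its version."""
    first_token = header_value.strip().split(" ")[0]
    return first_token.split("/")[0]

# Made-up Server header values, standing in for real crawl responses.
sample_headers = [
    "Apache/1.3.9 (Unix) mod_perl/1.21",
    "Apache/1.3.6 (Unix)",
    "Microsoft-IIS/4.0",
    "Netscape-Enterprise/3.6",
    "thttpd/2.04",
]

tally = Counter(server_product(h) for h in sample_headers)
shares = {name: 100.0 * count / len(sample_headers) for name, count in tally.items()}

for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:<22}{share:6.2f}%")
```

    If the tally keys on whatever string a server chooses to send, a vendor-branded build or an oddly ordered header could show up under its own name, which might explain entries like Red Hat and mod_perl.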

  46. Recount by Jupiter2 · · Score: 2

    Inktomi and NEC Researcher: "Oh no!!! I can't remember if I counted our own web page. ARRRGGGHH!!! 1, 2, 3, 4, 5, ................."

  47. MIrc has 10 registered users by fr0g · · Score: 0

    now this is news!

    http://www.ircnews.com/mirc.html

    1. Re:MIrc has 10 registered users by Anonymous Coward · · Score: 0

      I believed you for a minute there. (me gullible)

      I wrote some windoze shareware that was NOWHERE near as popular as mIRC, and I grossed almost 100K from that over a couple years. mIRC must be way above that.

  48. Inktomi vs. Google by sylvester · · Score: 1

    From the press release:
    "By examining the entire Web and analyzing the billions of links between all of its documents, Inktomi can distill an index of the highest quality documents to provide users with more relevant and intuitive results."

    Isn't that the "technology" that google has patented?

    1. Re:Inktomi vs. Google by markpapadakis · · Score: 1
      Actually, yes... Seems like Inktomi is borrowing ideas and technology from other rivals ( Google and DirectHit ) although Inktomi results suck big time. All that technology stuff they write in their page, is just that; crap.. Try searching on Yahoo! for something and then switch to Web Pages view to see. Google rocks. Inktomi sux.

      Mark Papadakis, WebDeveloper

      --
      Technology ramblings : Simple is Beautiful
  49. Using subdomains instead of directories! by Anonymous Coward · · Score: 0

    Actually, this is an interesting idea. Replacing each page on a web site with its own subdomain. I like it! All right DNS, let's see what you've got!

  50. UK or US? by Nodatadj · · Score: 2

    IE
    1,000,000,000 (US)
    or
    1,000,000,000,000 (UK)

    There's a large difference.
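
    The two readings side by side, as plain arithmetic. The names are only labels for the US (short scale) and traditional UK (long scale) meanings given above.

```python
short_scale_billion = 10 ** 9    # US usage: a thousand million
long_scale_billion = 10 ** 12    # traditional UK usage: a million million

ratio = long_scale_billion // short_scale_billion
print(f"{short_scale_billion:,} vs {long_scale_billion:,} (a factor of {ratio:,})")
```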

    1. Re:UK or US? by Anonymous Coward · · Score: 1

      Try using milliard.

  51. Use Google by JoeBuck · · Score: 4

    Google is one of the best search engines available for most purposes, because it ignores meta tags, and scores pages higher based on links to the site from other high-scoring pages (this is a recursive definition but the recursion bottoms out).

    The result of this is that it gives useful results even when very common words are used. Try searching for Linux on Google. The first ten results are

    • linux.org
    • linux.com
    • www.debian.org
    • www.linuxworld.com
    • linux.davecentral.com
    • www.varesearch.com (VA Linux)
    • linux.corel.com
    • www.li.org (Linux International)
    • lwn.net (Linux Weekly News)
    • www.linuxhq.com

    While a human being might be able to come up with a better list, a machine came up with that list, based solely on the structure of the web. (I wonder why linux.davecentral.com rates so high -- possibly because it's attached to a high-ranking site, davecentral.com).
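
    A rough sketch of how that recursive scoring can bottom out: start every page with the same score, then repeatedly let each page pass its score along its outgoing links until the numbers settle. This follows the commonly published PageRank-style iteration; the 0.85 damping value, the function name, and the toy link graph are assumptions for illustration, not Google's actual code or data.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by iterating: a page is important if important pages link to it."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical four-page web: everything links to "hub", nothing links to "c".
toy_web = {
    "hub": ["a", "b"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:<4}{score:.4f}")
```

    Note that meta tags never enter into it; only the link structure does.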

    ObAdvocacy: and Google runs on Linux.

    1. Re:Use Google by Zule_Boy · · Score: 1

      On the topic of google, try doing a search for "More evil than satan". Some of the top ten hits should make you laugh.

      --Evan

    2. Re:Use Google by Anonymous Coward · · Score: 0

      hah, that's funny - now try this

  52. Re:And at least one of them already comments on th by Anonymous Coward · · Score: 0

    The Dilbert Zone is using the wrong symbol for Red Hat on the Ratbert Index.

    ...just to see how much free advertising The Dilbert Zone can get from it.

  53. Hmmmmmm by haus · · Score: 1

    And I thought that cable television was a vast waste land...

    all persons, living and dead, are purely coincidental. - Kurt Vonnegut

  54. Hmmm...4 Billion pages... by shaldannon · · Score: 1

    And of those 4 billion, probably 1 billion are on AOL and another billion are on yageohooties (Yahoo+Geocities)/angelfire/dragonfire/.../. This means that at a reasonable guess, a minimum of half of the pages on the net consist of a purple background (or image) with lime green text, broken html, and a couple dozen animated gifs reminiscent of a carnival (and no content beyond "Hi, my name is _____ I was born in _____, my drivers license, SSN, and major credit card are _____, _____, and _____.").

    Geez....I say that there are far too many people on the net who just don't belong, and freedom of speech or no, some people shouldn't be allowed to make web sites.


    Who am I?
    Why am here?
    Where is the chocolate?

    --


    What is your Slash Rating?
  55. Hear Hear!! by Anonymous Coward · · Score: 0

    This man is correct! The correlation between making associations that do not exist, and poking another man's anus, is extraordinary!

    1. Re:Hear Hear!! by jams757 · · Score: 1

      Sorry, I wouldn't know anything about poking another man's anus.

  56. One billion documents in the Inktomi index by Anonymous Coward · · Score: 1

    Inktomi are an American company, so one billion is a thousand million. That's the number of docs they have in their index. Inktomi was around a long, long time before Google, and their technology is a rather cool cluster-based one. It currently runs on Solaris for their search. Part of the "battle" in the search market is over the size of the index that each engine stores. Inktomi are currently trying to leapfrog their competitors (AltaVista et al.), which they have done nicely. Most people have at some time or another used Inktomi's search indirectly through hotwired.com, yahoo.com or one of the many other portals Inktomi power. As to "other languages": Inktomi are a multinational corporation providing services in Japan (goo.com) and a lot of European and South American countries.

  57. Impressive Marketing statistics by henley · · Score: 4

    Well, my take from the site is that what they're actually saying is "Look at our lovely indexing cluster. It can index 1 billion web thingies! Shouldn't you be buying a search engine product that powerful?"

    Or, in other words, it's another example of meaningless statistics spewed in the name of marketing, vaguely covered-up as serious research.

    References: Car MPG & top speed figures vs actual usage, Processor MHz as function of system throughput, quoted battery life as function of laptop utilisation, quaketest FPS compared to average internet multiplayer experience etc etc etc...

    --

    --
    I'd rather have a bottle in front of me than a frontal lobotomy
  58. Infinity by David+A.+Madore · · Score: 2

    Hair splitting alert ON.

    The number of (different) pages on the web is actually infinite. Here is a sample infinite component.

    (Actually it's finite because the maximal accepted length for a URL is finite. But it's way above the billions.)

    Note that these are not dynamical pages. Dynamical pages (i.e. pages whose content changes for the same URL) don't count: they're cheating.

    (The source used to generate this infinite number of pages is available under the GPL.)
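
    A minimal sketch of the idea, assuming nothing about the GPL'd source mentioned above: every URL maps to a fixed page that links to one more URL, so a crawler never runs out of links to follow. The page_for function and the URL layout are hypothetical.

```python
def page_for(n):
    """Return the fixed HTML for page number n; the same n always gives the same page."""
    return (
        "<html><body>"
        f"<p>This is page {n}.</p>"
        f'<a href="/{n + 1}">next</a>'
        "</body></html>"
    )

# A crawler following "next" links sees a new, distinct page at every step.
first_three = [page_for(n) for n in range(3)]
for html in first_three:
    print(html)
```

    Since the content for a given URL never changes, these are not dynamic pages in the sense used above, and only the URL length limit keeps the set finite.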

  59. In related news... by jd · · Score: 2

    The one billion documents were found to be a plot by The Cult of Arthur C Clarke to end the Universe - each page having a unique name of God on it.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  60. That's search engine trickery... by bero-rh · · Score: 1

    Many search engines score pages higher that have the keywords in the hostname, so creating tons of subdomains to get every possible keyword into the hostname might actually get the page in top positions for several keywords.

    Guess it's time someone anti-microsoft gets microsoft.ms.windows.windows2000.windowsnt.office.office2000.2000.windows95.windows98.mswindows.mswindows2000.sucks.org. ;)

    --
    This message is provided under the terms outlined at http://www.bero.org/terms.html
  61. Re:the real question is... by Anonymous Coward · · Score: 0

    In my opinion, 1 Natalie Portman site is one too many.

    Vivent longtemps la résistance de Natalie Portman!!

  62. Hasn't Apache always been #1? by Anonymous Coward · · Score: 0

    I've seen numerous stats on the different types of servers out on the net serving web pages, and I've never seen one when Apache was even close to being taken over by another type of server, was this a surprise to anyone?

    E.

  63. Good for the WWW, but where's my damn page??? by Gray_Wolf · · Score: 1

    Congratulations to the WWW on this accomplishment (Those of you who run or are members of porno sites: You don't deserve it). However, with the increasing number of homepages on the internet, who in the heck is gonna keep track of them all?? I am fully aware of the different directories and search engines, but they have such stringent rules for Link Submission that it discourages many newbies from starting a webpage. I am also fully aware of the necessity of the META tags required, but then there are also many other criteria that I don't think anyone is really aware of. I have repeatedly e-mailed Yahoo! and Excite to ask whether they have a criteria list for homepage submission, only to wind up with a reply from an automated service, then never hear from them again!! Luckily, news of my webpage gets around via word of mouth, not on some search engine, but I'm going to change that.

    But keeping track of all these billions of pages, will be difficult, and sooner or later, people are going to demand satisfaction! (Slap me with that glove again, and I'll give you satisfaction, in a .223 caliber!!)

    The Gray Wolf

    --
    My 80286 is like the Bible: I swear by it every night when I try to run something.
  64. 80% of all statistics are false! by Anonymous Coward · · Score: 0

    .

  65. Re:And at least one of them already comments on th by unDees · · Score: 1
    How can 90% of Internet content be crud if over 50% of it is p0rn ;)

    because 90% of online p0rn is crap too :+)

    Do you mean fecofilia, or just low quality? *impertinent smirk*

    --unDees

    --
    "I call a baby goat a 'goatse.'" -- my non-Internet-savvy 6-year-old stepdaughter
  66. Re:First Post by Anonymous Coward · · Score: 0

    AND SECRET SAUCE HGALHGLAGHLGHAGLHAGLHAG

  67. My idea, you can't patent it by dsplat · · Score: 2
    Okay, it isn't new and it isn't my idea originally, but I'll put a new spin on it. Is there a need for a moderated index to the most useful stuff on the web? Hey andover.net, I'm talking to you too. An index to everything open source related would be great. After all, an index to the whole web is a huge project that never ends and eventually sucks up all your free time. But it may be useful to have moderators rate the links on two factors:

    1. General usefulness of the information on the page/site. Good stuff is good, no matter how you got there.
    2. Specific applicability of the index to the page. Getting to the wrong good stuff or seeing too many links for a particular idea doesn't help.


    I'm willing to help moderate on some subjects.
    --
    The net will not be what we demand, but what we make it. Build it well.
  68. just a thought by hal9000 · · Score: 1

    hmm...
    so that means that if each and every page on the WWW were worth $100, then it would equal bill gates' pocket.
    that's nuts

    --
    Look out honey, 'cause I'm using technology; Ain't got time to make no apology
    1. Re:just a thought by Anonymous Coward · · Score: 0

      haha

      // MICROS~1 code
      if( page_on_www <= 100 )
      {
          purchase_possible = true;
          EmailBill( "You can buy the WWW" );
      }

  69. what a cheap plug by Claude+Debussy · · Score: 0

    >writes "A new study by Inktomi and NEC Research Institute show that there is at least one billion unique indexable Web pages on the internet. The details are pretty interesting; for example, Apache dominates the server market. "

    Is Apache paying Cmdrfucko to say shit like this? No need to mention Apache dominates the server market, we already know that, thank you..

  70. Large but Finite number of monkeys by Greyfox · · Score: 2
    I believe the original thought experiment calls for an infinite number of monkeys. It does not say anything about the infinite volume of monkey shit that would be produced over the course of the experiment.

    The Internet does not represent an infinite number of users (at least, not yet) but you're still more likely to get an infinite volume of monkey shit out of it while you try to dig up the works of Shakespeare.

    Or you could save time and go here.

    --

    I'm trying to teach myself to set people on fire with my mind... Is it hot in here?

    1. Re:Large but Finite number of monkeys by Anonymous Coward · · Score: 0

      Yeah, and given infinite time you could get the entire works of Shakespeare to appear spontaneously, and not have to worry about feeding or cleaning up after an infinite collection of monkeys (thermodynamics does not preclude spontaneously decreasing entropy, it just assigns a low probability to it). Of course the mean time to Shakespeare is ungodly huge, but no monkey shit.

  71. That's true. by Dast · · Score: 1

    I guess there is something to be hopeful about.

    :)

    --

    This sig is false.

  72. WWW is nice, but.... by Destacona · · Score: 1

    Gopher is better!

    Gopher doesn't clog up your screen with complex and confusing "images." Gopher cuts through all that meaningless drivel and gives you your information crystal clear.

    WWW's search engine naming conventions make them difficult to remember. Why is this? There isn't one, that's why! What does "altavista" mean to you? What about "lycos"? Sounds like a cough serum, doesn't it? "Yahoo"? Wie doggie! Gopher is different because most of Gopher's search engine names rely on a standard instead of incomprehensible gibberish. This standard is, of course, the Archie comics from your childhood (or perhaps your second childhood). You've got all your ol' pals Veronica and Jughead ready to surf the Goph and give you some friendly advice on where to go next. Gopher is truly better than the WWW!

    God bless Gopher, each and every one!

  73. what to expect by Anonymous Coward · · Score: 0

    watch ms go down even more and apache go up, for the same reason the pentagon got rid of ms and went up with apache.

  74. Re:And at least one of them already comments on th by Anonymous Coward · · Score: 0

    Actually, Redheads, Inc. ain't doin' too bad!

  75. s/n ratio? last update? press releases... by xTown · · Score: 1
    10^9 is great, but what's the signal to noise ratio? How many of them are "Hay I am 133t d00dz give me warez and pr0n!"? I mean, the s/n ratio on the Web is pitiful anyway...

    And how many of those one billion web pages are actively updated? I quit a job with (unnamed employer) a year and a half ago, and nobody has updated the Web server there since I left. The only reason it no longer has my name on it is because I changed the contact info...but the content is completely unchanged.

    Finally, this is a press release. Press releases are written by the companies or commissioned by companies and distributed to news agencies which usually don't bother to do any but the most basic redaction. It's like free advertising. As I believe someone else pointed out, this isn't a "hey cool, a billion Web pages," it's "hey, our indexing software can index a billion pages, don't you want to buy it?". Always be wary of the source...usually, if you read something positive about a product or a technology, you can bet that somebody is getting paid for it. (Yes, this includes reviews...remember, we have to keep the advertisers happy.)

  76. Web Antiques by selenakyle · · Score: 1

    I would have liked to see what the oldest pages were. You know, NO updates since 1995, or whatever? Web senior citizens.

    :-)
    sk

    1. Re:Web Antiques by otis+wildflower · · Score: 2

      Check out Ghost sites...

      Your Working Boy,

  77. What are the odds... by Anonymous Coward · · Score: 0

    ...that the 1,000,000,000th document was porn? One chance in five? One in three?

  78. pointless post? by RoLlEr_CoAsTeR · · Score: 1

    So, knowing all of this, why did /. post this?

    Makes me wonder........

    --

    Insert mind here.
  79. Clinton's Proclamation on the Event by Black+Art · · Score: 1

    And in other news...

    President Clinton has proclaimed the One Billionth web page to belong to a young Bosnian orphan of indeterminate gender.

    --
    "Trademarks are the heraldry of the new feudalism."
  80. Re:Use Google [it cheats] by Anonymous Coward · · Score: 0
    While a human being might be able to come up with a better list, a machine came up with that list, based solely on the structure of the web.

    no it didn't.

    I work for one of google's competitors and we did *exactly* what they claim they are doing and got completely different results. They apparently are crawling sites like yahoo and dmoz and using positions there to affect their ranking...

    google also now uses RealNames[tm], which doesn't run on linux and is an arm of the old bad intellectual property junta

    don't believe the hype...

    these comments do not represent those of my employer. in fact, I'd probably get busted if they found out I did this!

  81. The key is "indexable" by Kozz · · Score: 1

    dynamic content makes the technical quantity of distinct "pages" far greater than a billion.

    Surely you are correct. However, the operative term in that phrase is "indexable". I'm quite sure that neither Inktomi nor many other "spiders" such as AltaVista (to name one big one) can traverse links to dynamically generated pages. So even if the number of indexable pages is over one billion, that indeed leaves much content out of the big picture.


    Quidquid latine dictum sit, altum viditur.

    --
    I only post comments when someone on the internet is wrong.
  82. Re:Use Google [it cheats] by cherub · · Score: 1

    Okay, I'll admit I'd be happier if a *machine* had determined that yahoo and dmoz were worth crawling for ranking information, but let's face it, that's pretty hard to do.

    The fact remains that google outdoes every other search engine on the net, and returns useful links for obscure queries that would be a lost cause on other search engines.

  83. 1 billion dead links? by Tokyo+Joe · · Score: 1


    Ok, so they have a large database of links. These links point to pages that were online once, and may in fact be online now.

    So! The web has grown. But it's grown like algae across a fish pond. Some of it is useful (for fish food) but most of it's a waste of space. An example: I have been looking all day for deck plans for the Vasa without luck.

    Plenty of Mediterranean cruise liners with deck plans. It's the same for most searches (unless you are looking for porn, good strike rate there)...

    1 billion pages, yet my local library is more often a better source of information... Usenet is more flames and spam than useful chatter. The Net is becoming a has-been; the golden age has passed us already. Sure, video and audio streaming seem cool, but I already have a TV and stereo. What I want is a world class library, free and at my fingertips.

    --
    Tokyo Joe
  84. Another (far more important) angle.... by Anonymous Coward · · Score: 0

    You know what that means? Assuming, on average, one web page per individual, that means that 4/5ths to 2/3rds of the world's population does not yet have A WEB PAGE!!!! It also means that most or all of "The Internet" is controlled by only 1/5-1/6th of the world's population?!!!! Think about that before all of you web developers start patting yourselves on the back. It's the socially responsible thing to do ;)

  85. Re:Use Google [it cheats] by Anonymous Coward · · Score: 0
    I work for one of google's competitors and we did *exactly* what they claim they are doing and got completely different results.

    How do you know that you are doing "exactly" what Google does? Do you have the Google source? Maybe your search engine is just crap and stealing a few features from Google isn't enough to make it good.

  86. One billion channels and nothing on ... by fable2112 · · Score: 2


    Just another for-all-practical-purposes-meaningless statistic to nonetheless feel overwhelmed by, I suppose.

    If there were a billion pages to look at, I don't know when I'd have the time to do anything else, being the info-junkie that I am. Fortunately, a sufficient quantity of these pages do not interest me. :)

    Then, too, I wonder how many of these pages are de facto duplicates? ("Department of redundancy department, redundant division speaking ...") For instance, I'm right in the middle of moving my pages off of geocities and onto drak.net. At the moment, the pages that I've put up on drak.net that were part of my old geocities page still exist on geocities because I'm not done moving everything yet, and can't shut down my old page until EVERYthing is transported. I went through a similar process when I moved TO geocities from my college web page two and a half years ago.
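
    One way a crawler could discount de facto duplicates like these is to fingerprint each page's content and count distinct fingerprints rather than distinct URLs. This is only a sketch of that general idea; how the study actually decided what counts as "unique" is not stated here, and the hosts, function names, and whitespace normalization below are assumptions.

```python
import hashlib

def content_fingerprint(html):
    """Hash the page body after collapsing whitespace and case differences."""
    normalized = " ".join(html.split()).lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()

def count_unique(pages):
    """Count distinct page bodies in a mapping of URL -> HTML."""
    return len({content_fingerprint(body) for body in pages.values()})

# Hypothetical crawl: the same page lives on the old host and the new one.
crawl = {
    "http://old-host.example/home.html": "<h1>My Page</h1>\n<p>Hello</p>",
    "http://new-host.example/home.html": "<h1>My Page</h1> <p>Hello</p>",
    "http://new-host.example/links.html": "<h1>Links</h1>",
}

unique_pages = count_unique(crawl)
print(f"{len(crawl)} URLs, {unique_pages} unique pages")
```

    An exact hash only catches copies that are identical after normalization; a page edited during the move would still count twice.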

    That also makes me wonder more about this statistic. Are there one billion ACTIVE pages, or merely one billion pages that have ever existed? If the former, how many pages have ever existed? That would be an interesting question ....

    Well, by making this post I'm probably creating yet another page and adding to the noise and confusion. Consider it my chaotic deed for the day. :)

    --
    "Somebody exploded a letter-bomb today ... but it wasn't anybody I knew" -The Moody Blues, "Dear Diar
  87. leet0 by serialk · · Score: 1


    cornz all the way !

    moderate this and die biatch !!?

  88. "But these go to eleven!" by fidel · · Score: 1

    Heh. Is it any better for the 1e9 pages? ...

    "Well, its one louder, i'n it?"