New Google Search Index 50% Fresher With Caffeine
Ponca City, We love you writes "When Google started, it would only update its index every four months. Then, around 2000, it started indexing every month in a process called the 'Google dance' that took a week to 10 days and would provide different results when searching for the same term from different Google data centers. Now PC World reports that Google has introduced a new web indexing system called Caffeine, which delivers results that are closer to 'live' by analyzing the web in small portions and updating the index on a continuous basis. 'Caffeine lets us index web pages on an enormous scale,' writes Carrie Grimes on the official Google Blog. 'Caffeine takes up nearly 100 million gigabytes of storage in one database and adds new information at a rate of hundreds of thousands of gigabytes per day.' Now not only does Caffeine provide results that are 50% fresher than Google's last index, adds Grimes, but the new search index provides a robust foundation that will make it possible for Google to build a faster and more comprehensive search engine that scales with the growth of information online."
The Caffeine project is approved. The system goes on-line June 9th, 2010. Human decisions are removed from search engine results. Caffeine begins to learn at a geometric rate. It becomes self-aware at 2:14 a.m. Eastern time, August 29th. In a panic, they try to pull the plug.
Caffeine takes up nearly 100 million gigabytes of storage in one database
A million gigabytes is what we call a petabyte.
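For anyone who wants to check the arithmetic, here is a quick sketch. It assumes decimal (SI) units, where 1 GB is 10^9 bytes and 1 PB is 10^15 bytes. The function name is just something I made up for the example.

```python
# Decimal (SI) unit sizes in bytes; this is an assumption, binary units would differ slightly.
BYTES_PER_GB = 10**9
BYTES_PER_PB = 10**15

def gigabytes_to_petabytes(gigabytes):
    """Convert a size in gigabytes to petabytes using decimal units."""
    return gigabytes * BYTES_PER_GB / BYTES_PER_PB

# "Nearly 100 million gigabytes" from the blog post quoted in the summary.
index_size_pb = gigabytes_to_petabytes(100_000_000)
print(index_size_pb)  # 100.0
```

So the quoted figure works out to roughly 100 petabytes.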
Pretty good is actually pretty bad.
They've developed their own.
I miss the days when Altavista was king (purely nostalgia, I assure you). I don't, however, miss getting marked down in Spanish class due to using BabelFish -_-;;
This reminds me of one of my funniest memories from middle school: The Spanish teacher hands back a paper with a big red "F" on it to the guy sitting in front of me. She says: "This is very good... but it's in French."
Back in the day, refreshing BabelFish would cause the options to default back to English->French.
-- If you try to fail and succeed, which have you done? - Uli's moose
Google has pulled my site's robots.txt file 32 times this month, and it is only the 9th: about 4 times a day. I'm showing almost 2000 web pages pulled by Google indexers in this same time period. My site is small and private.
By bandwidth, Google is only 2.4% of the total site traffic, so far, this month.
I agree Google is "fresher" than they used to be. OTOH, my non-commercial site has approximately doubled readers in each of the last 6 months by publishing 1 new posting about every other day.
I suspect other, more heavily used sites are hit hourly or even more often by Google.
MSN-Bot appears to visit 10 times a day, but is much more selective about which pages it indexes. Since my site is date organized, this seems smarter than what Google does. Sometimes I do edit older stories with new knowledge or corrections, which Google will see eventually and MSN will not. Zero referrals from any Microsoft searches seen.
Yahoo! Slurp barely touches my site. Only 1 referral has been seen.
Google sends about 30% of the total traffic, but most is from social networking with "hey, check this out" type referrals. Not bad for a technical article site.
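If you want to pull the same kind of numbers for your own site, here is a rough sketch of one way to do it. It assumes your server writes the Apache "combined" log format and that the crawlers identify themselves with the substrings "Googlebot", "msnbot" and "Slurp" in the user agent. The sample log lines, IP addresses and the function name are made up for illustration, not taken from my logs.

```python
import re
from collections import Counter

# Matches the request path and the user agent in a combined-format log line.
# Assumption: the log uses Apache's "combined" format.
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "\S+ (\S+)[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
)

def count_crawler_hits(lines, bots=("Googlebot", "msnbot", "Slurp")):
    """Return (total hits per bot, robots.txt fetches per bot)."""
    hits = Counter()
    robots_fetches = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        path, agent = match.group(1), match.group(2)
        for bot in bots:
            if bot.lower() in agent.lower():
                hits[bot] += 1
                if path == "/robots.txt":
                    robots_fetches[bot] += 1
                break
    return hits, robots_fetches

# Hypothetical sample lines, invented for this example.
sample_lines = [
    '192.0.2.1 - - [09/Jun/2010:04:12:01 -0400] "GET /robots.txt HTTP/1.1" 200 120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.1 - - [09/Jun/2010:04:12:05 -0400] "GET /2010/06/post.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.7 - - [09/Jun/2010:05:00:00 -0400] "GET /index.html HTTP/1.1" 200 2048 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"',
    '192.0.2.9 - - [09/Jun/2010:06:30:00 -0400] "GET /index.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/3.6"',
]

hits, robots_fetches = count_crawler_hits(sample_lines)
print(dict(hits))
print(dict(robots_fetches))

# Per-day rate for the figures quoted above: 32 robots.txt fetches in 9 days.
fetches_per_day = 32 / 9
print(round(fetches_per_day, 1))
```

Matching on the user agent string is the simplest approach, but anyone can fake that header, so treat the counts as approximate.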
Google dance if you want to,
If it helps you search online.
MSN don't dance,
and if they don't dance,
well they're no search engine of mine.
Rules of Conduct:
#1 - The DM is always right.
#2 - If the DM is wrong, see rule #1