Slashdot Mirror


New Google Search Index 50% Fresher With Caffeine

Ponca City, We love you writes "When Google started, it would only update its index every four months. Then, around 2000, it started indexing every month in a process called the 'Google dance' that took a week to 10 days and would provide different results when searching for the same term from different Google data centers. Now PC World reports that Google has introduced a new web indexing system called Caffeine, which delivers results that are closer to 'live' by analyzing the web in small portions and updating the index on a continuous basis. 'Caffeine lets us index web pages on an enormous scale,' writes Carrie Grimes on the official Google Blog. 'Caffeine takes up nearly 100 million gigabytes of storage in one database and adds new information at a rate of hundreds of thousands of gigabytes per day.' Now not only does Caffeine provide results that are 50% fresher than Google's last index, adds Grimes, but the new search index provides a robust foundation that will make it possible for Google to build a faster and more comprehensive search engine that scales with the growth of information online."
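
The contrast the summary draws, between rebuilding the whole index in a periodic batch and folding in small changes continuously, can be illustrated with a toy inverted index that re-indexes one page at a time. This is only a sketch of the general idea and not Google's implementation; the class `IncrementalIndex` and its methods are hypothetical names, and tokenizing by whitespace is an assumption made to keep it short.

```python
from collections import defaultdict


class IncrementalIndex:
    """Toy inverted index that is updated one document at a time."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of URLs containing it
        self.doc_terms = {}               # URL -> terms currently indexed for it

    def update(self, url, text):
        """Re-index a single page without touching any other page."""
        new_terms = set(text.lower().split())
        old_terms = self.doc_terms.get(url, set())

        # Remove postings for terms that disappeared from the page.
        for term in old_terms - new_terms:
            self.postings[term].discard(url)
            if not self.postings[term]:
                del self.postings[term]

        # Add postings for terms that are new on the page.
        for term in new_terms - old_terms:
            self.postings[term].add(url)

        self.doc_terms[url] = new_terms

    def search(self, term):
        """Return the URLs currently indexed for a term, sorted."""
        return sorted(self.postings.get(term.lower(), ()))
```

Each call to `update` makes the change searchable immediately, whereas a batch design would only show it after the next full rebuild.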

35 of 216 comments

  1. Altavista by Pojut · · Score: 2, Funny

    I miss the days when Altavista was king (purely nostalgia, I assure you). I don't, however, miss getting marked down in Spanish class due to using BabelFish -_-;;

    1. Re:Altavista by moosesocks · · Score: 5, Funny

      I miss the days when Altavista was king (purely nostalgia, I assure you). I don't, however, miss getting marked down in Spanish class due to using BabelFish -_-;;

      This reminds me of one of my funniest memories from middle school: The Spanish teacher hands back a paper with a big red "F" on it to the guy sitting in front of me. She says: "This is very good... but it's in French."

      Back in the day, refreshing BabelFish would cause the options to default back to English->French.

      --
      -- If you try to fail and succeed, which have you done? - Uli's moose
    2. Re:Altavista by IgnoramusMaximus · · Score: 4, Insightful

      I miss the days when Google was a simple, plain HTML page resulting from the fact that it was driven by its designers and users. Now arrogant marketing VPs with no clue whatsoever push on us "features" like fade-ins (which do wonders when viewed over RDP and VNC links) and side bars while ignoring all negative feedback and making sure that no opt-out is possible to stroke their towering egos by pretending that everyone loves their "innovations". Otherwise 80% of users would have it off in an instant and the "innovator" VP's stupidity would register with some other VPs at Google HQ and give them ammo in some back-stabbing corporate ladder-climbing moves.

      In other words I miss the days before Google jumped the shark.

    3. Re:Altavista by Anonymous Coward · · Score: 2, Informative

      All such "features" are universally turned off by pretty much any user that has a clue how to do it, irrespective of where they can be found

      As I said, you are completely out of touch with reality if you think that a majority of users have any interest in things like that. Making the claim that management somehow snuck an almost-universally hated feature in is absurd and only harms your credibility. You do not represent the majority of users. Slashdot does not represent the majority of users.

      If you don't, then why are you arguing? I'm not arguing that fade-in is worth it; only that the majority of users wouldn't care and/or be aware enough to opt-out.

      (On a side note, anecdotal "evidence" disagrees with your claim about the sidebar's benefit. Google likely included it because of the positive reaction users had toward Bing's related searches in their sidebar. At the recent WWW2010 conference I attended, the side bar was viewed positively by engineers from Bing, Yahoo, and Google alike. Considering they have access to real-world usage data and you do not, I'm inclined to take their side. In fact, Bing claimed the UI redesign _alone_ significantly increased traffic before any backend changes had taken place.)

  2. Wow! by Anonymous Coward · · Score: 4, Funny

    I found this post at google before I wrote it.

    1. Re:Wow! by drsmack1 · · Score: 3, Funny

      I think that is because they started using Thiotimoline.

      http://en.wikipedia.org/wiki/Thiotimoline

  3. It's a trick by For+a+Free+Internet · · Score: 2, Funny

    "Caffeine" is an NSA code word for a mind control satellite they build with GOOGLE/Italian money on loan from Chinese Muslim Islamo-Communist sorcerers and vegetarians. It will probably be used to sell your daughters into slavery in Mexico via facebook. That is why our SAVIOR OBAMA must continue to wage the WAR FOR FREEDOM at all costs, because if not the evil Italian axis will enslave us all!!!!!!!!!!!

    --
    UNITE with the Campaign for a Free Internet because today, our future begins with tomorrow!
  4. With the onset of social websites like Facebook by ThisIsForReal · · Score: 3, Funny

    Half joking, but it would be great if indexing were done at a particular time every month, like the old system, and the moment of indexing were public. Then, at that time, all Facebook users could go and untag and delete anything that was wholesome enough not to warrant immediate removal but still shouldn't be indexed for all eternity.

    --
    -THE END-
    1. Re:With the onset of social websites like Facebook by Ephemeriis · · Score: 3, Insightful

      Half joking, but it would be great if indexing were done at a particular time every month, like the old system, and the moment of indexing were public. Then, at that time, all Facebook users could go and untag and delete anything that was wholesome enough not to warrant immediate removal but still shouldn't be indexed for all eternity.

      If you don't want it indexed for all eternity, don't post it on the web.

      Even if you knew when Google was coming and you took it down, you have no influence over anyone else out there who may have saved that incriminating evidence. Anyone out there can take a screenshot and post it themselves.

      --
      "Work is the curse of the drinking classes." -Oscar Wilde
  5. Re:Caffeine?! by bsDaemon · · Score: 3, Insightful

    because the results will now be fairly half-assed and kind of jittery? On a related note, what's with Apple pimping Bing all of a sudden?

  6. Caffeine by morgan_greywolf · · Score: 5, Funny

    The Caffeine project is approved. The system goes on-line June 9th, 2010. Human decisions are removed from search engine results. Caffeine begins to learn at a geometric rate. It becomes self-aware at 2:14 a.m. Eastern time, August 29th. In a panic, they try to pull the plug.

    1. Re:Caffeine by alphax45 · · Score: 3, Funny

      Caffeine strikes back, turning every result into a "Rickroll" and rendering Google useless.

      --
      K Man
  7. It's called the metric system. Use it. by dingen · · Score: 5, Informative

    Caffeine takes up nearly 100 million gigabytes of storage in one database

    A million gigabytes is what we call a petabyte.
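
    The conversion can be checked with a few lines of arithmetic. This is a sketch that assumes decimal SI prefixes (1 GB = 10^9 bytes, 1 PB = 10^15 bytes); the helper name `gigabytes_to_petabytes` is made up for illustration.

```python
GB_PER_PB = 1_000_000  # SI prefixes: 10**15 bytes / 10**9 bytes


def gigabytes_to_petabytes(gigabytes):
    """Convert a size in gigabytes to petabytes (decimal SI units)."""
    return gigabytes / GB_PER_PB


# "Nearly 100 million gigabytes" from the blog post, expressed in petabytes.
index_size_pb = gigabytes_to_petabytes(100_000_000)
print(index_size_pb)  # 100.0
```

    With binary prefixes (1 PiB = 2^20 GiB) the figure would come out slightly lower, so the round number only holds under the decimal assumption.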

    --
    Pretty good is actually pretty bad.
    1. Re:It's called the metric system. Use it. by Vectormatic · · Score: 2, Informative

      By saying "A million gigabytes is what we call a petabyte.", the GP obviously implied that the article should have used "100 petabytes"; after all, he didn't say "100 million gigabytes is what we call a petabyte."

      --
      People, what a bunch of bastards
    2. Re:It's called the metric system. Use it. by flanders123 · · Score: 5, Insightful

      Typical humans (non /.-ers, like us) are more familiar with gigabytes, because that is the base unit of measure used in today's PCs, e.g. 6 GB of RAM, 500 GB hard drive.

      The blogger intentionally used GB in order to express the size of the data relative to today's average PC, because she knows her audience. Imagine that.

      Dr Evil: "I demand 100 Petabytes!"
      Tim Robbins: "That number doesn't exist! It's like saying I want a kajillion bajillion gigabytes!"

      Disclaimer: I did not mean to imply you were Dr. Evil.

  8. Competition by dugn · · Score: 2, Funny

    If it weren't for the competition from Bing, would this have even happened?

    1. Re:Competition by hireawebgeek · · Score: 2, Insightful

      If it weren't for the competition from Bing, would this have even happened?

      Probably not, but that's the great thing about competition. The consumer wins when 2 or more businesses compete (most of the time that is).

  9. Re:Caffeine?! by Pojut · · Score: 2, Informative

    On a related note, what's with Apple pimping Bing all of a sudden?

    Because, at this point, Google is more of a threat than Microsoft. Apple knows that the chances of OSX catching up to Windows in terms of market share are practically zero. However, Android poses a credible threat to Apple's mobile popularity here in America.

  10. And yet Google adds less and less to my .... by Anonymous Coward · · Score: 4, Interesting

    ... productivity.

    When Google was new it was a wonder. I could use it to help solve problems (such as identifying error codes when the servers went down), locate reviews of products (saving me the expense of subscribing to loads of computer magazines and the time searching through them when I needed to buy something) and find snippets of code when I needed to develop a program. As the web gets older and older there is more and more out-of-date information that I have to dig through. Plus, when Google (and Yahoo) killed off Usenet (with an assist from Andrew Cuomo), the utility of the Usenet information structure was destroyed (which the world is still trying to recreate with keywords).

    As Google has added more and more information it gets less and less useful. Plus the rise in SEO makes it even harder to find what I need (But I find lots of useless stuff that people have paid to get put in front of my eyes). Of course it probably isn't in Google's best interest to help me locate information that I need in the most efficient way. The more I have to sort through the crap they now deliver the more ad revenue they generate.

    Too bad Bing sucks. I would really appreciate an alternative to Google.

    1. Re:And yet Google adds less and less to my .... by KrugalSausage · · Score: 2, Interesting

      You just haven't adapted along with it. Use search modifiers and your problems will be solved.

    2. Re:And yet Google adds less and less to my .... by Anonymous Coward · · Score: 2, Insightful

      Wrong. They don't pay for showing ads; they pay if YOU click ads.

      If they serve you crappy results, the targeted advertising is going to suck.

      On the other hand, if they provide accurate results, there is a chance the ads being shown are interesting to you.
      You don't think Google is efficient or helpful?

      Go one week without using it and then decide whether Google is making you more productive.

    3. Re:And yet Google adds less and less to my .... by eulernet · · Score: 4, Interesting

      Use Google CodeSearch, it's more adapted to developers:

      http://google.com/codesearch

    4. Re:And yet Google adds less and less to my .... by Dishevel · · Score: 2, Insightful

      It is in Google's best interest to give you the best search results. That is how they got big. They can only sell your eyes if you are using them.

      --
      Why is it so hard to only have politicians for a few years, then have them go away?
    5. Re:And yet Google adds less and less to my .... by bendodge · · Score: 2, Insightful

      IMO, real product reviews are hard to find because of SEO. Everything else he mentioned I have no problem with.

      --
      The government can't save you.
  11. Re:That's a hundred petabytes of storage by dingen · · Score: 5, Informative

    They've developed their own.

    --
    Pretty good is actually pretty bad.
  12. Re:Caffeine?! by bsDaemon · · Score: 2, Insightful

    The only way that OS X would catch up to Windows in terms of market share, is if either A) they dramatically dropped the price point for Macs, or B) they licensed the software for white-box PCs. In either case, their brand would be diluted. They sort of thrive on a high-margin, low-volume model, and I'm not sure they were ever really competing with Microsoft in the way people imagine, especially being primarily a hardware company from the start.

  13. Re:Caffeine?! by Rockoon · · Score: 2, Interesting

    A hardware company generally does not compete with a software company.

    Apple has a long standing friendly relationship with Microsoft. They even turned to Microsoft to bail them out of a big financial mess not so many years ago.

    Yes, this is contrary to Apple's television advertisements... but those aren't reality.

    --
    "His name was James Damore."
  14. 32 Google indexer visits this month by Anonymous Coward · · Score: 5, Interesting

    Google has pulled my site robots.txt file 32 times this month and it is only the 9th - about 4 times a day. I'm showing almost 2000 web pages pulled by Google indexers in this same time period. My site is tiny, private, not very large.

    By bandwidth, Google is only 2.4% of the total site traffic, so far, this month.

    I agree Google is "fresher" than they used to be. OTOH, my non-commercial site has approximately doubled readers in each of the last 6 months by publishing 1 new posting about every other day.

    I suspect other, more heavily used sites are hit hourly or even more often by Google.

    MSN-Bot appears to visit 10 times a day, but is much more selective about which pages it indexes. Since my site is date organized, this seems smarter than what Google does. Sometimes I do edit older stories with new knowledge or corrections, which Google will see eventually and MSN will not. Zero referrals from any Microsoft searches seen.

    Yahoo! slurp barely touches my site. Only 1 referral has been seen.

    Google sends about 30% of the total traffic, but most is from social networking with "hey, check this out" type referrals. Not bad for a technical article site.
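
    Numbers like these can be pulled from a web server access log. The sketch below assumes the log is in the common "combined" format and that crawlers can be recognized by the user-agent substrings `Googlebot`, `msnbot` and `Slurp`; the function name `count_crawler_hits` is hypothetical. It counts robots.txt fetches and other page fetches per crawler.

```python
import re
from collections import Counter

# Request line, status, size, referrer, user agent of a combined-format entry.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Assumed user-agent substrings mapped to a display name.
CRAWLERS = {"googlebot": "Google", "msnbot": "MSN", "slurp": "Yahoo"}


def count_crawler_hits(lines):
    """Return (robots_txt_fetches, page_fetches) as Counters keyed by crawler."""
    robots = Counter()
    pages = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        agent = match.group("agent").lower()
        for token, name in CRAWLERS.items():
            if token in agent:
                if match.group("path") == "/robots.txt":
                    robots[name] += 1
                else:
                    pages[name] += 1
                break
    return robots, pages
```

    A user-agent string is self-reported, so these counts only show what each client claims to be.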

  15. Is this new? by Brad1138 · · Score: 3, Interesting

    For a while now I have been noticing my forum posts being indexed within hours of making the post. It's been doing this for a couple years I think.

    --
    If you could reason with religious people, there would be no religious people
  16. Re:Caffeine?! by Anonymous Coward · · Score: 2, Informative

    What is a 'price point'? Stop emulating how marketing tells you to speak you dipshit. A price is a 'point' by definition.

  17. Re:Does it run on Linux? by asserted · · Score: 2, Interesting

    AFAIK java is in heavy use at google

    java is in heavy use at google but in other places - there is no java involved in serving a search query. with search, it's c++ all the way down.

  18. Google Dance by GameMaster · · Score: 5, Funny

    Google dance if you want to,
    If it helps you search online.
    MSN don't dance,
    and if they don't dance,
    well they're no search engine of mine.

    --

    Rules of Conduct:
    #1 - The DM is always right.
    #2 - If the DM is wrong, see rule #1
  19. Just A Minor Rant by BigBlueOx · · Score: 4, Funny

    Ok, what is it with people who write about technical subjects that they think they have to use ridiculous analogies?

    "if this were a pile of paper it would grow three miles taller every second"?? Yes, and if this was a goat it would have a thousand young. WTF. This was a Google blog post, not some story-for-the-terminally-stupid from The Daily Show ferchrissakes. The author even measures storage capacity in the universally used miles-of-iPods.

    What is the sound of one vein popping?

  20. Re:Caffeine?! by nacturation · · Score: 2, Insightful

    Calling a Mac a PC is disingenuous much in the same way as calling a cordless phone a mobile phone. Yes, your cordless phone is mobile in the technical sense, but common usage has given the words distinct meanings. Mobile no longer only refers to the fact that it enables mobility, and PC no longer only refers to the fact that it's your own personal computer rather than a server or mainframe.

    You: "Hey man, I got a new PC the other day."
    Friend: "Cool, dude! What kind did you get?"
    You: "An iPhone."
    Friend: "Uh..."

    Yeah, technically the iPhone is a personal computer. Just don't tell your friends or they'll think you're off your rocker.

    --
    Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
  21. It probably wasn't really Google that indexed you by TravisO · · Score: 2, Insightful

    You do know many spam/exploit bots use your robots file to look for admin logins or sensitive info? Just because the user agent was the same as Google's doesn't mean it really was; you have to check the agent's IP to be reasonably sure it's legit. Considering that Google even says they have previously only indexed sites every 10 days, it's much more likely you have 3 Google indexes and 29 exploit scans.
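
    One common way to check the IP is a reverse DNS lookup followed by a forward lookup that must map back to the same address. The sketch below assumes genuine crawler hostnames end in `googlebot.com` or `google.com`; that suffix list and the function name `is_probably_googlebot` are assumptions for illustration. The lookups are passed in as parameters so the logic can be exercised without network access, and the default forward lookup handles IPv4 only.

```python
import socket

# Assumed hostname suffixes for genuine Google crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_probably_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Check a client IP with a reverse DNS lookup plus a confirming forward lookup."""
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        hostname = reverse_lookup(ip)
    except OSError:
        return False  # no PTR record or lookup failure

    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False  # the hostname is not under an expected domain

    try:
        addresses = forward_lookup(hostname)
    except OSError:
        return False

    # The hostname must resolve back to the IP that made the request.
    return ip in addresses
```

    The forward lookup matters because whoever controls an IP block also controls its reverse DNS and could make it claim any hostname.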