Slashdot Mirror


Google's Bigger Index

WebGangsta writes "Google Inc. today announced it expanded the breadth of its web index to more than 6 billion items. This innovation represents a milestone for Internet users, enabling quick and easy access to the world's largest collection of online information."

15 of 412 comments

  1. Their search has apparently improved as well! by phoxix · · Score: 4, Informative

    Search for any normal product name with Google. What did you use to get? Billions of useless sites that cross-link to each other and have the same bloody reviews from amazon.com.

    That seems to have changed!

    I just tried a search on television antennas and for once the results seem relevant.

    Hooray!! Google is back!! :^)

    Sunny Dubey

  2. They said 6 billion items, not webpages. by LostCluster · · Score: 5, Informative

    Notice that they claim that they search 6 billion items, but the home page only claims that they're "Searching 4,285,199,774 web pages".

    To find the rest, we need to use Google's other services. The image search is claiming "Searching 880,000,000 images". Google Groups says it's "Searching 845,000,000 messages". Add those to the count and you get 6,010,199,774 items total.
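
    A quick check of that sum, as a minimal sketch. The three figures are the ones quoted in the comment above; the variable names are only illustrative.

```python
# Counts as quoted from Google's own pages in the comment above.
web_pages = 4_285_199_774       # "Searching 4,285,199,774 web pages"
images = 880_000_000            # "Searching 880,000,000 images"
usenet_messages = 845_000_000   # "Searching 845,000,000 messages"

# Total number of items across the three services.
total_items = web_pages + images + usenet_messages
print(f"{total_items:,}")
```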

  3. Re:how many? by Anonymous Coward · · Score: 5, Informative
    That sort of search result spamming is getting out of hand.

    Maybe if more people used Google's Search Quality feedback form, it would help weed them out.

  4. Google Print by blorg · · Score: 5, Informative
    "Google's collection of 6 billion items comprises 4.28 billion web pages, 880 million images, 845 million Usenet messages, and a growing collection of book-related information pages."

    I was interested that they mentioned Google Print, which is Google's answer to Amazon's Search Inside feature, but hasn't got much press, and is pretty well hidden in Google itself.

    You can check it out by limiting results to site print.google.com, e.g. searchterm site:print.google.com. (Not quite at Amazon-type numbers yet.)
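
    For anyone scripting this, the site: restriction is simply part of the query text. Here is a rough sketch that builds such a search URL; the base URL and the q parameter are assumptions about Google's public search form rather than anything stated in the comment, and build_site_query is a made-up helper name.

```python
from urllib.parse import urlencode

# Assumed base URL of the public search form (not taken from the comment).
SEARCH_URL = "http://www.google.com/search"


def build_site_query(term: str, site: str) -> str:
    """Return a search URL that limits results for `term` to `site`."""
    query = f"{term} site:{site}"
    # urlencode escapes the space and the colon so the URL stays valid.
    return f"{SEARCH_URL}?{urlencode({'q': query})}"


print(build_site_query("searchterm", "print.google.com"))
```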

  5. Is /. pro Google? by dark-br · · Score: 5, Informative

    "Google currently does not allow outsiders to gain access to raw data because of privacy concerns. Searches are logged by time of day, originating I.P. address (information that can be used to link searches to a specific computer), and the sites on which the user clicked. People tell things to search engines that they would never talk about publicly -- Viagra, pregnancy scares, fraud, face lifts. What is interesting in the aggregate can seem an invasion of privacy if narrowed to an individual."


    That's a quote from the NYtimes (free req. yada yada) also posted as is here

    If any other site were to track the stuff Google does, /. would be up in arms protesting!

    Please note, this isn't a troll, and I'm not wearing a tin-foil hat (maybe I should?). Imagine the following scenario: a bomb goes off in the US. By tracing searches for "anarchist cookbook" to zipcodes within the area of the bomb blast, the FBI could have access to information that makes TIA look like a better alternative.

    Maybe this isn't such a good feature after all...

  6. Re:What I want to know... by ctishman · · Score: 5, Informative

    Use that "Dissatisfied with your search results? Help us improve." link at the bottom of the page. Voila.

  7. Re:What I want to know... by Chris+Croome · · Score: 3, Informative

    ...is how to get rid of those pseudo-pages in Google. The ones with names like "thing_that_youre_searching_for.html", and all they are is either a page of dead links to crap on ebay, or a "Hey, we do great searches for your stuff".

    +1

    There are things that you just can't use Google for any more because these googlespam sites score so well... it's like being back in the days before Google...

    --
    Check out MKDoc, a mod_perl CMS
  8. It's worth mentioning... by dark-br · · Score: 4, Informative
    that not everything about Google is so visible.

    One should have a look at Google-Watch (tinfoil? maybe...) but they have some good points:

    According to DEA, Google is breaking the law

    Google Evil cookie

    We got your number!

    And so on...

    Not to troll but rather a thought. Mod as you wish.

  9. Re:No Good... by glinden · · Score: 4, Informative
    • I want it to return more relevant searches.
    Have you tried some of the Google alternatives? Vivisimo is particularly interesting with its clustering of search results. Teoma is also quite good.
  10. big but far from complete. by selderrr · · Score: 4, Informative

    I wrote a project for our university and submitted the URL to Google about 3 months ago. It still doesn't show up.

  11. Mac users' image search by saddino · · Score: 4, Informative

    "Google Image Search has been significantly updated," said Sergey Brin, Google co-founder and president of Technology. "We've doubled the index to more than 880 million images, enhanced search quality, and improved the user interface."

    For Mac users, I recommend using Beholder to power your Google image search, Google's minimal UI changes notwithstanding.

    (Mod +1 Self-Promotive)

  12. Re:Still nok by bad-badtz-maru · · Score: 3, Informative

    If googlebot crawls your site, then your robots.txt file is either wrong or in the wrong location. There is no doubt that googlebot follows the robots.txt standard.

    It can take a very long time for a site to be spidered after it is submitted via the "add a url" form.
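
    If you want to test your rules before blaming the crawler, Python's standard urllib.robotparser can evaluate them locally. This is only a sketch: the robots.txt contents and the example.com URLs below are made up, and the parser's reading of the rules may not match Googlebot's in every edge case. Remember that the file has to live at the root of the site (/robots.txt) to be found at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A page under the disallowed directory, then a page outside it.
print(is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/private/page.html"))
print(is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/index.html"))
```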

  13. Re:Here's hoping by thestarz · · Score: 3, Informative

    Yes, you are missing something. They have reached 6 billion items, but only about 4 billion of those are web pages; the rest are pictures, Usenet messages, etc. RTFA!

    --

    c++; /* this makes c bigger but returns the old value */
  14. For those of you who were wondering/complaining... by Afromelonhead · · Score: 3, Informative
    According to Google's cache of Google, there used to be only 3,307,998,701 pages in their index, as opposed to the 4,285,199,774 (as of writing) in the index.

    It's also interesting to note that both have a copyright date of 2004, which would imply that Google has added just under 1 billion pages to its index in about a month and a half.
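
    The "just under 1 billion" figure can be checked from the two page counts quoted above. A small sketch, with variable names chosen only for illustration:

```python
# Page counts as quoted in the comment above.
cached_count = 3_307_998_701    # from Google's cache of its own home page
current_count = 4_285_199_774   # from the live home page at time of writing

# Pages added to the index between the two snapshots.
pages_added = current_count - cached_count
print(f"{pages_added:,}")
```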

    --
    Procrastination sucks.
  15. Even better way to report by delfstrom · · Score: 4, Informative
    The "help us improve" link is okay, but a little general. Most of us Slashdot readers know when a search result is truly bogus, and there's a more advanced form we can use for reporting abusers directly:

    http://www.google.com/contact/spamreport.html

    This will give you options of reporting cloaked pages, doorway pages, deceptive redirects, misleading or repeated words, hidden text, etc. You have to be more specific than the "help us improve" link at the bottom of search results. Using this form I've seen abusive sites disappear from Google's index in less than 12 hours.