What's Wacky with Google?

Posted by jamie on Monday October 6, 2003 @01:36AM from the figure-it-out dept.

There are always going to be oddities with any big online service, but this one seems to be persisting. Join the discussion in trying to figure out a pattern. For maybe a week, Google has been returning zero results or "1-1 of about xxx,000" for common searches. One-word searches seem unaffected, but certain two-word combinations of common words, like candle truck or speaker bracelet, trigger the problem. Reversing the order can change the results too: motorcycle candles vs. candles motorcycle. The strange thing is that the 1 or 2 results found usually point to commerce sites. Read the Search Basics, compare your notes to GoogleWhack's, have fun looking for patterns, but remember that Google always returns slightly different results for different IP addresses.

(Update: 13:56 GMT by J: When I first posted this story it said the problems had been occurring "for several weeks at least" -- but it seems to be more like one week.)

10 of 619 comments

The same words in quotes show more hits ... by media_Assassin · 2003-10-06 01:41 · Score: 5, Interesting

Check out this - all 25 hits on the quoted words "candle truck" should be showing up in the non-quoted search ...
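The expectation here is that a phrase match is a stricter condition than a plain all-words match, so the quoted hits should be a subset of the unquoted ones. A minimal sketch of that reasoning with a toy positional index follows; the corpus, document IDs, and function names are made up for illustration and say nothing about how Google's index actually works.

```python
from collections import defaultdict

# Hypothetical toy corpus; the IDs and text are invented for this example.
DOCS = {
    1: "candle truck for sale",
    2: "truck stop sells a candle",
    3: "speaker bracelet",
}

def build_index(docs):
    """Map each word to {doc_id: [positions where the word occurs]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(pos)
    return index

def and_search(index, words):
    """Documents containing every word, in any order (unquoted search)."""
    sets = [set(index.get(w, {})) for w in words]
    return set.intersection(*sets) if sets else set()

def phrase_search(index, words):
    """Documents containing the words adjacent and in order (quoted search)."""
    hits = set()
    for doc_id in and_search(index, words):
        for start in index[words[0]][doc_id]:
            if all(start + i in index[w][doc_id] for i, w in enumerate(words)):
                hits.add(doc_id)
                break
    return hits

index = build_index(DOCS)
unquoted = and_search(index, ["candle", "truck"])
quoted = phrase_search(index, ["candle", "truck"])
print(quoted <= unquoted)  # every phrase hit is also an all-words hit
```

In this model the quoted search can never return a page the unquoted search misses, which is why 25 quoted hits against fewer unquoted hits looks like a bug rather than a ranking choice.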
groups/deja is also acting up by Sabalon · 2003-10-06 01:41 · Score: 4, Interesting

For a few weeks, when I do a search on Google Groups, it'll come back with the results just fine - but when I click on View Thread on a result, it tells me it can't display the thread and gives me a link to view that individual message. Then once that message comes up, I click on View Thread on that message, and up pops the whole thread, like it should have before.

Perhaps being on the top is getting to their CPUs :)
1. Re:groups/deja is also acting up by larien · 2003-10-06 01:58 · Score: 4, Interesting
  
  I've been getting oddities there as well, although it usually just doesn't show anything. A reload usually shows the thread correctly. I put it down to busy servers or some other transient fault; perhaps there's a larger fault somewhere in Google? I certainly hope not.
  Another oddity is that threads are listed as having "1 post", but viewing the thread shows a larger thread.
Another thing - what triggers the calculator? by fizbin · 2003-10-06 01:48 · Score: 4, Interesting

I realized the other day that although searching for 13 - 867 - 5309 causes google to go into calculator mode, searching for 123 - 867 - 5309 does not cause google to use calculator mode.

All sorts of odd things will both pull up an answer from google's calculator and also do a search - for example, searching for avogadros number or hbar.

So why do searches that might fit US telephone conventions not trigger the calculator? Is it because some design decision makes it impossible to trigger both the calculator and their phone lookup service? (Yes kids, Google is a reverse phone directory, albeit with old data.)
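One way this behaviour could arise is a dispatcher that tests the phone-number pattern before the calculator pattern and stops at the first match. The sketch below is purely a guess at that kind of design; the regular expressions and the `classify` function are assumptions for illustration, not anything Google has documented.

```python
import re

# Assumed pattern: 3-3-4 digit groups separated by hyphens, spaces optional.
PHONE = re.compile(r"^\s*\d{3}\s*-\s*\d{3}\s*-\s*\d{4}\s*$")
# Assumed pattern: integers joined by the operators + - * /.
ARITHMETIC = re.compile(r"^\s*\d+(\s*[-+*/]\s*\d+)+\s*$")

def classify(query):
    """Return which hypothetical handler would claim the query."""
    if PHONE.match(query):        # checked first, so it shadows the calculator
        return "phone"
    if ARITHMETIC.match(query):
        return "calculator"
    return "search"

print(classify("13 - 867 - 5309"))   # two-digit first group: not a phone shape
print(classify("123 - 867 - 5309"))  # 3-3-4 shape: claimed by the phone handler
```

Under this assumed ordering, any subtraction that happens to have a 3-3-4 digit shape never reaches the calculator, which matches what the two example queries show.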
What's wrong with this picture? by tom.allender · 2003-10-06 01:48 · Score: 5, Interesting

"q=site:www.google.com google" - (third result)

This is what I'm seeing...
http://www.sminkybang.com/google.png
Canuck Ok by Malicious · 2003-10-06 01:51 · Score: 5, Interesting

For any who are interested, Google.ca is behaving correctly. All search results listed (that I've tried so far) from googlewhack.com are working properly and returning 1-1 of 1, or displaying as they should.
I wish I could compare to google.com, but for the past year or so, google.com has automatically forwarded all Canadian IPs to google.ca

--
0110100100100000011000010110110100100000011000100110000101110100011011010110000101101110
General idea: by pr0ntab · 2003-10-06 02:26 · Score: 4, Interesting

Google uses tons of DB entries to cross-index pages. I wonder if it uses some simple per-page hash tables internally to speed things up, ones that make assumptions and don't resolve collisions.

So you can search for one thing, and conceivably the checksums/hashes for each term match those of another page that has nothing to do with it, and it's returned as a relevant match by accident.

This might explain a lot of result silliness.
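To make the speculation concrete, here is a minimal sketch of a lookup table that stores only one page per hash bucket and never checks the original term. Everything in it (the toy hash, `LossyIndex`, the page names) is hypothetical and chosen to make the collision easy to see; it is not a description of Google's internals.

```python
NUM_BUCKETS = 8  # deliberately tiny so collisions are easy to produce

def bucket(term):
    """Toy hash: sum of character codes modulo the table size."""
    return sum(ord(c) for c in term) % NUM_BUCKETS

class LossyIndex:
    """Hash table that keeps one page per bucket and forgets the term."""

    def __init__(self):
        self.table = [None] * NUM_BUCKETS

    def add(self, term, page):
        # No collision handling: a later term in the same bucket overwrites.
        self.table[bucket(term)] = page

    def lookup(self, term):
        # The stored term is never compared, so any colliding query "matches".
        return self.table[bucket(term)]

idx = LossyIndex()
idx.add("post", "forum-page")
# "stop" is an anagram of "post", so this toy hash gives it the same bucket.
print(idx.lookup("stop"))
```

A real index would store the key alongside the value (or chain entries per bucket) and compare on lookup; skipping that step is what turns a collision into a wrong result instead of a slower one.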

--
Fuck Beta. Fuck Dice
Gator and Zuvio by YeOldeGnurd · 2003-10-06 02:39 · Score: 4, Interesting

I have run into some bizarre results lately. Recently I was trying to figure out what the NT 4 process "ESSERVER.EXE" did, and google's top search result sent me to a page at (DON'T GO HERE!!!)MamuFilms.com which actually redirects to "Armbender.com", a site that won't show you any pages unless you install "Page Access", actually Zuvio nastyware.

Here's Google's somewhat hilarious cache of the Mamufilms.com page. The page includes links for everything from "Peter Paul and Mary mp3" to "preteen bra images". The text is vaguely reminiscent of actual grammatical English. Here's one sentence:

And With Unknown virtual gifts Already baby food coupons to Information Installed The 2000, with Himself, to other tips, tricks, and tweaks The Issue De Processes services.exe.

--
...Nothing interesting here. Just move along...
Real information by fozzylyon · 2003-10-06 04:12 · Score: 5, Interesting

I spoke with a friend who helps maintain the Google engine. She said that they were running into some problems with a "cleaning agent." Because so many sites take advantage of word relevancy, there are useless sites that simply have a list of words or phrases. It's been posted before that there are many pages designed for spreading GATOR/GAIN or other spyware/adware. She put the share of junk pages at 35% to 40%. The cleaning agent was supposed to run through its own searches, check for junk, and keep a log.

She didn't say if the problem was that the cleaning agent was clogging searches or if any logged junk pages had been blocked. If so maybe the agent is flawed. In any case, they've stopped using it for the time being.
Strange counts for five weeks now by Everyman · 2003-10-06 04:46 · Score: 5, Interesting

The counts have been broken for the last five weeks. A count for the word "the" produced fairly consistent results until then of about 3.4 billion. Then it shifted five weeks ago to 5.2 billion. Lately it has been under 2 billion. Now it's just over 2 billion.

Webmasters who have various directories and know exactly how many pages are in each directory began noticing five weeks ago that Google was reporting approximately twice as many pages in each directory as have ever existed in that directory. Prior to five weeks ago, Google used to be fairly close to the actual number (assuming that you get a full crawl).

GoogleWatch speculates on the reason why Google has been behaving strangely ever since it stopped doing the traditional deep crawl once per month. The last standard deep crawl was in April but it wasn't used -- Google threw out this data (by their own admission) and reverted to earlier data. The speculative piece was written last June.

Since it was written, Google has started showing "supplemental results" on many searches. It looks like they are running a parallel index. Why would they do this? All the problems Google has been having, along with the supplemental index, seem to support GoogleWatch's theory.