What's Wacky with Google?
There are always going to be oddities with any big online service, but this one seems to be persisting. Join the discussion in trying to figure out a pattern. For maybe a week, Google has been returning zero results or "1-1 of about xxx,000" for common searches. One-word searches seem unaffected, but there are certain two-word combinations of common words like
candle truck
or
speaker bracelet.
Reversing the order can affect searches too:
motorcycle candles
vs.
candles motorcycle.
The strange thing is that usually the 1 or 2 results found are to commerce sites. Read the
Search Basics,
compare your notes to
GoogleWhack's,
have fun looking for patterns, but remember that Google always returns slightly different results for different IP numbers.
(Update: 13:56 GMT by J : When I first posted this story it said the problems have been occurring "for several weeks at least" -- but it seems to be more like one week.)
Candle Truck?
I am so glad someone else noticed this!!! I've been so pissed I haven't been able to get any speaker bracelets recently. God google... forcing me to use other search engines to get my fix.
SkyNet is becoming self-aware.
When I type my name in google it says, "did you mean Dark McBride?"
-- Darl
It's just a glitch in The Matrix, of course.
I'd rather go and find the phrases that make google freak out. I'm off to go try Dinosaur Cerebellum
Look it's a joke about my sig IN MY SIG! LOL!
...of that point in time where people were trying to come up with two word searches that resulted in exactly one result.
The company I was with at the time must have lost a few hundred man hours of productivity to THAT little fad.
Xentax
You shouldn't verb words.
It still cannot do phrase searches:
"to be or not to be" produces a 20% error rate on the first page of hits.
Someone told me that this is OK, since Google is producing pages that are linked FROM pages containing "to be or not to be", instead of pages actually containing the phrase. What a cockamamy way to run a search engine. Altavista, a thing of the past, had its problems, but at least it could do phrase searches accurately.
I also keep getting searches where Google tells me that it could not be bothered to produce correct results, so it excluded certain words from the sentence, and I have to try again with a + in front of the words. Well, Google, I wanted those words in the first place, which is why I included them in the phrase.
Is 100% accurate matching results to a phrase search too much to ask for a search engine?
This story was only posted to throw off the search counters for individual users on google...wasn't it?? ;)
What possesses someone to try such weird random words in google? That's the real trick...google wrote an engine to amuse the crazy users.
Unless Google is purposely doing this (which I highly highly doubt), this is typically called a bug...
If this is as widespread as it seems to be, then it could be pretty bad. Testing for bugs is always difficult (and a pain), but I'm sure that testing new releases of the google search engine is very hard, especially for peculiar issues like this one.
Anyway, that's my 5 centimes.
Maan
That's why you can't trust google for anything critical. You are at their mercy, and if they choose to do biased, or screwed up searches, you either don't know, or can't do anything about it...
I propose an opensource web based search engine... No more weirdness, no more screwups, no more censorship!
---
Programming is like sex... Make one mistake and support it the rest of your life.
I am sure the next Google Zeitgeist will show numerous searches for candle truck or speaker bracelet in October 2003. And nobody at Google will have an explanation for this ;-)
Check out this - all 25 hits on the quoted words "candle truck" should be showing up in the non-quoted search ...
Maybe it has something to do with the counter that was mentioned in a slashdot post earlier today?
This sig was generated by a barrel of trained kittens for SeXy_Red (550409).
for a few weeks, when I do a search on google groups, it'll come back with the results just fine - but when I click on the View Thread on a result, it tells me it can't display the thread and gives me a link to view that individual message. Then once that message comes up, I click on View Thread on that message, and up pops the whole thread, like it should have before.
:)
Perhaps being on the top is getting to their CPUs.
For further reference, see George Carlin.
"As soon as I shove this hot poker..."
The coolest voice ever.
While I was checking out the links that were in this story, the story disappeared and then reappeared on the slashdot front page. Very odd. There must be a conspiracy afoot. I think we should spend a large amount of time discussing this possibility and trying to find other oddities on slashdot that might clue us in to what they are really trying to do with this site.
Has anyone else noticed that the "spam" sort of sites that are nothing but link farms and Gator popups are getting much better at finding their way into Google's rankings? I switched to Google back in the day after search engines like altavista became overrun with such sites. Now I've noticed that they occasionally creep into their rankings...I guess entropy is the way of the universe after all.
Search for truck candle and you get 4 results. Including the candle truck result.
Soon, someone will see all the Slashdotters' queries for candle truck and start a business... destined to go the way of Enron, no doubt.
Don't trust those candletruckers - they don't play fair.
Big Red Candle Truck
back up the candle truck
The hardware store was forced to borrow a Colonial Candle truck
Not wanting to kill anybody, we wait until
the last two guys wander up to the candle truck.
scented candle truck accessories
yankee candle truck part
The coolest voice ever.
Bike Doc's Biker Java site has now been hopelessly googledotted as millions of potential novelty motorcycle-shaped candle owners are redirected towards an innocent vendor of coffee.
What gives you the right, Google? What gives you the right?!
Porn: Roughly 5 000 000 hits on google
Google: Roughly 70 000 000 hits on google.
Google is bigger than pr0n? Hmm, something _must_ be wrong.
GAAH! MY PRINTER IS ON FIRE!!! PUT IT OUT! PUT IT OUT!
It's the hardware and bandwidth. As soon as an OC-3 is less than $8500/month I'll have one running to my house. Until then it's back to the drawing board.
Speaking of wackiness... I read this article, followed one of the links, hit the back button and the story was gone... 5 min later it was back, followed another link, hit the back button and the story had changed. Hit refresh again, and it changed yet again!
Help Brendan pay off his student loans
I realized the other day that although searching for 13 - 867 - 5309 causes google to go into calculator mode, searching for 123 - 867 - 5309 does not cause google to use calculator mode.
All sorts of odd things will both pull up an answer from google's calculator and also do a search - for example, searching for avogadros number or hbar.
So why do searches that might fit US telephone conventions not trigger calculator? Is it because some design decision makes it impossible to trigger both calculator and their phone lookup service? (Yes kids, google is a reverse phone directory, albeit with old data.)
"q=site:www.google.com google" - (third result)
This is what I'm seeing...
http://www.sminkybang.com/google.png
By the way, for info on Google's purchase of the search engine Kaltix, check this controversial Register piece by Andrew Orlowski. It contains the highly suspect, matter-of-fact comment that "PageRank is now widely acknowledged to be broken," but if you take the PageRank speculation with a grain of salt it's an interesting read.
should produce about 50% error rate or we are really in trouble ;-)
and it seems to work fine ...
I always put phrase searches in quotes.
Links 8 and 10 in the results might be useful, but they do not contain the exact phrase I was searching for.
It gave me 1-4 out of 123,000. Weird.
They say the first thing to go is your penis. Well, it's either that or your brain. I forget which...
For any who are interested, Google.ca is behaving correctly. All search results listed (that I've tried so far) from googlewack.com are working properly and returning 1-1 of 1, or displaying as they should.
I wish I could compare to google.com, but for the past year or so, google.com has automatically forwarded all Canadian IPs to google.ca.
0110100100100000011000010110110100100000011000100
Nope. They're running out of pigeons able to compute/peck complex searches.
On the other hand, when the internet gets slower, Google will probably start acting strange.
I know personally that when I've been searching Google of late for things like home improvement how-tos (bathtub refinishing, for example), it links to TONS of commercial sites selling products and services but hardly any online how-tos or guides. Granted, maybe there just isn't much content for these topics, but Google seems to be selling out more and more to commercial links. I've also noticed this, although not nearly as much, when looking for other things, and some of those searches are for things like listings, which could not likely have a commercial equivalent or a likely reason to be on a commercial page.
Does anybody else see the story change? I'm getting two different versions if I reload. One with the additional lines:
"The order of words matters also, with motorcycle candle revealing different results to candle motorcycle."
"Read the Search Basics, compare your notes to GoogleWhack's"
and one without.
Complete text of the two versions are:
"There are always going to be oddities with any big online service, but this one seems to be persisting. Join the discussion in trying to figure out a pattern. For several weeks at least, Google has been returning zero results or "1-1 of about xxx,000" for common searches. One-word searches seem unaffected, but certain two-word combinations of common words like candle truck or speaker bracelet are affected. The strange thing is that usually the 1 or 2 results found are to commerce sites. Have fun looking for patterns but remember that Google always returns slightly different results for different IP numbers."
and
"There are always going to be oddities with any big online service, but this one seems to be persisting. Join the discussion in trying to figure out a pattern. For several weeks at least, Google has been returning zero results or "1-1 of about xxx,000" for common searches. One-word searches seem unaffected, but there are certain two-word combinations of common words like candle truck or speaker bracelet. Reversing the order can affect searches too: motorcycle candles vs. candles motorcycle. The strange thing is that usually the 1 or 2 results found are to commerce sites. Read the Search Basics, compare your notes to GoogleWhack's, have fun looking for patterns, but remember that Google always returns slightly different results for different IP numbers."
Strange.
Ok, now I'm a guy who deals with audio equipment on a regular basis. This, of course, includes speakers. I have never, ever, heard of a speaker bracelet, and can't imagine why one would search for it.
Now this isn't to say that these people haven't perhaps discovered an interesting bug in Google, but trying to play it as a conspiracy for "common" search terms is bullshit. The terms listed are things that no normal person would EVER search for. Hell, they are terms that even someone involved with one of the terms would never search for. Bracelets have nothing to do with speakers. If Google was truly trying to push advertisers, well, they'd be doing a shitty job of it since only geeks with too much time on their hands would discover such things.
Give it a rest, the world is not out to get you. It's either a bug, or Google having some fun (something they are known to do). They are certainly not trying to pimp a certain manufacturer of speaker bracelets, since such a thing is something that no one would know about, care about or want to own.
For regular searches, Google continues to work great.
I propose an opensource web based search engine... No more weirdness, no more screwups, no more censorship!
Given the commercial pressure on web search in general (Verisign, anyone?), the development of a working Open Source search engine is an absolutely critical task right now.
That said, I guess you will see *more* weirdness for quite some time, and I don't think anything Google has done so far is exactly "censorship".
My next comment will be ready soon, but moderators can beat the rush and mod it up early.
Not wanting to kill anybody, we wait until the last two guys wander up to the candle truck.
I prefer not to even click on that one, and just speculate.
The coolest voice ever.
"the" oh and I did "and" too. ;)
It's broke. Just put a sign on it and someone call the super.
Strange women lying in ponds distributing swords is no basis for a system of government.
What if such an OSS search engine were massively distributed?
Since by its nature a search engine is not a transactional application, it can be effectively broken into thousands and thousands of semi-independent pieces (just like the real Google works now).
Anyone aware of Distributed Open Source Powered-by-people search engine project?
groups.google.com was partially broken for most of last week... searches worked, but the links on the results page didn't. Browsing wasn't much better, many groups didn't even load.
The two bad results are:
"2Bee or Nottoobee"
"tobeornottobe"
I spelled the words the way I wanted in the search, and placed the spaces where I wanted them. Is it too much to ask to have it search for what I asked for?
This is just one example: I have the sloppy results mess up my searches for other useful phrases (such as computer error messages) all the time.
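For what it's worth, the strict behaviour being asked for here is easy to state. A minimal sketch in shell, assuming the page text is already available on standard input; the names contains_phrase and check are made up for illustration, and none of this says anything about how Google or Altavista actually match phrases:

```sh
#!/bin/sh
# contains_phrase PHRASE: succeed only if stdin contains PHRASE exactly,
# with the same words, spelling, and spacing (case is ignored).
contains_phrase() {
    # -F: treat the phrase as a fixed string, not a regular expression
    # -i: ignore case; -q: no output, just an exit status
    grep -Fiq -e "$1"
}

# check TEXT: report whether TEXT contains the quoted phrase.
check() {
    if printf '%s\n' "$1" | contains_phrase "to be or not to be"; then
        echo "match:    $1"
    else
        echo "no match: $1"
    fi
}

check "To be or not to be, that is the question"
check "2Bee or Nottoobee"
check "tobeornottobe"
```

Under this rule the two bad results quoted above are both rejected, because neither contains the words with the requested spelling and spacing.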
Mwahhahahah!
1. Register speakerbracelet.com
2. Be the top 1 of 2 search results on google.
3. ????
4. Profit!
Rocket science is easy. Neurosurgery, now *that's* difficult.
Yes, I've been getting this fairly often.
I've found that if you go ahead and view the article, then from the article click 'view thread' it will then show you the thread.
Ender
Nothing to see here
I've read that there's a real time search monitor in the lobby of Google's HQ. The nastiest words are removed, but other than that you can see exactly what people are searching for.
They have to be pretty confused right now, when thousands of searches for speaker bracelets, motorcycle candles and candle trucks show up on the display!
Martin
that Google is made by geeks for geeks.
The correct search phrase would be:
"0x2b||!0x$2b"
Wikipedia seems to be fairly decent at organizing and presenting knowledge.
P2P is functional and varied.
Other forms of distributed / collaborative computing are coming along nicely.
Why not ?
A.
The bugs only appear when you ask for phrases. I've never had a problem with Google when I ask for an exact word.
Go ahead and search on "cadnle".
It recommends a search for "Candle", but it does produce accurate results.
yeah i get that too
searching google for stone dog quote returns no results. Also try stone cat quote or changing the order of the words for weird results. Queries on alltheweb or altavista return numerous results, as expected. This has been reported in threads in alt.usage.english, rec.puzzles and (of all places) alt.fan.tolkien.
What a cockamamy way to run a search engine.
You are kidding, right? There's a reason that Google is by far the most popular search engine on the web, and it's got a lot to do with the "cockamamy" way it's run.
Perhaps you prefer the good old days when you'd have to check half a dozen search engines and trawl through countless useless links until you found something that was useful.
There are a handful of websites that should be in everyone's bookmarks. Top of the list is Google. Nuff said.
Oh, and as several people will have mentioned by now, and as Google's FAQ surely does, putting your search parameter in quotes will give you exact phrase results. This is pretty standard amongst all search engines, so it's amazing that you don't know this already.
Either you're new to the web and search engines in general or you haven't got a clue how to use one. Regardless, if you're going to comment on how "cockamamy" Google is, you should at least have an idea of how to use it first.
"Accept that some days you are the pigeon, and some days you are the statue." - David Brent, Wernham Hogg
My website www.blackapology.com has been sitting in google at the third spot for a while now. Last night, I kid ye not, it disappeared completely. I emailed google about this, but have yet to hear a response. This morning, however, it's back in the search again.
... this same bug appears to affect my site ..
...
what is more
black apology
whereas the other way round, it disappears from the results completely
apology black
weird huh ?
nick
Electronic Music Made Using Linux http://soundcloud.com/polyp
If you misspell and search for candles motorcylcle, with an extra letter 'l', it returns 15 results. It also suggests "Did you mean: candles motorcycle" that returns no results.
Yeah, bandwidth and hardware are rather limiting in building a large search service. There is Nutch, a project to start an open source search engine.
Until that gets off the ground, if you're worried about Google, you can use different searches as well. Someone like Hotbot lets you choose the engine from the standard search page.
Really, with all the different engines out there, it's not like you have to use Google, it's just been the best for relevant results for a while.
"What if they're using IE?" "I've dumbed Mozilla down to cope with it." - BOFH
The same result when searching all the web, "... from about 1020" when only searching on German web-sites.
Lars T.
To the guy who modded me down from perfect to terrible Karma - Apple haters still suck
This isn't impressive. Slashdotting Google would be impressive!!!
"Entering the phrase "to be or not to be" -- with quotes, yielded the first two pages of results all having that phrase"
I am using google.com
I am using the default search on the first screen
I am entering "to be or not to be" (with double-quotes entered in the search box) as the phrase.
Two of the results on the first page of results do not contain the phrase.
---------
If the results are different for us, perhaps this is another bug of inconsistency.
"Oh, I get it. You don't like the idea you need to actually construct a reasonable search phrase. "
Including "shakespeare" is not relevant: I was not looking for pages with that word.
I was looking for pages containing a certain specified phrase.
It should be pretty cut and dried, but the results are like if you looked for a date in an SQL string that was between 01/15/2001 and 01/20/2002, and the results were mostly OK, but you had some 20/01/2001's scattered in the results.
Weird. Very weird. Adding another word to a search should narrow down the result set, not widen it.
Try it.
Flourescent (adj): smelling like ground wheat.
Does anyone else notice that it returns 1 of 1 of about 109,000? Where are the other about 108,999 results?
n/t
I did enclose the phrase in double-quotes when doing the search. In fact, you will see the double quotes in the top parent posting about this Google bug.
Soooooooooo... you're talking about a P2P search engine? ;-) Can't help ya.
jeez, it's a search engine.
Even more annoyingly, it sometimes only shows a SUBSET of threads when you click on the conversation: I've had numerous times where it'll show that a certain thread has 16 or so messages, and when I select to expand that thread it only shows a single message or 3-4 of them.
Truly annoying.
.....in the Matrix? Have we found it?
"Perhaps you prefer the good old days when you'd have to check half a dozen search engines and trawl through countless useless links until you found something that was useful."
Actually, Altavista always produced useful results: I never had to ignore erroneous returns like I do with Google.
However, Google produces so many more results, even if you take out the bad ones, so I use it mostly now.
"putting your search parameter in quotes will give you exact phrase results."
Let's move past this ok? I put the phrase in quotes in the search. You will see these in the top parent item.
"Either you're new to the web"
No, you just made the mistake of ignoring the quotes in the "to be or not to be" phrase I mentioned in the parent item. They were there, I did type them in the parent (and in the searches I am referring to).
I propose an opensource web based search engine... No more weirdness, no more screwups, no more censorship!
I'm all for it, but who's going to be the first to pony up the server space? (Hint: nobody)
If there is an oddity with Google why the fuck don't you e-mail Google NOT fucking Slashdot.
Btw, why the fuck is this newsworthy again???
I think it's far more interesting to see how large a result one can return from a combination of really different words. I believe the maximum # of words you can search on is 10...
It's also fun to do a "related search dead end", where you click the top-most unvisited "related links" link, until all the results returned have visited "related links" links. It returns some wild stuff eventually.
stuff |
Your second search is formatted like a U.S. phone number, so I'd say going into calculator mode when searching for phone numbers would be a misdesign...
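A rough sketch of how that kind of routing could work, in shell. The rules and the name classify_query are pure guesses for illustration and are not Google's actual logic: a query shaped like a US phone number (3 digits, 3 digits, 4 digits) is claimed by the phone lookup first, and only the remaining digits-and-operators queries fall through to the calculator.

```sh
#!/bin/sh
# classify_query QUERY: guess how a search box might route a query.
# Hypothetical rules for illustration only, not Google's actual logic.
classify_query() {
    # Phone-like: 3 digits, 3 digits, 4 digits, separated by dashes.
    if printf '%s\n' "$1" | grep -Eq '^[0-9]{3} ?- ?[0-9]{3} ?- ?[0-9]{4}$'; then
        echo phone
    # Otherwise, digits and arithmetic operators only: treat as arithmetic.
    elif printf '%s\n' "$1" | grep -Eq '^[0-9 +*/().-]+$'; then
        echo calculator
    else
        echo search
    fi
}

classify_query "13 - 867 - 5309"     # prints: calculator
classify_query "123 - 867 - 5309"    # prints: phone
classify_query "candle truck"        # prints: search
echo $((13 - 867 - 5309))            # prints: -6163
```

With the phone test placed first, the two queries from the parent post end up in different modes even though both are valid arithmetic.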
I personally do it once in a while. For example, when I try to find lyrics, proverbs or famous quotes.
.. a month or two ago, a friend's hyperaggressive cat was prescribed an antidepressant(!). I was curious so I did a google search on "feline paxil" and got very low quality and repetitive search results; most of the top few screens appeared to be related scams by online pill-pushers trying to get you to use their "search engine".
Perhaps some of google's anti-spamming countermeasures have backfired?
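For even more fun, use the following script to generate two random words:
(watch for word wrap)
#!/bin/sh
# Print two pseudo-random words from the system dictionary.
dict=/usr/share/dict/words
# Number of lines (words) in the dictionary.
dl=`wc -l < $dict`
# Build two different numbers from the current date and time.
RND=`date '+%H%S%d%M'`
RND1=`date '+%y%S'`
RND=`expr $RND + $RND1`
bilge=`expr $RND + $RND + $RND + $RND + $RND + $RND`
# Map each number to a line number; add 1 because sed rejects line 0.
dw1=`expr $RND % $dl + 1`
dw2=`expr $bilge % $dl + 1`
# Print only those two lines, joined onto one line by echo.
echo `sed -e ${dw1}p -e ${dw2}p -e d < $dict`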
So far, "pectoral undaunted", "adjudicates battlefield", "numerous quark" and "camouflaged todays" work as expected in google.
We recommend that you read the story before posting. All articles are optional.
Obviously, Google has to do a lot of acrobatics to keep its service as fast as possible. One of the things it does is distributing its database over a lot of servers. There is no way that they can dynamically sift through hundreds of millions of pages for each common word, so they obviously just look at the top pages for each word. Which pages are top is probably determined by pagerank or something similar.
When you do this, there is no guarantee that you will get hits for every single combination of words out there. However, it may very well be possible to calculate the probability of relevant results not showing up and using this measure to make a more or less optimal trade-off between response time and user satisfaction.
When you start tweaking this trade-off, certain queries are bound to get screwed up. It probably takes them some time to notice this behavior, gather statistics and re-tweak their formula.
Another thing that crossed my mind recently is that they might be using precooked phrases or word collocations instead of single words. This makes sense since they use an implicit AND operator, it improves statistics and words are often strongly correlated anyway so your vocabulary probably wouldn't swell as much as you'd expect.
Mind you, this is pure speculation. I don't have any intimate knowledge about Google's inner workings.
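To make that trade-off concrete, here is a toy sketch in shell. Everything in it is hypothetical: the page IDs, the rankings, and the cutoff are invented, and it only illustrates the speculation above, not how Google works. Each word keeps its pages best-ranked first, and an AND query only looks at the top K pages per word.

```sh
#!/bin/sh
# postings WORD: hypothetical toy index; page IDs are listed best-ranked first.
postings() {
    case "$1" in
        candle) echo "c1 c2 c3 c4 shop7" ;;
        truck)  echo "t1 t2 t3 t4 shop7" ;;
    esac
}

# top_k WORD K: keep only the K best-ranked pages for WORD, one per line.
top_k() {
    postings "$1" | tr ' ' '\n' | head -n "$2"
}

# and_query WORD1 WORD2 K: pages found in both truncated lists.
and_query() {
    for p in $(top_k "$1" "$3"); do
        for q in $(top_k "$2" "$3"); do
            if [ "$p" = "$q" ]; then
                echo "$p"
            fi
        done
    done
}

and_query candle truck 3    # prints nothing: the only common page is ranked 5th
and_query candle truck 5    # prints: shop7
```

A page that contains both words still goes missing whenever it ranks below the cutoff for either word, which is the kind of hole a retuned cutoff could open up.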
Being well balanced is overrated. -- John Carmack
" You should post your exact search, and what exactly you're searching for,"
Refer to parent item. The search was for "to be or not to be"
(unless you are blind: some of you are. there are double-quotes around the phrase even if you do not see them).
I'm merely looking for pages that contain these words, spelled as I specified them.
"The bit about which pages are shown is a little backwards."
The order of the results varies, but the first 10 results always contain 2 links to pages that do not contain the phrase. They contain something similar to the phrase, but they do not contain the phrase.
Maybe they are doing testing to ensure the service will be in the usual Microsoft standards in case they are bought.
Slashdot Sig. version 0.1alpha. Use at your own risk.
And how long will it be before some search-engine-optimization villains start running privately-hacked nodes to bias the results towards their clients?
.sig: be the majority of voters.
Remainder of my
Cowboy up!
Save Maine's economy: write stuff down. All comments are exclusively my own, not my employer.
Did you use the "quote" marks? Try it without them.
A few days ago I searched for "kazaa lite" on Google and found that no results were censored! The main KaZaA Lite page was the 1st result. That was only temporary, of course, because right now the search is censored again.
Future Wiki -- If you don't think about the future, you cannot have one.
I've noticed there's been a lot of tainting of the Google results lately as well.
For example, if you search for "GroupWise cbt" (Novell GroupWise Computer Based Training), a lot of really interesting results come up.
This has been happening with enough frequency lately that I've been pondering switching search engines...
- Bunny
For once, I can actually click an embedded link within the story and the webserver hasn't already puked on the /. effect!
10 MD
which 50% is erred?
"If anyone needs me, I'm in the angry dome."
I wonder when this slashdot discussion will show up in the search results.
Come on already. 2 stories on the frontpage in one day, and still no Icon/Topic for Google? It's slashdot's favorite search engine. Every time something new, different, interesting, pointless, etc. happens with Google, we get a story on it. If the freaking Matrix movie has its own topic/icon, why not Google?
Overrated / Underrated : Moderation
I used to use motorcycle candles but later found that an electric headlight is much brighter and doesn't blow out when I move.
Outdoor digital photography, mostly in New Engl
Yes. Both Altavista and Google require quotes for phrase searches (some search engines like Hotbot did not require them in the past, which made them harder to use as a result).
The difference is that Altavista produced 100% relevant/accurate results, while Google produces results that are mostly OK.
... the beginning of the end. Be prepared.
I've been getting an issue for weeks now where I'll do a newsgroup search, click on a match's "Show Thread" link and get an error that the thread isn't available. If you go back and try again, it works. Annoying, but not life threatening, at any rate...
Is it just coincidence that it's been since Sitefinder went active? Could it be that it broke Google's broken link code?
Try: candle truck -ebay I now have 90 hits out of 82,600!
"Candle truck" returns 109k matches, while "fuck truck" returns 631k.
Upon hearing this news, I will be refashioning all stock in my candle shop as dildos, as there appear to be at least 6 times as many fuckers in the world as I previously thought.
"For example, I searched for "to be or not to be" phrase origin , and got what I consider to be useful results. "
I think you did not bother to look past the first result. The 3rd and 4th result do not contain anything like "to be or not to be". They do, however, contain "phrase origin". You just succeeded in pointing out the bug in another way.
When all is said and done, nothing changes...
...is that a search for VB.NET does not return any results either unless you perform an "Exact Phrase" search.
... and the crazy users wrote scripts to use the Google engine!
(shameless self plug) It's surprising what sites can appear when querying Google. Try my site that queries Google with random words to find random webpages. It's quite powerful and a good timewaster.
It seems to me all the results for candle truck and other weird ones also have exactly one (and the same) hit in the Google directory. Strangely, Google Images does give more results.
Research is what I'm doing when I don't know what I'm doing.
It should have some kind of semi-automated way to filter out such villains (like slashdot does :-)
" At the risk of making Google look bad, decent search engines automatically add quotes to common phrases."
That is far worse than producing mostly-accurate results. The decision of whether or not to treat a search as an exact phrase or as a group of words that can be scattered in the document should be left to the user. What you describe would produce very inaccurate results:
If I want to search for any document containing mojo and rising, and I enter
mojo rising
with no quotes in the engine, it is a bug if it decides to put quotes that I never asked for around the phrase and drop off all results that do not contain the words right next to each other.
google uses tons of DB entries to cross-index pages. I wonder if there's some simple hash-tables per page that it uses internally to speed things up that makes assumptions, and doesn't resolve collisions.
So you can search for one thing, and conceivably the checksum/hashes for each term match those of another page that has nothing to do with it, and it's returned as a relevant match by accident.
This might explain a lot of result silliness.
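A toy version of that failure mode, in shell. The hash (word length mod 8), the table, and the example.* page names are all invented for illustration, and nothing here is known about Google's internals. It shows what happens when a lookup trusts the bucket instead of comparing the stored key:

```sh
#!/bin/sh
# toy_hash WORD: a deliberately weak hash (word length mod 8),
# chosen so that collisions are easy to produce by hand.
toy_hash() {
    echo $(( ${#1} % 8 ))
}

# bucket_entry N: hypothetical one-slot-per-bucket table, stored as "word page".
bucket_entry() {
    case "$1" in
        6) echo "candle candles.example/page" ;;
        5) echo "truck trucks.example/page" ;;
    esac
}

# lookup_unchecked WORD: trust the bucket without comparing the stored word.
lookup_unchecked() {
    bucket_entry "$(toy_hash "$1")" | cut -d' ' -f2
}

# lookup_checked WORD: return the page only if the stored word really is WORD.
lookup_checked() {
    entry=$(bucket_entry "$(toy_hash "$1")")
    if [ "${entry%% *}" = "$1" ]; then
        echo "${entry#* }"
    fi
}

lookup_unchecked guitar   # prints: candles.example/page (a false match)
lookup_checked guitar     # prints nothing
lookup_checked candle     # prints: candles.example/page
```

"guitar" and "candle" share a bucket because both have six letters, so the unchecked lookup hands back a page that has nothing to do with the query.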
Fuck Beta. Fuck Dice
I think this is another corporate play to improve sales. Now it's the candle truck and speaker bracelets technology. Tomorrow it can be Micro$oft squatting all over google search terms. We must be aware. Capitalism is slowly taking over and destroying our favorite search engine, like it has been doing to so many nice things... Let's join efforts and destroy capitalism!
Mod parent down. His example works even worse, with results on the first page not containing "to be or not to be" in any form. He came up with a cool example but did not bother to check the results.
So why am I now being passed the full search parameters, user browser information, and presumably user IP?
Actual entry from log:
202.8.224.133 - - [17/Sep/2003:01:03:12 -0600] "GET /files/backup_tools.html HTTP/1.0" 404 287 "http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&q=sample+backup+policy" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"
That IP address isn't Google, and if I recall my HTTP protocol correctly, this isn't a request I should be getting.
I'm getting the same thing from msn.com's search engine. Anyone else out there seeing this?
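For anyone reading a line like that: in the combined log format the quoted field after the status and size is the Referer header, which the visitor's browser sends when a link is followed, so the search URL arrives with the visitor's request rather than from the search engine. A minimal shell sketch for pulling the search terms out of it; the name referer_terms is made up, and it assumes a plain q= parameter with no percent-encoding:

```sh
#!/bin/sh
# referer_terms LINE: print the search terms carried in the Referer field
# of one combined-format access log line (empty output if there is no q=).
referer_terms() {
    # Splitting on double quotes, field 2 is the request line,
    # field 4 the Referer, and field 6 the User-Agent.
    referer=$(printf '%s\n' "$1" | awk -F'"' '{ print $4 }')
    # Pull out the q= parameter and turn + back into spaces.
    printf '%s\n' "$referer" | sed -n 's/.*[?&]q=\([^&]*\).*/\1/p' | tr '+' ' '
}

line='202.8.224.133 - - [17/Sep/2003:01:03:12 -0600] "GET /files/backup_tools.html HTTP/1.0" 404 287 "http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&q=sample+backup+policy" "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)"'
referer_terms "$line"    # prints: sample backup policy
```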
Maybe of the 133000 results for candle truck, only a handful have some minimum Page Rank value? Google may have a threshold below which it doesn't return the page at all.
Maybe the rest of the results are machine generated crap; google could be filtering out some of those nasty page webs you get with certain searches (try searching for "equifax coupon"). Funny that they would still report it in the result count though.
Has anyone considered this little piece of malware as the cause of their troubles?
Just a thought.
Mmm... nebulous beeeeeee.
"You're saying an 80% success rate, for a phrase that's both ubiquitous and composed entirely of very common words, isn't good enough??"
Yup. It definitely lowers the bar, considering that altavista and other search engines are capable of 100% accuracy in search results.
Also, the "common word" distinction of Google is a major flaw.
Imagine what things would be like if the SQL query language produced results where 1 out of 5 were bad.
"I think I can live with both of those "false" leads."
Even if they are bad? Actually, I end up living with them too, due to Google's other advantages. But I wish they would fix this bug of sloppiness.
I just did a google search for lamp scapula and got 666 results. Obviously there is something inherently evil in lamp scapula. All good Christians should expunge any phrase that combines those two words from their vocabulary immediately.
Man, no wonder... You need to turn Safe Search OFF when you look up nasty stuff like that.
Wonder what Google has to say about this?
By the way, have you yet checked your own example
"to be or not to be" phrase origin
You were quite happy with results on the first page that only contained phrase origin and nothing like "to be or not to be".
Look at the HTML source. Around the results for the Candle Truck, they start several tables that never get closed, including the TR's and TD's.
Although I don't think it's specifically mentioned anywhere in Google's help, you can actually use a * as a wildcard in a quoted search.
For example, "i * you so" returns hits with phrases like "i love you so" and "i told you so".
Now that you have this wonderful knowledge, keep it secret...keep it safe.
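To make the wildcard idea concrete, here is a small sketch of one-word-per-star phrase matching done locally with a regular expression. It is only an illustration: whether Google's * matches exactly one word is an assumption, and phrase_pattern and the sample sentences are made up.

```python
import re


def phrase_pattern(query):
    """Build a regex for a quoted phrase where each * stands for one word.

    Assumption for this sketch: a star matches exactly one word.
    """
    parts = [r"\w+" if word == "*" else re.escape(word) for word in query.split()]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


pattern = phrase_pattern("i * you so")
texts = [
    "I love you so much",
    "See, I told you so.",
    "I really love you so",  # two words in the gap, so no match
]
matches = [text for text in texts if pattern.search(text)]
print(matches)
```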
it is a bug if it decides to put quotes that I never asked for around the phrase
It's a feature, which you can turn off in alltheweb.com's preferences. It is turned on initially because most web users don't know as much about how to work a search engine as the typical Slashdot user knows.
Will I retire or break 10K?
Here's Google's somewhat hilarious cache of the Mamufilms.com page. The page includes links for everything from "Peter Paul and Mary mp3" to "preteen bra images". The text is vaguely reminiscent of actual grammatical English. Here's one sentence:
...Nothing interesting here. Just move along...
Google Golf
Rules: Type in any word that starts with the selected letters and try to get as few returns as possible without getting a zero.
... is that some sites are managing to defeat the google ranking system, by building very long URLs with all possible keywords for a given subject.
Try searching "ringtones downloads", as an example.
-------------------------------------------- If you can read this, then you speak Portuguese. Obviously.
If construction was anything like programming, an incorrectly fitted lock would bring down the entire building...
...welcome our new speaker-bracelet-wearing, motorcycle-candle-burning overlords.
When everybody on the planet with a computer and a phone line started using Google, it became only a matter of time before it turned into something slick, restrictive and deliberately manipulative.
-FL
You have obviously never needed the services of a candle truck, or you wouldn't ask such a dumb question.
As said obviously nobody searches for candle trucks or speaker bracelets. It could just be some whacky combinations engineers used for testing and debugging. More likely just an odd combination of algorithm action, because they are so dissimilar the intersection of the two search fields comes out as 1 or if there's semantic analysis it's just falling over.
http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&q=php - a search for php, returns a result for press awards for google.
This is a new feature -- Google has to stay ahead of MSN /somehow/! ;)
10b||~10b -- aah, what a question!
"I submit that this is generally NOT the case -- that usually, the user is searching for something ABOUT a phrase "
That is kind of odd. In years of searching, I have never searched that way. When I look for a phrase on a search engine, I am looking for pages that contain the phrase.
[compare Google results to Altavista]
Once you get past Altavista's blasted ads, it had 100% relevant results on the first page. Pretty good and accurate compared to Google's mere 80%.
"your criteria of JUST "pages with the phrase" is a peculiar and unusual example, but hey, maybe that's just me"
Maybe it is just you. Everyone I know searches like I do as well. This is how it is supposed to work as described in Google's own help.
"I'd usually include a term or two to give context of WHY i'm searching for a phrase"
I don't want to have to explain myself as per a damn sr high essay question. I just want accurate results for my searches.
Perform a search for candle truck -site:.com and you get 44,200 results....while putting -site:gormetcollection.com (the only hit I get for just candle truck) yields 0 results
.. is too feminine. How about we call it the speaker brocelet?
Live web cams
Google reps are going to be on campus (Purdue University) today for a technical presentation. If there are any questions you want me to ask them, reply to this thread. I'm 99% sure I'll be able to attend.
"Bug? How do you algorithmically weigh this stuff in a way that can determine which of the search terms is so important that it should be in all of the initial results to the exclusion of the others?"
Bug? Yes, a big one too.
If you specify 3 words in a search, an accurate engine will only produce pages with those 3 words. If it decides to broaden the search, it should put the non-matching results at the end.
You asked for a 3-word search (your first phrase is counted as a word in search engine logic). Results 3 and 4 did not even contain this first word.
"How do you algorithmically weigh this stuff in a way that can determine which of the search terms is so important that it should be in all of the initial results to the exclusion of the others?"
It's simple boolean logic. If the text contains the search criteria, include it. If it does not, do not include it. This is not rocket science here.
If I do a search on
one two three
(not a phrase: no quotes!) I expect the results to all contain those 3 words. Sure, it can prioritize for the pages where the words are together (or something else), but why is it too much to ask for to have results that contain what I am looking for?
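The "simple boolean logic" being asked for can be sketched in a few lines: keep a page only if it contains every search term. The page URLs and texts below are invented, and this says nothing about how Google actually builds or queries its index.

```python
import re


def words(text):
    """Lower-case a text and return the set of words it contains."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def strict_and_search(query, pages):
    """Return the URLs whose text contains every word of the query."""
    terms = words(query)
    return [url for url, text in pages.items() if terms <= words(text)]


# Hypothetical mini-index: URL -> page text.
pages = {
    "example.test/a": "one two three four",
    "example.test/b": "one and three only",
    "example.test/c": "Three, two, ONE: liftoff",
}

print(strict_and_search("one two three", pages))
```

Ranking (for example, preferring pages where the words sit together) could then be applied to the survivors without ever letting a non-matching page in.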
...they're trying to cache two-word searches.
And aren't doing too well at it.
From what I recall and the way things seem to work now, the + operator has been changed slightly.
By default, words like "to" "with" and "by" are not included in a search because they are deemed too common. However, I used to be able to force inclusion of those common words using the + operator.
Now it seems that this is no longer possible. That is, the strings one +to another and one +for another give the same results (without quotes). In fact, the + is replaced by a space in the above queries and that definitely didn't use to be the case.
Does anyone else remember the good ole' + operator?
The link that contains "tobeornottobe" is a glitch too: my search was for 6 words, not this single word produced by getting rid of the spaces. Granted, it is less of a glitch than the other one: at least the spelling is correct.
I'm sure they've got several real world Agent Smith types tracking down the errors.
"Googles searches against indexes of cached pages, not the actual pages as they may be as of today"
No, that is not it either. For the two erroneous results, Google contains the contents of its cache which led it to mistakenly include the matches (look at the page summaries). These cache contents show the problem regardless of what the page has now.
If you search for Candle Truck you get the single result. If you search for Candle Truck -basket, that excludes the one result and leaves you with... a working search with 80k results.
Yay for broken indexes
I'm getting more and more crap in my results. Information overload and too many commercial web sites trying to sell me crap are getting in my results. I find myself using Ask Jeeves more and more, probably only cuz I bought in at $2. Go use it and make me more money!
Maybe Google has an old pentium in their server farm somewhere. I think it was called the repentium or something and, given the right calculation to do, resulted in spectacular errors. Maybe certain texts refer it to this incorrect entry in the multiplication table? Yea right, like they would be using a pentium 60! Then again, I don't remember paying anything for the service...
Try using $RANDOM
A good chunk of the spam I've been getting lately has had very little content outside of obfuscated HTTP references. What content it has appears to be random two and three word combinations, apparently put there to try to cheat bayesian filtering.
Since I use bayesian filtering, I can say it must work once in a while because I do get false negatives like this.
Probably the same urge that makes people want to see which famous old movies Pink Floyd albums synchronize with.
Schnapple
"Remember that google is not trying to be pedantic, its trying to be USEFUL."
Yet, it makes itself less useful by coming up with erroneous results.
"It's taking your search terms or phrase and returning what it thinks are the pages most likely to satisfy your request."
Yet, I ask for A and it gives B. That is not brilliant at all. I'd rather have a search engine produce accurate results to my search, and not use some poor guesswork to come up with things that I did not ask for that it THINKS I might want.
"I still don't know why people bring up historical search engines in comparison to google."
Because these engines produce better results. The problem is that these historic search engines are now too small and Google produces a lot more results, even if they are not good at the searching.
"I too think it sucks that you can't open the window on the airplane."
Yet, if you turn the window knob, you expect the window to open. You don't expect the seat to go back because someone THINKS that is what you wanted instead.
"Most of the complaints boil down to sour grapes: for the record"
No, the complaint boils down to the fact that Google is good, but it would be perfect if it corrected these bugs which produce unwanted and unmatching results to searches.
(Update: 13:56 GMT by J: When I first posted this story it said the problems have been occurring "for several weeks at least" -- but it seems to be more like one week.)
Actually, I've been seeing this problem occasionally for over a year. It just seems that larger numbers of search terms trigger it now.
Of course, I can't remember any of the search terms that have triggered it in the past--I've just learned to change my terms slightly to get around the problem.
Dee
Like one of those things where there are 109000 pages vaguely relevant to 'candle truck' but some kind of rating system is saying that only 1 should be displayed, and the other 108999 are just garbage and you're probably not interested.
(or they're spam/pr0n/whatever)
allintext: "to be or not to be"
In google's search box (or go to advanced search like you should have when you didn't get the results you expected). I don't think it's too much to expect of a person who wishes to use a search engine to bother to actually learn how to use a search engine.
Should I stay in first gear when I ride my buddy's bike that has the shift levers on the frame instead of the handlebars (and I didn't expect it), or should I find them and change gears?
The "showing results 1-1 of xxx,xxx" thing is odd. Getting results you don't expect because you didn't ask right is trivial and common.
"Murphy was an optimist" - O'Toole's commentary on Murphy's Law
My Firebird has a Google search bar built in, and on the rare occasions I need to break out Explorer, I have the Google bar there to quash the stream of pop-ups. And also to help with the distributed computing on the folding problem, actually, but mostly for the popups.
Philip Sandifer's academic website
Search 1: 186,000 posts for mandrake in the last 6 months
Search 2: 186,000 posts for mandrake in the last 3 months
Change the search string and still the same issue.
"I don't really know what the obsession with 100% results is."
So, have you found a new job since the layoffs at "Arther Anderson"?
Results came back that were commerce sites?
:P
Not surprising, considering the big cesspool shopping mall the internet has become.
Google searches use unique and proprietary algorithms to find the most useful information for the search terms. We all know this, it is their "page rank" system. But perhaps the page rank system is driven by more modifiers than we are aware of. For instance, in Minnesota, Twins and Vikings mean a couple of sports teams; in Norway, they probably mean something entirely different, so perhaps "Page Rank" does some regionalization. In the same vein, it may be possible that if I refine my search from Minnesota by adding the word "Gopher" to the Twins and Vikings, I may get more, rather than fewer results while perhaps in Norway I'd get no results!
In addition to possibly regionalizing searches, perhaps Google's servers are not updated with the latest code at the same time. Maybe the code is distributed over time to servers so that if a problem were discovered it could be more easily rolled back. It is possible that the load balancing on these servers uses some component of the IP address or somehow regionalizes the incoming requests so that it is likely that the same user usually gets to server A but sometimes goes to server B while their co-surfer neighbor usually goes to server B but sometimes goes to server C. Meanwhile, a couple of states away, another user usually connects to server W but sometimes connects to server X. This could explain why they usually but not always get the same results but someone else gets different results.
" Google doesn't do simplistic phrase matching. If it did, it'd be the same (and as useless) as altavista" Google does relevancy searches
No, it would be more useful by being as useful as Altavista. Altavista might have fewer results, but they are 100% accurate and thus relevant. Google's bad results are less relevant since I never asked for them. Non-matching results are not relevant unless they are specifically asked for.
"tobeornottobe.com is relevant to a search for "to be or not to be"."
No, it is not, since I never wanted it. Sloppy logic is a flaw, not a strength. Sure, if I search on "Bill Clinton", you might similarly argue that "Al Gore" is "Relevant" and thus pollute the results with Gore results that never mention Clinton...
It's very simple. You take the words "candle" and "truck" and translate the characters to base-10 numbers by using the respective ASCII codes. When you add up the numbers in the word "candle", you get 615. 6+1+5=12. 1+2=3. Do the same thing with truck and you get 553. 5+5+3=13. 1+3=4. Then finally you have 3 and 4, which is 34 and not 42. And because it is not 42, or the answer to Life, the Universe and Everything, Google flips out.
I believe that this has also something to do with Schroedinger's Cat, but have not been able to determine the common denominator...
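For anyone who wants to check the arithmetic, here is a quick sketch that sums the ASCII codes of each word and then keeps adding digits until one digit is left. The function names are my own.

```python
def ascii_sum(word):
    """Add up the ASCII codes of the characters in a word."""
    return sum(ord(ch) for ch in word)


def digital_root(n):
    """Repeatedly add the decimal digits of n until a single digit remains."""
    while n >= 10:
        n = sum(int(digit) for digit in str(n))
    return n


for word in ("candle", "truck"):
    total = ascii_sum(word)
    print(word, total, digital_root(total))
# prints:
# candle 615 3
# truck 553 4
```

So the numbers hold up: 3 and 4, not 42.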
i know what your problem is (excuse me if someone else already posted this, i didn't read all the posts). you're using the old query syntax. the syntax for a query on google used to be search?q=term+term... but since they've become international (at least i think that's the reason) the syntax has become much longer (i know, i used to just type http://www.google.com/search?q=term+term into my browser to... well i don't know why. just to seem geeky i guess). now it includes options for stuff like language and it's just too long to remember. so my guess is that when you use the old syntax you're only searching the sites that haven't been completely cataloged... or something like that.
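To compare the two styles side by side, here is a sketch that builds both the short and the long form of a search URL. The extra parameter names (hl, lr, ie) are copied from URLs quoted elsewhere in this thread; whether leaving them out changes which index gets searched is the poster's guess and is not something this snippet can confirm.

```python
from urllib.parse import urlencode

BASE = "http://www.google.com/search"

# Old, short style: just the query terms.
short_url = BASE + "?" + urlencode({"q": "candle truck"})

# Longer style, using parameter names seen in URLs posted in this thread.
long_url = BASE + "?" + urlencode(
    {"hl": "en", "lr": "", "ie": "ISO-8859-1", "q": "candle truck"}
)

print(short_url)
print(long_url)
```

Pasting both into a browser and comparing the result counts would be one way to test the theory.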
No more pretty interface!
Have you looked at Google lately? ;)
Is this a sigs-optional kind of place? 'Cause I am totally down with that if you know what I mean.
"If you want to see only pages that have the phrase "to be or not to be" in the text, type: allintext: "to be or not to be""
That did not work. The 10th link was bad (did not contain phrase).
"I don't think it's too much to expect of a person who wishes to use a search engine to bother to actually learn how to use a search engine."
Putting quotes around a phrase and getting exact relevant results is a standard.
"Should I stay in first gear when I ride my buddy's bike..."
Heh. Using the bike example, if I apply the brakes, I don't expect the bike to go faster!
"Getting results you don't expect because you didn't ask right is trivial and common"
I did ask correctly. Your alternate example still did not produce accurate results.
A search of Linux on Slashdot returns only one entry..
Hax.
http://www.haxwell.org
A moment ago, I entered "to be or not to be" without the quotes, and hit "I'm feeling lucky".
I got www.gnu.org.
In reality you can search for anything within quotes and get what you were looking for. (e.g. ' "candles" "motorcycles" ' ) That's what I've always done on google. It's a more effective search.
"No, you asked for ~A, and it gave you ~A. If you had gone through the trouble of figuring out how to actually ask for what you want, "
No, I wanted pages containing "to be or not to be" and I used Google's instructions to ask for it.
"Well, then you can rest assured that Google is perfect"
No, AltaVista is perfect when it comes to phrase searches. Google has trouble getting relevant results even on the first page of returns.
Your "allintext"-using example failed as well. For all your argument, you have yet to come up with a way to get A = A search results!
This exists. It is called URLBlaze.
It searches your history for URLs and shares them with other people.
Currently, it's oriented towards sharing large files, not HTML. However, if you're interested, email Ran, the project's overlord. Urlblaz Contact info.
The project is not open source. But they've been open to new ideas so far. If there's interest, there might be potential.
Just for reference, google.com.au is behaving just the same as google.com
:)
Must be run by the Australian government... we seem to follow all the U.S. cock-ups, although we're normally a good 5-10 years behind
... and then there were none
According to the many who have defended the Google bugs here, the results obviously must have included the Beach Boys because Google is smart, and Google figured that you really DID want "Beach Boys" even though you told it to exclude it.
Traits that Altavista had, such that your search results would actually have excluded the "Beach Boys", are considered to "suck" and to be part of the stone age of search engines.
The pigeons are getting tired.
"Tell me doctor, with all of your defenses, are there any provisions for an attack by killer bees?"
"vb.net" isn't a URL. "http://vb.net" is a URL. Still, the blame does lie with MS for misappropriating a common naming scheme.
How aware are the search engines of each other? These returns are pages found searching in these search engine databases for themselves and others. Google and Yahoo rather casually mention the number of pages returned, prefaced with "... of about ..."; while AltaVista and Lycos are considerably more anal about reporting quantitative findings.
Searched for   Google        AltaVista     Lycos         Yahoo
Google         93,000,000    5,817,435     22,483,511    24,300,000
AltaVista      2,050,000     1,821,362     9,179,642     3,090,000
Lycos          18,500,000    2,309,191     11,215,263    6,950,000
Yahoo          95,300,000    10,284,666    55,680,102    38,400,000
e.g. Lycos found 22,483,511 pages mentioning Google, while only about half that many mention itself (Lycos). Perhaps this leads to poor search engine self-esteem issues.
Further exercises in pointless database introversion are left to the reader.
It is acknowledged among the SEO (Search engine optimization) community that PageRank, while still in effect, is being greatly de-emphasized by google in calculating the final score. (A page's relevance is PageRank * Non-PageRank-Factors) The problem was that blogs were getting extremely high pageranks due to their interlinking convention, and other sites were getting high pageranks by creating a bunch of fake links to them. And although people rarely discussed it, there was the problem of the "rich get richer" syndrome, where people would link to pages that were high on the results, increasing the score of those pages. It is very difficult for new websites to get recognized because of this problem. So as a result google has increased the importance of anchor text relative to pagerank. (It has decreased the relevance of blogs but it's hard to say if it's helped with the other problems)
Ok, Candles Motorcycles returns nothing, but Candles on Motorcycles returns 35,400 matches. And I'm not using + or "" to include on. On is supposedly excluded from the search but it brings up 35,400 matches instead of 0.... :)
"on" is a very common word and was not included in your search... it certainly affects the search though
= 9J =
I spoke with a friend who helps maintain the google engine. She said that they were running into some problems with a "cleaning agent." Because of all the sites taking advantage of the word relevancy, there are useless sites that simply have a list of words or phrases. It's been posted before that there are many pages designed for GATOR/GAIN spreading or other spyware/adware. She quoted the percentage of junk pages being at 35% to 40%. The cleaning agent was supposed to run through its own searches and check for junk and keep a log.
She didn't say if the problem was that the cleaning agent was clogging searches or if any logged junk pages had been blocked. If so maybe the agent is flawed. In any case, they've stopped using it for the time being.
Returns 9140 hits. Hmmm.
Best Slashdot Co
append -porn to your search to get all the motorcycle candles you want, except the pornographic ones.
2*31*37*263
I bet I know what the problem is. Now that they have, in their arrogance, disclosed certain key details about the Google File System, the global HACKER conspiracy has managed to infiltrate their mainframe and bring it to its knees!
Verily, I say, security through obscurity and security through obscurity alone will save you. Google should never have given those dirty hackers the keys to their kingdom.
Or maybe not...
One might ask the same about birds. What ARE birds? We just don't know.
At the risk of making you look bad, for phrase searches you have to put the phrase in quotes.
Phrases are one thing, but try searching for something like "-no." Can't do it without getting thousands of occurrences of the word "no." Doesn't matter if you quote it-- Google edits out symbols, it appears, and there are some things pretty damn hard to find because of it...
select * from a inner join b on (a.somecolumn = b.somecolumn)
would return no records with some database engines, whereas changing the order of evaluation
select * from a inner join b on (b.somecolumn = a.somecolumn)
would return the records you wanted even though the operations are logically equivalent. I always speculated that it was something with the query optimizer that messed things up, or that there was some rule in SQL that I wasn't aware of.
And the fact that some sites, like the .ca site, don't have the problem while the USA domain does could simply be a difference in release levels of "google" on various servers and domains.
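In standard SQL the two ON clauses above mean the same thing, so an engine that answers them differently would be showing a bug rather than obeying some obscure rule. Here is a small sketch to check a given engine, using SQLite through Python's sqlite3 module as a stand-in. The table contents are invented, and other engines would need their own driver.

```python
import sqlite3

# In-memory database with two small, made-up tables sharing a join column.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (id INTEGER, somecolumn TEXT);
    CREATE TABLE b (id INTEGER, somecolumn TEXT);
    INSERT INTO a VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO b VALUES (10, 'y'), (11, 'z'), (12, 'w');
    """
)

# Same join, with the operands of the ON comparison swapped.
q1 = ("SELECT a.id, b.id FROM a INNER JOIN b "
      "ON (a.somecolumn = b.somecolumn) ORDER BY a.id")
q2 = ("SELECT a.id, b.id FROM a INNER JOIN b "
      "ON (b.somecolumn = a.somecolumn) ORDER BY a.id")

rows1 = conn.execute(q1).fetchall()
rows2 = conn.execute(q2).fetchall()
print(rows1)
print(rows1 == rows2)
conn.close()
```

If the two row lists ever differ on a real engine, that would be worth a bug report to the vendor.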
Different issue, but annoyingly enough my employer's firewall IP address has been banned somehow from using Google Groups. Other Google services are OK, and emails to the address provided have not helped. Any ideas on why this is done, and how to get it reversed?
Ummmm your site doesn't seem to work (in Mozilla at least).
:)
Without frames, clicking the cat gets us "Invalid URL".
With frames, the left and right "Mango images" do not appear. When I found them with my mouse (it seems they're a bad url, but you can still click where they should be), I still get an error:
Your query failed. Try again.
If it keeps failing:
1. You entered an incorrect Google key. If you don't have one, leave it blank (or as the default, 'key: none')
2. 1000 searches have already been used for the day -- come back tomorrow or get a free license key from Google.
3. Try using only 1 or 2 words combined with the country/language options.
4. Look at the search history to see the most recent random webpages served.
5. Google may be down. Try again later.
6. Try using the archived version.
Reading them, and looking at the venue, I'm going to suggest it's 2
Last post!
Searching for candles motorcycle returns no matches, however, searching for candles motorcycles returns about 51,900.
I'm the urban spaceman babe, but here comes the twist... I don't exist
From the cache: These terms only appear in links pointing to this page: google com
so how do you find the pages with the links?
Here's another one off: adobe.com on www.google.com gives only one search result... in Finnish
[Fuck Beta]
o0t!
Under IE, I got the same results the other responder did as well (most likely you need an upgraded license key?).
Neat sounding site, though!
I feel fantastic, and I'm still alive.
The whole reason I changed to Google from whatever I was using before (probably AltaVista) was NOT their ad-minimalist approach (though that is appreciated) but was the fact that if you enter a series of terms that results in no hits, you'd get a NO HITS message, not the hits for the next closest subset of terms. Fortunately, it currently still works that way-- try "candle truck speaker bracelet" (without the quotes) and you'll get the "no matches" result you (presumably) SHOULD get. Also I've noted that many other engines have since followed suit (though interestingly, AltaVista finds some 8000 hits on this and most of the top ones appear to have all four terms from word randomization pages), perhaps because they realize the importance of the feature. If Google does a VeriSign and starts giving you goofball hits rather than no hits, I'm jumping ship...
Here's my theory: Google's been hacked by Microsoft's minions, trying to discredit them, and to get searches for Linux to point to:
here
Newsgroups seem to be disappearing and re-appearing like mad this last couple of weeks.
Plus the number of posts indicators seem to have been reset or something.
See comp.sys.acorn.* for an example.
#include <sig.h>
When searching for some term and clicking the search-button several times, you will always get "Search took" times of less than 0.1 seconds.
With the strange terms, you always stay above 0.2 seconds.
The counts have been broken for the last five weeks. A count for the word "the" produced fairly consistent results until then of about 3.4 billion. Then it shifted five weeks ago to 5.2 billion. Lately it has been under 2 billion. Now it's just over 2 billion.
Webmasters who have various directories and know exactly how many pages are in each directory began noticing five weeks ago that Google was reporting approximately twice the number of pages in each directory than have ever existed in that directory. Prior to five weeks ago, Google used to be fairly close to the actual number (assuming that you get a full crawl).
GoogleWatch speculates on the reason why Google has been behaving strangely ever since it stopped doing the traditional deep crawl once per month. The last standard deep crawl was in April but it wasn't used -- Google threw out this data (by their own admission) and reverted to earlier data. The speculative piece was written last June.
Since it was written, Google has started showing "supplemental results" on many searches. It looks like they are running a parallel index. Why would they do this? All the problems Google has been having, along with the supplemental index, seem to support GoogleWatch's theory.
This is odd... I think the thousands place in the 102,000 number is somehow related to the fact that it returns 2 results instead of 1. Try it here http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=candle+and+truck&btnG=Google+Search.
Your name is a Radiohead allusion, and I claim my 5.
I tried these "wacky" queries on alltheweb and didn't get any wackiness there. However, I did get a bunch of results that seem to indicate that there is really no such thing as "speaker bracelet" or "candle truck". Still, there are many pages that contain both words of these pairs, so why wouldn't Google return them?
Page rank should only rank pages and not actually remove any pages. Assuming that Google is innocent, I'd say there is a bug in the page ranking algorithm they use.
Maybe this is a bug in "offensive" filter? "Offensive" filter does remove results, right?
And now I am wondering if Google runs some secret filters to remove "trash" results (something that may cause me to stop using Google if that's true). Page ranking is one thing, but secret and totally opaque filtering process is another.
Get a fucking life would you.
One way they are doing this is by posting "link spam" to blog comment sections. I have found a bunch of these on my blog recently.
They find a blog that allows comments with links, and put in a meaningless comment that happens to contain a bunch of links to different sites.
Every one I have found so far has come from non-US IP's. I put in a minor trick to fool the more simple-minded automated ones and the frequency dropped dramatically, but in one case I saw (on the logs) a spammer come in, try an automated tool, and after it failed, a couple of minutes later post manually (or with a patched automated tool).
These people may end up destroying an important part of the blog world (and maybe Google) the same way they ruined Usenet.
I suspect Google will catch on to this and the result will be the devaluing of blog comment links, which would be a real shame.
Here is an example (with the links broken so I'm not helping the turkey):
I have a blogging good web site all about [a href="http://www.cheap-phentermine-online.spam/"]Phentermine[/a] and other Discount Prescription Drugs. In addition to weight loss medications, it also offers FDA approved men's health RX products such as [a href="http://www.cheap-viagra-online.spam/"]Viagra [/a] impotence treatments and Propecia male pattern baldness medications. In addition, we carry a full range of skin care products such as Renova wrinkle / stretch mark cream, plus Vaniqa facial hair remover for women. Also in our women's health section are Birth Control Pills. For men and women who smoke, we offer Zyban. But our main [a href="http://www.cheap-online-pharmacy.spam/"]online pharmacy[/a] business concerns Slimming Pills such as generic manufacture Phentermine, branded Phen products such as Adipex tablets and Ionamin time release capsules. Also other appetite suppressants such as Bontril, Didrex, Meridia and Tenuate (AKA Diethyproprion and Dospan). Finally, we also provide the fat blocker Xenical (one of the best [a href="http://www.best-diet-pills-online.spam/"]Diet Pills[/a]), a different form of obesity treatment. For support and help, we suggest you use The Obesity Organization.
The only good weather is bad weather.
One thing to consider is that google does seem to sort users by location. Being in Canada, I end up at google.ca, though I'm not sure if this alters my search results so much as it likely redirects me to a local, faster server?
It's a bug. No, really. I talked to Google via e-mail about this a few years ago when I first noticed it, and they said they were aware of the problem and were working on fixing it.
... presumptuous (though 'helpful' is the term I'd use), "
"Google's design premise, and one which MOST people like for MOST searches, is that it is NOT just a pattern-matcher...
I've never heard of anyone who searched like you do. I thought the great thing about Google was how it ranked results according to relevance (which this example has shown it is not very good at: false positive search results belong at the end of the search results, if anywhere, not on the first page!).
"Google (and really any modern search engine) try to find what you're looking for, which (more often than not) cannot be simply summed up with a few search terms."
As an experienced search engine user, I have no problem at all formulating searches for exactly what I want.
"If you want something less
Bogus results that are not relevant are not "helpful": they are noise.
Maybe this guy would tell you why one might search for candle truck. Next time you're having a tailgate party... why not have a candle tailgate party!
Duplicating search terms has an interesting result:
candle truck
1-1 of about 101,000
candle candle truck truck
1-1 of about 82,200
candle candle candle truck truck truck
1-1 of about 73,700
candle candle candle candle truck truck truck truck
1-1 of about 68,600
Another interesting one is
candle candle truck
1-2 of about 89,200
welcome our speaker bracelet overlords.
Then you'd get the google.com page, that would instantly redirect you to the google.ca page, that would resolve to the google.com page, that would redirect you to the google.ca page....
What is the robbing of a bank, compared to the founding of a bank? -- Bertolt Brecht
Code:
for i = 1 to 10
print i
next
Results (on one line): 1,2,3,4,5,7,8,9,10,11,12,Skippy,15,20,McKinley,66
(don't complain. The program was not returning what you asked for. It was returning what it thought you might want)
candle speaker comes up with nothing at all, while candle speaker two comes up with a heckuva lot..!
Whenever I hear someone say 'impossible', I can't help but translate it to 'completely-do-able-by-really-dedicated-minds' :)
We apologise for the fault in this post. Those responsible have been sacked. -- Signed RICHARD M. NIXON
Don't worry friends. Longhorn will guarantee accurate results when searching Speaker Bracelets and Candle Trucks. It will also stop other nasty search engines from giving any kind of wacky results via Step 3. -Bill
Once I searched for my site's URL and found it in the text of one of those penis pill sites.
-
"Both hits contain my phrase in the title, suggesting the phrase is important to the page content, "
Except that "to be or not to be" is found in the title of neither of the two erroneous links.
No, it wasn't good because it produced false results that did not match the search criteria.
"So one bad search, and now Google isn't good at searching?"
Lots of bad searches.
"Are you honestly claiming that some other engine (e.g. Altavista) has NOTHING but relevant results, and that you use Google just because it has more (but noisier) results?"
Yes. Altavista produces too few hits. Google produces a lot more, enough that you can put up with the glitches. It could be even better: without the glitches. If crummy old AltaVista has a 100% accuracy rate, why not Google?
"frost pist" - 45, "frosty piss" - 21.
OMG! Wau!
And yes, the query failed because of the search limit being reached. My counter was a little behind in automatically tripping the script to say 'limit reached' instead of 'query failed'. This is because I'm not entirely sure when Google resets the query limit to 0 again.
Usage has been steadily increasing, so I should ask Google for an increase in my daily limit; their FAQ says they may do this in certain situations. The limiter is getting hit more often, especially when the link is posted to a popular website. You can always get your own Google key; it will work then!
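For anyone writing a similar script, here is a rough sketch of a counter that says 'limit reached' before the query ever goes out. The limit of 1000 and the midnight-UTC reset are assumptions for illustration only (as noted above, the real reset time isn't known), and `DailyQuota` is a made-up name, not part of any Google API.

```python
from datetime import datetime, timedelta, timezone

DAILY_LIMIT = 1000  # assumed quota; use whatever your own key allows


class DailyQuota:
    """Counts queries per quota day so the script can report 'limit reached'."""

    def __init__(self, limit=DAILY_LIMIT, reset_hour_utc=0):
        self.limit = limit
        self.reset_hour_utc = reset_hour_utc  # assumed reset hour, a guess
        self.used = 0
        self.window_start = None

    def _window(self, now):
        # Identify the quota day that contains `now`.
        return (now - timedelta(hours=self.reset_hour_utc)).date()

    def try_consume(self, now=None):
        """Return True if a query may be sent, False if the limit is reached."""
        now = now or datetime.now(timezone.utc)
        window = self._window(now)
        if window != self.window_start:
            # A new quota day has started: reset the counter.
            self.window_start = window
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True
```

Call `try_consume()` before each search; if it returns False, print 'limit reached' instead of letting the request fail.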
" As yerricde said [slashdot.org], this is an optional feature that can be turned off in the preferences.
So Alltheweb will mangle your search criteria far worse than the Google problem unless you change the settings? The default should be "it works", and it should only garble the phrase if you CHOOSE to set the preferences.
really, doesn't it sound like there's just some bug where that sometimes, when there are sponsored results, it doesn't return the normal results? some hokey null termination issue, or other weird code problem. it doesn't seem all that unlikely a kind of bug.
Wow. This Google thing sounds really cool. Wish we would've had it back during the war.
(BTW--what's an internet?)
Read any good sonnets lately?
Google is simply discouraging idiotic searches.
Strange things are afoot at the Circle-K.
Apologies, wildly offtopic but...
Damping factor, IIRC from my audio days, is the ratio of the impedance of the load to the output impedance of the power amplifier. It matters that this is high, to give the amp maximum control over the cone and damp any inertial overshoot.
Of course, the load that counts is the speaker itself, so "output impedance of power amplifier" has to include not only the impedance up to the output terminals but that of the cable too. Nominal DF values of 1000:1 at the output terminals (fabulous) drop to 10:1 or typically 3:1 (grim) with a grotty bit of mains cable.
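To put numbers on that, here is a quick sketch using the usual convention of load impedance divided by total source impedance (amp output plus cable). The 8-ohm speaker and the resistance values are made-up figures chosen only to land near the ratios mentioned, not measurements.

```python
def damping_factor(load_ohms, amp_output_ohms, cable_ohms=0.0):
    """Load impedance divided by everything between the amp and the cone."""
    return load_ohms / (amp_output_ohms + cable_ohms)


# Hypothetical 8-ohm speaker and an amp with 0.008 ohm output impedance.
at_terminals = damping_factor(8.0, 0.008)           # about 1000
decent_cable = damping_factor(8.0, 0.008, 0.8)      # about 10
grotty_cable = damping_factor(8.0, 0.008, 2.7)      # about 3
```

The amp's own figure barely matters once the cable resistance dominates the denominator.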
Enough secrets of the audio vault, I'm off to Google 'damping factor'.
OTOH, according to this search, "getting laid" seems to be a very popular topic at Slashdot.
I hate it when a FAQ about X (in this case, GoogleWhack) fails to answer the basic question, What is X?
So, that said -- what the heck is GoogleWhack?
Moderator hint: a comment is neither "Flamebait" nor "Troll" if it is true.
Someone should take the time to do an in-depth analysis.
Perform the same set of searches using proxy servers from each country.
Mondo bizarro so far for me.
" Except Google's results DID contain the search terms -- it's just that some of them weren't the literal search phrase"
2 of the 10 results did not contain the search terms. Search terms = literal search phrase.
Google's great; I rarely use AltaVista. But it could be better if the results were accurate. I'm tired of going through the results when I search on the exact text of a computer error and coming across results that don't have the error phrase I was looking for.
Also, when it comes to one or two word searches, Google is 100% accurate. It is just the sentences that make it stumble.
If the government counted votes the way Google did phrase searching, Pat Buchanan would be president now.
Your fingers then begin to vibrate in response to the electrical impulses.
If you then stick your finger in your ear, you can hear the local sports news.
Cool Hey!?
Send orders to me, and as a special marketing offer, our special Brooklyn Bridge salesmen will give you personal attention.
Did you know about 'phonebook:'? I didn't, until I read this book. It's an undocumented Google feature. Try it: "google for phonebook:hillary clinton ny"
Dr Superlove 300ml. I use my powers for awesome
searching for "site:www.google.com google" returns http://www.adobe.com/motion/main.html as 3rd link. It's not in www.google.com and it doesn't contain the word "google" in it. My self-confidence dropped by 50%. What can I believe in if even the google is broken?
-- mg
" believe what Google is actually searching for is "to be" OR "not to be","
According to their search tips, the proper syntax for what you describe is
"to be" OR "not to be"
A search for this produces a result page where 3 of the 10 do not contain the phrase "to be" or the phrase "not to be". Even if you follow their instructions, you still get a rather high percentage of bogus returns.
It says I've searched 3,307,998,701 times... oh, wait... never mind.
Half of the population is dumber than the median. It is unknown how many people are dumber than average.
-- HG Pennypacker, wealthy industrialist and philanthropist
Ze message has been received comrades. We will depart immediately on our "candle motorcycles" to meet up with the "candle truck". Our "speaker bracelets" are synchronized. The "yoyo jam" has been prepared according to your instructions. We wish you an "explosive christmas" on your mission. The infadels will never decipher our "volunteer glyceraldehyde".
There are always going to be oddities with any big online service
Hmm, like Slashdot?
This is not a troll or an offtopic post; I've noticed a wacky problem in Slashdot's "Slash" code for some time now (since the last major code rev, I think). Before that rev, the threshold-modifier select box used to work properly. Now, it seems to be miscalculating the number of posts in a given thread at each threshold level, e.g.:
-1: 49 comments
0: 49 comments
1: 49 comments
2: 49 comments
3: 49 comments
4: 42 comments
5: 34 comments
I find it very hard to believe that there are the same number of comments at levels -1 to 3 inclusive. Any Slash hackers care to comment?
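A small sketch of what I'd expect the select box to show, assuming (and this is only my assumption about Slash, not its actual code) that each entry is the number of comments scored at or above that threshold. The scores are invented.

```python
def counts_by_threshold(scores, thresholds=range(-1, 6)):
    """For each threshold, count the comments scored at or above it."""
    return {t: sum(1 for s in scores if s >= t) for t in thresholds}


# Hypothetical scores for a ten-comment thread.
example_scores = [-1, 0, 1, 1, 2, 2, 3, 4, 5, 5]
example_counts = counts_by_threshold(example_scores)
```

Under that assumption the counts can only stay level or fall as the threshold rises, and identical counts from -1 through 3 would mean that not a single comment in the thread scored below 3.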
We have more to fear from the bungling of the incompetent than from the machinations of the wicked.
Yeah, I recently was looking for the NATO phonetic alphabet and Googled for "hotel lima golf" as the first three things that popped into my head.
:)
Unfortunately, what came back was a bunch of pages advertising hotels with golf courses in Lima, Peru. So that technique doesn't work all the time.
Also,
This could explain why they have deployed the new semantic feature now, maybe a bit too early, due to the pressure of this new wave of spammers. (I had also noticed very successfully ranked spam pages recently.)
Google could be using a "simple" pseudo-semantic algorithm to block this sort of random pseudo-English spam. By measuring the probability that words occur at distance X after/before other words in "real English", Google could hope to discover that the probability of the sequence "virtual gifts Already baby" is zero in human communication.
To summarize my conspiracy theory: google is computing the entropy of the results and discarding the too-much-entropy-to-be-human-language ones.
Maybe it's just wishful thinking. But it can also explain why the order matters in the search. Imagine you search for two common words that have a specific meaning when put together in a specific order. Well, Google seems to realize that. For example, a search for "cell white" returns a White House press release on stem cells, but "white cell" returns relevant results on medicine.
Sorry, I can't remember a good example of a semantically relevant search. That would be two words that are very common but, when put in the same sentence (but not next to each other, to make it more challenging), would define a very specific topic. I've had this problem several times: Google returning irrelevant results and me not knowing how to narrow my search because all the words are so common. I wish I could remember one to check if Google results have improved...
OTOH, if my entropy-based Google filter theory is correct, maybe I should consider going into the business of selling tools to spammers.
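The entropy idea above can be sketched very crudely. This is only a toy under my own assumptions: it looks at adjacent word pairs (distance 1) rather than arbitrary distances, uses add-one smoothing, and trains on three made-up sentences. Nothing here is known about what Google actually does.

```python
import math
from collections import Counter


def train_bigrams(corpus_sentences):
    """Count single words and adjacent word pairs in a sample of 'real English'."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus_sentences:
        words = sentence.lower().split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def bits_per_bigram(text, unigrams, bigrams):
    """Average surprise, in bits, of each word given the word before it."""
    words = text.lower().split()
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    vocab = len(unigrams) + 1  # one extra slot for unseen words
    total = 0.0
    for prev, word in pairs:
        # Add-one smoothing so unseen pairs get a small nonzero probability.
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab)
        total -= math.log2(p)
    return total / len(pairs)


# Tiny hypothetical training sample.
corpus = [
    "the white house issued a press release",
    "white blood cells fight infection",
    "the press release was about stem cells",
]
uni, bi = train_bigrams(corpus)
natural = bits_per_bigram("the white house issued a press release", uni, bi)
salad = bits_per_bigram("release infection the cells house a", uni, bi)
```

Here the word salad scores more bits per pair than the natural sentence, so a filter would discard pages above some cutoff. Choosing that cutoff, and a big enough training sample, is the hard part.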
I think part of why changing the order of the words affects it is because Google doesn't search like other engines do. Other engines look for keywords only. Google looks for key words in certain combinations and sites that have a lot of links that match its terms.
Marvin knew: "Think of a number, any number..."
I am sure Google has better things to use bandwidth on than being /.ed by a bunch of bored geeks. Go pound on someone else's servers; set my google free.
Gary Dunn
Open Slate Project
Sometimes you can tell that a person is about to say something interesting before they say it by cues in their facial expression. Perhaps these fuckups are cues that Google will IPO soon, and they are already out spending their checks. Once Google is beholden to fuckhead short-term thinking stockholders, kiss your high-quality "evil" hating google team goodbye.
LS
There is a fine line between being a cultivated citizen and being someone else's crop. - A. J. Patrick Liszkie
Lots of things are "popular" that most of the people here wouldn't consider good. Think about MS Windows. Boy bands. Country music. McDonalds (ok, so they have wireless salads now...).
You didn't even make a single claim as to why you think Google is good. You didn't respond to the poster at all, other than by pointing out how Google IS popular and SHOULD BE even more popular. Wow that makes me want to go out and google so I can be part of the in crowd.
How does this possibly get +4 Insightful? What is the insight???
"but since when is a search engine accurate
Altavista has no problem with accuracy. It returns search results that match 100% what the user asked for.
Google has a problem with this. Go ahead and search on "to be or not to be" (use the quotes!). On this phrase search, 2 of the 10 first results are bogus: non-matches (do not contain "to be or not to be").
Theroreticly couldnt someone destroy googlewhacking forever just by making two websites that list every english word in existance on one page?
anyway I found (supercalifragilisticexpialidocious, googlewhacker) gives only one resault, but they wont accept supercalifragilisticexpialidocious :(
All misspellings and grammatical errors in the above post are intentional and part of my artistic expression.
I didn't think I'd ever run into this problem, but then a random conversation tonight led me to search for 'ods bodkins hammer and tongs'.
It also appears that eBay has 'Discount Tongs' according to the ad on the right.
-R
... if in a company like google there aren't plenty of people that read slashdot.
I wonder if all this strangeness could be because Yahoo will be dropping Google listings soon, so they are cleaning up their database.. who knows.. they also just bought a new company that specializes in search engine tech.. not that that would affect anything yet.
Pocket Girls. Mobile Adult Mini Mags for your Phone.
returns 0 hits.
dog stone quote 1 -1 of 112,000
face it, google is broken. At some point they will explain what happened or it will get fixed.
Anyone checked the google labs site and seen if they are doing something odd?
I notice that some sites will rank totally differently if you switch from singular to plural and vice versa.
Table-ized A.I.
The Australian google site (www.google.com.au) suffers from the same googlewhacks. However if you limit the search with the site:.au or Australia only search you get;
:)
Searched the web for candle truck site:.au.
Results 1 - 50 of about 1,060. Search took 0.29 seconds
The same is true for adding the limit and searching from www.google.com (except it is even faster at 0.18 seconds).
yummy
Could it be that Verisign's Site Finder service is giving problems to the Google search engine?
Here's How:
Google indexes a page with a link to a nonexistent domain that expired last month. SiteFinder redirects the Google spider to another, unrelated page. It gets indexed as being related to the initial page. The end result is that the Google index gets very messed up.
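If that is what is happening, a spider could guard against it roughly like this. It is a sketch only: it assumes the registry answers every unregistered name under a TLD with the same address, and `wildcard_address` and `looks_unregistered` are names I made up. I have no idea whether Google's crawler does anything of the kind.

```python
import socket
import uuid


def resolve(hostname):
    """Return the IPv4 address for hostname, or None if it does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        return None


def wildcard_address(tld, resolver=resolve):
    """Look up a random nonsense name; any answer must come from a wildcard."""
    probe = f"{uuid.uuid4().hex}.{tld}"
    return resolver(probe)


def looks_unregistered(hostname, tld, resolver=resolve):
    """True if hostname fails to resolve or only hits the wildcard address."""
    address = resolver(hostname)
    if address is None:
        return True
    return address == wildcard_address(tld, resolver)
```

A spider would skip indexing any link for which `looks_unregistered(host, "com")` is true, instead of filing the redirect page under the expired domain.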
============
-Do Justly, Love Mercy, Walk Humbly with your God!
DouglasK Do Justly. Love Mercy. Walk humbly with your God.
On the other hand, if you actually searched for nato phonetic alphabet, the first link is exactly what you wanted. Strange, isn't it? ;)
See, I know you read my post because you were able to find the "reply" underneath it.
I also know that my post explicitly mentioned that one of the queries looked like a US phone number.
Look, I live in the US. I know what US phone numbers look like. What I asked was: "So why do searches that might fit US telephone conventions not trigger calculator?" Telling me that the second query looks somewhat like a US phone number, while true, is about as relevant as replying to remind me that I am posting in English.
It's not as though my post were long. It's not as though I obscured the font of the question making it harder to read.
And yet, this reply was not the only one that stated this (at least two other replies say essentially the same thing). So tell me, what am I missing in basic written communication that causes this misunderstanding? What causes this misunderstanding?
Note that this is my question: "what causes this misunderstanding?" I just want that to be clear.
"does have a URL matching the phrase (as well as the phrase contained in the body),"
Read more closely. The site does not contain "to be or not to be" at all.
"Given the tobeornottobe.com site contains the works of Shakesphere, that's probably the site I want. "
Except that it was not what I wanted or asked for on the search.
I will start by saying that I do not know how Google gets its results, but I have an idea of how this could happen. Suppose the search engine first finds all the results for the first word, and then finds the pages containing the second word within those first results. You can get different results if you reverse the words using this method. On the other hand, if you find all the results for both words and then take the intersection of the two result sets, you should get the same results either way; otherwise the search engine is messed up.
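Here is a toy version of the two methods. One caveat I should add: filtering the complete first result set gives exactly the intersection, so the order only starts to matter if the first step is cut short. The `cap` below stands in for that cutoff and is purely my assumption, and the index is invented.

```python
def intersect(index, first, second):
    """Pages containing both words; the order of the words cannot matter."""
    return index.get(first, set()) & index.get(second, set())


def filter_truncated(index, first, second, cap):
    """Take at most `cap` pages for the first word, then keep those that
    also contain the second word. Word order can change the answer."""
    stage_one = sorted(index.get(first, set()))[:cap]
    second_pages = index.get(second, set())
    return {page for page in stage_one if page in second_pages}


# Hypothetical inverted index: word -> set of page ids.
toy_index = {
    "candle": {1, 2, 3, 4},
    "truck": {3, 4, 5, 6, 7},
}

both_ways_a = intersect(toy_index, "candle", "truck")
both_ways_b = intersect(toy_index, "truck", "candle")
candle_first = filter_truncated(toy_index, "candle", "truck", cap=2)
truck_first = filter_truncated(toy_index, "truck", "candle", cap=2)
```

With this index the plain intersection is pages 3 and 4 in either order, while the capped filter finds nothing for "candle truck" and both pages for "truck candle".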
At the next eco-hypocrisy-meeting, count the private jets used to get to the meeting. Should be interesting to see that
Another funny search term... try "Google -Google" without quotes. Proof positive that, as my friend says, you can't take the google out of google.
" tobeornottobe.com contains the phrase I searched for, sans whitespace: valid hit."
I asked for it with whitespace specified. Returns that mangle the phrase I wanted are erroneous.
Besides, the way it handles this is quite inconsistent. I looked for "free cheese" and freecheese. None of the results for "Free cheese" had the word freecheese.
Interestingly enough, three of the 10 returns on "free cheese" did not contain that phrase.
"because if I'm searching for a quote or stanza, I would want to be directed to the source."
Yet, I was not looking for such a "source". If I had been, I would have specified the search differently.
"and getting meaningful results from searches more difficult for users."
However, making sure that the results contain the phrase asked for is not rocket science.
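To show how little it takes, here is a rough sketch of a literal phrase check that a result page would have to pass. Ignoring case and punctuation is my own choice for the example, and `contains_phrase` is a made-up helper, not anything a search engine exposes.

```python
import re


def contains_phrase(page_text, phrase):
    """True if the words of phrase appear consecutively, in order, in the page."""
    def words(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    haystack, needle = words(page_text), words(phrase)
    if not needle:
        return False
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


hamlet = contains_phrase("To be, or not to be: that is the question",
                         "to be or not to be")
domain_only = contains_phrase("Welcome to tobeornottobe.com",
                              "to be or not to be")
```

The first check passes and the second fails, which is the behaviour I was asking for: a page that only has the words run together in its domain name is not a match for the quoted phrase.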
goddamn bitches forgettin' the motherfuckin' magic an' shit